Clarification on reproducing results (Table 1, Table 4, Table A7) and using LAVE evaluator #1

@DailyVy

Hello, thank you for releasing your impressive work!

I'm currently trying to reproduce the results reported in Table 1, Table 4, and Table A7 of the paper.
However, I found that the repository only provides the inference.py script, and there are no detailed instructions for evaluating on benchmark datasets.

I saw that LAVE is mentioned as the evaluator, and I checked its README, but I'm still unsure about how to properly run the evaluation.
In particular, I’d like to ask:

  1. How should the evaluation process be performed with LAVE to reproduce your reported results?
  2. How is the auto-vocabulary.json file generated or saved before running lave.py?
  3. Are there any specific configurations or dataset formats required for evaluation?

Any brief guideline or example command would be greatly appreciated.
Thank you very much for your time and support!
