Replicating FLARE's results

by Stephen Yin

GitHub repository for replicating the results from the FLARE paper.

Requirements

This project used Python 3.10.13.

To set up the dependencies, it is recommended to use a virtual environment (conda was used for this project).

Run bash setup/setup.sh
Retrieve your OpenAI API key and add the following line to your ~/.bashrc, replacing the whole placeholder (curly braces included) with your key: export OPENAI_API_KEY="{YOUR_API_KEY_HERE}"
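
A quick way to confirm that the key is visible to Python before starting a paid run is sketched below. This check is not part of the repository; the function name get_openai_api_key is hypothetical, and only the variable name OPENAI_API_KEY comes from the step above.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    # Reject both an unset variable and the unreplaced placeholder,
    # which still starts with a curly brace.
    if not key or key.startswith("{"):
        raise RuntimeError(
            "OPENAI_API_KEY is missing or still set to the placeholder; "
            "export it in ~/.bashrc and open a new shell."
        )
    return key
```

Calling get_openai_api_key() in a fresh shell should return the key; a RuntimeError usually means ~/.bashrc has not been re-sourced yet.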

Note: The version of sentencepiece was changed from 0.1.83 to 0.1.98 to be compatible with the use of Python 3.10.13.

Setup (Downloads & Retrieval Engine Building)

Download ASQA dataset

Follow the instructions from the ASQA repository to download the dataset. Rename the downloaded file to ASQA_full.json and place it in the dataset/ directory, so that its path is dataset/ASQA_full.json.

Then, create the test set by subsampling from the dev split of ASQA:

python setup/select_questions.py
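
For readers who want to see what a reproducible subsample looks like, here is a minimal sketch. It is not the code in setup/select_questions.py: the function name subsample_dev, the seed, and the assumption that the dev split is a dictionary keyed by question id are all illustrative. The default of 500 matches the test-set size mentioned later in this README.

```python
import random


def subsample_dev(dev_split, n=500, seed=0):
    """Pick n entries from a dev split reproducibly.

    dev_split is assumed to be a dict mapping question id -> example.
    """
    ids = sorted(dev_split)      # sort first so the order is stable across runs
    rng = random.Random(seed)    # local RNG; leaves the global random state alone
    chosen = rng.sample(ids, min(n, len(ids)))
    return {qid: dev_split[qid] for qid in chosen}
```

Fixing the seed means that two people running the selection on the same file get the same test questions, which matters when comparing scores across experiments.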

Download Wikipedia dump

Download the Wikipedia dump from the DPR repository using the following command:

mkdir -p dataset/dpr
wget -O dataset/dpr/psgs_w100.tsv.gz https://dl.fbaipublicfiles.com/dpr/wikipedia_split/psgs_w100.tsv.gz
pushd dataset/dpr
gzip -d psgs_w100.tsv.gz
popd
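
After decompression, the dump is a tab-separated file whose header row is id, text, title. The sketch below shows one way to stream it without loading everything into memory; the function name iter_passages is hypothetical and this is not code from the repository.

```python
import csv


def iter_passages(tsv_path):
    """Yield (id, title, text) for each passage in a DPR-style TSV file."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # DictReader consumes the header row (id, text, title) for us.
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            yield row["id"], row["title"], row["text"]
```

Printing the first few items of iter_passages("dataset/dpr/psgs_w100.tsv") is a cheap way to check that the download and decompression succeeded.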

Build the ElasticSearch index

(Instructions taken from the BEIR example.)

To run Elasticsearch, you need it installed locally (on your own machine), along with the beir package (pip install beir). Installation steps depend on your operating system. I like this guide for Ubuntu 18.04: https://linuxize.com/post/how-to-install-elasticsearch-on-ubuntu-18-04/

For more details, see https://www.elastic.co/downloads/elasticsearch.

This code does not require a GPU to run.

Run the following command to build the ElasticSearch index:

python setup/build_index.py --datapath dataset/dpr/psgs_w100.tsv

There are 21,015,325 documents in the Wikipedia dump to load.
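
With this many passages, holding the whole corpus in memory at once may be impractical. The sketch below groups passages into batches shaped like a BEIR corpus (a dictionary mapping document id to its title and text). It is an illustration only, not what setup/build_index.py does; the function name batched_corpus and the batch size are hypothetical.

```python
from itertools import islice


def batched_corpus(passages, batch_size=10000):
    """Group (id, title, text) triples into BEIR-style corpus dictionaries.

    Each yielded batch maps doc id -> {"title": ..., "text": ...}.
    """
    it = iter(passages)
    while True:
        chunk = list(islice(it, batch_size))  # take up to batch_size items
        if not chunk:
            return                            # input exhausted
        yield {pid: {"title": title, "text": text} for pid, title, text in chunk}
```

Each batch could then be handed to whatever indexing call is in use, keeping peak memory proportional to the batch size instead of the full dump.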

Run the model to generate results

⚠️WARNING⚠️: Running the model makes many queries to the OpenAI API, which can cost about $25 per experiment on 500 examples.
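
To budget a smaller run, the figure above can be scaled by the number of examples. The sketch below assumes cost grows linearly with example count, which is only an approximation since question and answer lengths vary; the function name estimate_cost_usd is hypothetical.

```python
def estimate_cost_usd(num_examples, cost_per_500=25.0):
    """Rough API cost estimate, assuming cost scales linearly with examples."""
    return cost_per_500 * num_examples / 500


# A 50-example run under this assumption:
print(estimate_cost_usd(50))  # 2.5
```

Actual charges depend on current OpenAI pricing and on how many retrieval-triggered regenerations each question needs, so treat the result as a ballpark.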

Generate the results

Run the following command to generate results on the selected test set, where {DATASET} is either ASQA (500 examples) or ASQA_mini (50 examples):

python model/flare.py -d {DATASET} -n {NAME_OF_EXPERIMENT}

The results should be saved in outputs/{NAME_OF_EXPERIMENT}.json.
The data analysis should be saved in outputs/{NAME_OF_EXPERIMENT-analytics}.json.

Evaluate the results

The outputs should already be in the format expected by the ASQA evaluation, so you can evaluate them by following the instructions in the ASQA repository.

Example results for the reimplementation of FLARE_direct with implicit queries:

{
    "rougeLsum": 27.630332229755194, 
    "length": 136.802, 
    "str_em": 40.75, 
    "QA-EM": 18.246666666666663, 
    "QA-F1": 24.25903202141283, 
    "QA-Hit": 2.6, 
    "ovscore": 25.88986508894757
}
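
The ovscore value above appears to be the geometric mean of rougeLsum and QA-F1, which matches the overall DR score described for ASQA. This reading is inferred from the numbers and is not stated in this repository, so treat it as an assumption. A short check:

```python
import math

# Values copied from the example results above.
results = {
    "rougeLsum": 27.630332229755194,
    "QA-F1": 24.25903202141283,
    "ovscore": 25.88986508894757,
}

# Assumed definition: overall score = sqrt(ROUGE-L * QA-F1).
dr = math.sqrt(results["rougeLsum"] * results["QA-F1"])

print(f"recomputed: {dr:.4f}  reported: {results['ovscore']:.4f}")
```

Because the overall score is a geometric mean, a gain in only one of the two component metrics moves it less than an equal gain in both.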
