Evaluating and Characterizing Human Rationales

This repository contains the code for the paper "Evaluating and Characterizing Human Rationales" (https://arxiv.org/abs/2010.04736), which appeared at EMNLP 2020.

Installation

pip install -r requirements.txt

Structure of the repository

config: controls the experiment scripts
1. data_config: set the output directory for the experiments
2. data_config: set the input directory for each dataset
3. model_config: select one or more classifiers among roberta, lstm, random forest, and logistic regression
4. data_config: select one or more datasets among wikiattack, sst, movie, multirc, fever, and esnli
dataset: Dataset classes that subclass torch.utils.data.Dataset, used to create the datasets and dataloaders for the trainer
fidelity: computes fidelity from predictions, or from a model and input ids (change the code to use nlp fidelity)
model: contains classifiers for experimentation
1. lstm_classifier: text --> RoBERTa embedding --> BiLSTM --> Linear
2. roberta_classifier: RobertaForSequenceClassification
3. sklearn_classifier: text --> sklearnTokenizer --> sklearnVectoriser --> sklearnClassifier (random forest or logistic regression)
plotting: dataset_and_fidelity_analysis_plots contains the plotting code for all figures
preliminary_analysis: analyze_datasets and generate_table compute the mean text length and mean rationale length for all the datasets.
scripts: run_experiment_trainer runs experiments with the roberta and lstm classifiers; run_experiment_sklearn runs experiments with the sklearn classifiers.
train_eval: contains the code that generates the data for fidelity curves and caches predictions for plotting
util: some utility functions
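
To make the role of the fidelity module more concrete, here is a minimal sketch of two commonly used fidelity measures, sufficiency and comprehensiveness, in the ERASER-style formulation. This is an illustration only and not the repository's implementation: the function names, the `<mask>` token, the choice to mask rather than delete tokens, and the toy model are all assumptions, and the definitions used in the paper may differ in detail.

```python
from typing import Callable, Dict, List, Sequence


def mask_tokens(tokens: Sequence[str], keep: Sequence[bool],
                mask_token: str = "<mask>") -> List[str]:
    """Replace every token whose keep flag is False with mask_token."""
    return [tok if k else mask_token for tok, k in zip(tokens, keep)]


def fidelity(predict_proba: Callable[[Sequence[str]], Sequence[float]],
             tokens: Sequence[str], rationale: Sequence[bool],
             label: int) -> Dict[str, float]:
    """Compare the model's confidence on the full input with its confidence
    on the rationale alone and on the input with the rationale removed."""
    full = predict_proba(tokens)[label]
    only_rationale = predict_proba(mask_tokens(tokens, rationale))[label]
    complement = [not r for r in rationale]
    without_rationale = predict_proba(mask_tokens(tokens, complement))[label]
    return {
        # Small value: the rationale alone nearly reproduces the prediction.
        "sufficiency": full - only_rationale,
        # Large value: removing the rationale hurts the prediction.
        "comprehensiveness": full - without_rationale,
    }


def toy_predict(tokens: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for a classifier: confident only if 'great' is visible."""
    p_positive = 0.9 if "great" in tokens else 0.5
    return [1.0 - p_positive, p_positive]


scores = fidelity(toy_predict, ["a", "great", "movie"], [False, True, False], label=1)
```

In this toy case the rationale token carries all of the signal, so masking everything else leaves the prediction unchanged while masking the rationale drops the confidence to chance.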

Hyperparameters

RoBERTa classifier
1. hidden_dropout_prob
LSTM classifier
1. hidden_size
2. pad_packing
Random Forest classifier
1. n_estimators
Logistic Regression classifier
1. C
Training Params
1. learning_rate
2. num_train_epochs
3. weight_decay
4. batch_size
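
One way to organize the hyperparameters listed above is as a small search space that is expanded into individual configurations. The sketch below is hypothetical: the dictionary layout, the model keys, and every value are placeholders chosen for illustration, not settings taken from the repository or the paper, and applying the training parameters only to the neural models is an assumption.

```python
from itertools import product
from typing import Any, Dict, List

# Placeholder values only; consult the config directory for the real settings.
MODEL_PARAMS: Dict[str, Dict[str, List[Any]]] = {
    "roberta": {"hidden_dropout_prob": [0.1]},
    "lstm": {"hidden_size": [128, 256], "pad_packing": [True]},
    "random_forest": {"n_estimators": [100, 500]},
    "logistic_regression": {"C": [0.1, 1.0]},
}

TRAINING_PARAMS: Dict[str, List[Any]] = {
    "learning_rate": [2e-5],
    "num_train_epochs": [3],
    "weight_decay": [0.0],
    "batch_size": [16],
}


def expand_grid(space: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per combination of the candidate values in space."""
    keys = sorted(space)
    return [dict(zip(keys, values))
            for values in product(*(space[k] for k in keys))]


# Combine the LSTM-specific parameters with the shared training parameters.
lstm_configs = expand_grid({**MODEL_PARAMS["lstm"], **TRAINING_PARAMS})
```

Each element of `lstm_configs` is a flat dictionary with one value per hyperparameter, which is convenient to pass to a training function or to log alongside results.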

Steps

Add the location of the output directory in the config
Add the location of the corresponding data directories in the config
Choose the model and the data on which analysis is to be performed
Run the experiment using run_experiment_trainer.py if the model is lstm or roberta, or run_experiment_sklearn.py if the model is logistic regression or random forest.
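
The choice of script in the last step can be sketched as a small helper. The script names and the scripts directory come from this README; the model name strings, the helper functions, and the assumption that the scripts take no command-line arguments (because the config selects the model and data) are hypothetical.

```python
import sys
from typing import List

# Hypothetical model identifiers; the names used in model_config may differ.
TRAINER_MODELS = {"roberta", "lstm"}
SKLEARN_MODELS = {"random_forest", "logistic_regression"}


def script_for(model_name: str) -> str:
    """Return the experiment script that handles the given model."""
    if model_name in TRAINER_MODELS:
        return "run_experiment_trainer.py"
    if model_name in SKLEARN_MODELS:
        return "run_experiment_sklearn.py"
    raise ValueError(f"unknown model: {model_name!r}")


def build_command(model_name: str) -> List[str]:
    """Build an argument list suitable for subprocess.run, run from the repository root."""
    return [sys.executable, "scripts/" + script_for(model_name)]


command = build_command("lstm")
```

Passing `command` to `subprocess.run(command, check=True)` from the repository root would then launch the matching script.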

CITATION

@inproceedings{carton+rathore+tan:20,
     author = {Samuel Carton and Anirudh Rathore and Chenhao Tan},
     title = {Evaluating and Characterizing Human Rationales},
     year = {2020},
     booktitle = {Proceedings of EMNLP}
}
