6.864 Project

##Abstract

SAT vocabulary questions ask the test taker to select, from the choices given, the word or words that best fill in a blank in a block of text.

This project will solve these vocabulary questions by applying the ideas of n-gram models, parsing, and recurrent neural networks in order to correctly rank and identify the best solutions. In particular, we will score each option with our model and select the highest scoring answer as our solution.
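The score-and-select step described above can be sketched as follows. This is only an illustration, not the code in NGram.py: it assumes a tiny add-one smoothed bigram model, a made-up three-sentence corpus, and a `____` blank marker, and the function names (`train_bigram_counts`, `log_score`, `rank_options`) are hypothetical.

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Count unigrams and bigrams over tokenized sentences, with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_score(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def rank_options(question, options, unigrams, bigrams):
    """Fill the blank with each option and return the options sorted best first."""
    scored = []
    for option in options:
        filled = question.replace("____", option).lower().split()
        scored.append((log_score(filled, unigrams, bigrams), option))
    return [option for _, option in sorted(scored, reverse=True)]


# Toy corpus (an assumption for illustration; the project trains on WSJ text).
corpus = [
    "the market rose sharply today".split(),
    "the market fell sharply today".split(),
    "investors sold shares today".split(),
]
unigrams, bigrams = train_bigram_counts(corpus)
ranking = rank_options(
    "the market ____ sharply today", ["rose", "banana", "shares"], unigrams, bigrams
)
best = ranking[0]  # the highest-scoring option is chosen as the answer
```

Keeping the full ranking, rather than only the best option, makes it possible to compute top-2 accuracy later from the same output.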

##Running the Code

Before running the code, extract the compressed files included in the dataset/ directory. If you get an error while running NGram.py, make sure the NLTK punkt tokenizer is installed (if not, run nltk.download() and install punkt).

Dependencies

- Python3: `brew install python3`
- NumPy: `pip3 install numpy`
- NLTK: `pip3 install nltk`

n-Gram Model

To run the n-gram model, use the command `python3 NGram.py`.

Word Embeddings Model

To run the word embeddings model, use the command `python3 word_embeddings.py`.

LSTM Model

To run the forward (one-directional) LSTM model, use the command `python3 LSTM.py --data_path=. --model test`.

The `--model` flag may be any of small, medium, large, or test. We recommend test (a very small configuration) to keep the running time short.

To run the LSTM in the backward direction, use the command `python3 LSTM.py --data_path=. --model test --bidirectional True`.

Running the LSTM model saves the output in text files named forward_out_.txt or backward_out_.txt, depending on the direction.

##Data/Corpora

Our training data consists of text from the Wall Street Journal. Our test data consists of multiple-choice fill-in-the-blank questions with answer keys, which we use to measure accuracy.

Link to download corpus: https://www.dropbox.com/s/ttne74p1jsjbzwe/LDC95T7.tgz?dl=0

Link to download word vectors: http://aadah.me/misc/vectors.txt

Corpus folders should go under dataset/, and vectors.txt should be in the root folder of the repository.

##Baselines

We will compare our results against multiple baselines, such as random answering and average SAT test scores. We will also compare them to previous, similar classifiers.
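For the random-answering baseline, the expected accuracy follows directly from the number of options per question. A minimal sketch, assuming each question lists its options and a guess is drawn uniformly at random; the function name and the five-option example are assumptions, not taken from the project code:

```python
def expected_random_accuracy(option_counts):
    """Expected accuracy of uniform random guessing.

    option_counts holds the number of answer choices for each question;
    a uniform guess on a question with n choices is right with probability 1/n.
    """
    if not option_counts:
        raise ValueError("need at least one question")
    return sum(1.0 / n for n in option_counts) / len(option_counts)


# Hypothetical test set: four questions with five choices each.
baseline = expected_random_accuracy([5, 5, 5, 5])
```

Any model worth reporting should clear this number by a comfortable margin on every difficulty level.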

##Evaluation

To measure the performance of our system, we will analyze the accuracy of our model in answering easy, medium, and hard questions. We will also look at top-2 accuracy, where we check whether the correct answer was one of our top two choices. Another consideration will be the speed of our algorithm and whether it runs within a reasonable time span with a reasonable amount of resources.
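Accuracy and top-2 accuracy can both be computed from a ranked list of options per question. The sketch below is illustrative only and is not the project's metrics.py; the function name `top_k_accuracy` and the toy rankings are assumptions.

```python
def top_k_accuracy(rankings, answers, k=1):
    """Fraction of questions whose correct answer is among the first k ranked options."""
    if len(rankings) != len(answers):
        raise ValueError("rankings and answers must have the same length")
    if not answers:
        return 0.0
    hits = sum(1 for ranked, answer in zip(rankings, answers) if answer in ranked[:k])
    return hits / len(answers)


# Toy data: each inner list is a model's ranking of the options, best first.
rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "b", "a"],
    ["a", "c", "b"],
]
answers = ["a", "a", "a", "b"]

top1 = top_k_accuracy(rankings, answers, k=1)  # plain accuracy
top2 = top_k_accuracy(rankings, answers, k=2)  # correct answer in the top two
```

To break results down by difficulty, the same function can be applied separately to the easy, medium, and hard subsets of the questions.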

###Report link for project updates/ “lab notebook”

https://docs.google.com/document/d/1ATceKZNPIFECAnmQ0eaU2_6yU07pu4EiijlSCzAv0do/edit
