- Training data set is provided by Google's Natural Questions
- Data is cleaned and stored in MongoDB (notebook: TFQA-bilstm-attn.ipynb)
- The model uses parallel bi-directional LSTMs, one branch for the question text and one for the answer text; the second LSTM layer in each branch is followed by a multi-head attention layer
- The outputs of the parallel branches are concatenated and passed through dense layers, with a sigmoid activation at the final layer
- The spaCy large-vocabulary model supplies the token IDs and word vectors for each word
- The model is built in notebooks: TFQA-bilstm-attn.ipynb
- Weights from the trained model are saved to a file that is too large to upload to this repository, but they can readily be regenerated by re-running the notebook
- A Flask demo app is included with Bootstrap 4 styling:
- Trained model weights are loaded when the demo app starts
- The start page has input boxes for a question and a text corpus
- The text corpus is parsed into candidate one-, two- and three-sentence answers, which are fed to the trained model
- The three highest-scoring candidates are listed as potential answers, with scores in [0, 1)
- Named entities found in the text corpus are also displayed
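The parallel-branch architecture described above can be sketched in Keras; the layer sizes, sequence lengths, and vocabulary size below are illustrative assumptions, not the values used in TFQA-bilstm-attn.ipynb:

```python
# Hypothetical sketch of the parallel bi-LSTM + multi-head attention model.
# Sizes (units, embed_dim, seq_len, vocab_size) are assumptions for
# illustration only.
from tensorflow.keras import layers, Model

def build_branch(seq_len, vocab_size, embed_dim=96, units=64):
    """One branch: embedding -> two bi-LSTM layers -> multi-head attention."""
    inp = layers.Input(shape=(seq_len,))
    x = layers.Embedding(vocab_size, embed_dim)(inp)
    x = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(x)
    x = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(x)
    # multi-head self-attention over the second LSTM layer's outputs
    x = layers.MultiHeadAttention(num_heads=4, key_dim=units)(x, x)
    x = layers.GlobalMaxPooling1D()(x)
    return inp, x

q_in, q_vec = build_branch(seq_len=30, vocab_size=20000)    # question branch
a_in, a_vec = build_branch(seq_len=120, vocab_size=20000)   # answer branch

# concatenate the parallel branches into dense layers, sigmoid at the end
merged = layers.concatenate([q_vec, a_vec])
merged = layers.Dense(64, activation="relu")(merged)
out = layers.Dense(1, activation="sigmoid")(merged)  # answer score in [0, 1)
model = Model(inputs=[q_in, a_in], outputs=out)
```

The single sigmoid output treats answer selection as binary relevance scoring, which matches the [0, 1) candidate scores shown in the demo app.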
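The candidate-answer windowing and top-three ranking steps can be sketched in plain Python. The function names are hypothetical, and the regex sentence splitter is an assumption to keep the sketch dependency-free (the app itself may segment sentences with spaCy):

```python
# Hypothetical sketch of splitting the corpus into one-, two- and
# three-sentence candidate answers and ranking them by model score.
import re

def candidate_answers(corpus, max_window=3):
    # naive sentence split on ., ! or ? followed by whitespace
    sents = [s.strip() for s in re.split(r"(?<=[.!?])\s+", corpus) if s.strip()]
    cands = []
    for n in range(1, max_window + 1):          # window sizes 1, 2, 3
        for i in range(len(sents) - n + 1):     # slide window over sentences
            cands.append(" ".join(sents[i:i + n]))
    return cands

def top_answers(cands, score_fn, k=3):
    # score every candidate and keep the k highest-scoring ones
    return sorted(((score_fn(c), c) for c in cands), reverse=True)[:k]
```

In the demo app the role of `score_fn` would be played by the trained model's predict call on the encoded question/candidate pair.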