A PyTorch implementation of BiDAF based on CNNs.
The CNN module is GLDR, mostly based on *Fast Reading Comprehension with ConvNets*.
The model encoder block is based on QAnet.
- This repository is a combination of QAnet and GLDR. I did this because QAnet's multi-head attention needs more memory.
- During evaluation I use beam search with a beam size of 5 instead of traversing all start/end probability pairs.
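The GLDR encoder replaces self-attention with stacked dilated convolutions gated by gated linear units (GLUs). Below is a minimal sketch of one such residual block; the class name, channel sizes, and depth are illustrative assumptions, not taken from this repo:

```python
import torch
import torch.nn as nn

class GatedDilatedConv(nn.Module):
    """One GLU residual block with a dilated 1-D convolution,
    in the spirit of GLDR (sizes here are illustrative)."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        padding = (kernel_size - 1) // 2 * dilation  # keep sequence length fixed
        # Produce 2*channels so one half can gate the other half (GLU).
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size,
                              padding=padding, dilation=dilation)

    def forward(self, x):  # x: (batch, channels, seq_len)
        a, b = self.conv(x).chunk(2, dim=1)
        return x + a * torch.sigmoid(b)  # gated output plus residual

# Stack blocks with exponentially growing dilation to widen the receptive field.
encoder = nn.Sequential(*[GatedDilatedConv(64, dilation=2 ** i) for i in range(3)])
x = torch.randn(2, 64, 50)
y = encoder(x)
print(y.shape)  # torch.Size([2, 64, 50])
```

Because each block doubles the dilation, three blocks already cover a receptive field of 15 tokens at a fraction of the memory cost of multi-head attention.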
- Run `download.sh` to download the SQuAD dataset and GloVe word embeddings.
- Run `python config.py --mode preprocess` to preprocess the data and start the first training run.
- Run `python config.py --mode train` to train the model, or `python config.py --mode train --model modelname` to finetune a model (e.g. `python config.py --mode train --model mode_final.pkl`).
- Run `python config.py --mode dev --model modelname` to evaluate the model; the answer file will be stored. Because this process is the same as testing, I didn't duplicate the test() function.
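The evaluation step picks the answer span maximizing `p_start[i] * p_end[j]` with `i <= j`; the beam search mentioned above scans only the top-k start positions instead of every pair. A hedged sketch of the idea (the function name, the `max_len` window, and the toy tensors are my own assumptions, not the repo's exact routine):

```python
import torch

def beam_search_span(p_start, p_end, beam_size=5, max_len=30):
    """Find a high-scoring answer span by trying only the top
    `beam_size` start positions instead of all (start, end) pairs."""
    top_p, top_i = torch.topk(p_start, beam_size)
    best_score, best_span = -1.0, (0, 0)
    for p_s, i in zip(top_p.tolist(), top_i.tolist()):
        # Restrict the end position to a window after the start.
        window = p_end[i:i + max_len]
        p_e, offset = window.max(dim=0)
        score = p_s * p_e.item()
        if score > best_score:
            best_score, best_span = score, (i, i + offset.item())
    return best_span, best_score

# Toy distributions over a 5-token context.
p_start = torch.tensor([0.1, 0.6, 0.1, 0.1, 0.1])
p_end   = torch.tensor([0.05, 0.1, 0.7, 0.1, 0.05])
span, score = beam_search_span(p_start, p_end, beam_size=2)
print(span)  # (1, 2)
```

With a beam of 5 this costs O(5 · max_len) per question instead of O(n · max_len) over all start positions.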
- The model trains fast and reaches a good result after about 3 hours on a Titan XP (12 GB memory).
- The best score I have measured is F1 74 on the dev set without any finetuning. Most hyperparameters are taken from other models, so I don't know whether they are good enough.
Model | EM | F1 |
---|---|---|
QAnet | 73.6 | 82.7 |
GLDR | 68.2 | 77.2 |
BiDAF | 67.7 | 77.3 |
FastBiDAF | 63.7 | 74.3 |
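EM and F1 in the table follow the standard SQuAD definitions: exact string match and token-level overlap between prediction and ground truth. A simplified sketch (the official evaluation script additionally strips articles and punctuation, which is omitted here):

```python
from collections import Counter

def exact_match(prediction, ground_truth):
    """1.0 if the prediction matches the answer exactly (case-insensitive)."""
    return float(prediction.lower().strip() == ground_truth.lower().strip())

def f1_score(prediction, ground_truth):
    """Token-level F1: harmonic mean of precision and recall over tokens."""
    pred_tokens = prediction.lower().split()
    gt_tokens = ground_truth.lower().split()
    common = Counter(pred_tokens) & Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Denver Broncos", "Denver Broncos"))          # 1.0
print(round(f1_score("the Denver Broncos", "Denver Broncos"), 2))  # 0.8
```

Over the dev set, both scores are averaged per question, taking the maximum over the provided reference answers.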
- Feel free to test my code and report your performance. If you have enough time, finetuning the model (dropout, number of conv layers, etc.) is a good way to get better results.