FastBiDAF

A PyTorch implementation of BiDAF based on CNNs.
The CNN module is GLDR, mostly following Fast Reading Comprehension with ConvNets.
The model encoder block is based on QANet.
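The GLDR part stacks dilated-convolution residual blocks with GLU gating. Below is a minimal sketch of one such block, assuming a hidden size `d_model` and a per-block dilation factor; the class name, layer sizes, and defaults are illustrative and may differ from the ones used in this repository.

```python
import torch.nn as nn
import torch.nn.functional as F

class GLDRBlock(nn.Module):
    """One dilated-convolution residual block with a GLU gate (sketch)."""
    def __init__(self, d_model: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        padding = (kernel_size - 1) // 2 * dilation  # keep the sequence length unchanged
        # 2 * d_model output channels: one half for values, one half for the GLU gate.
        self.conv = nn.Conv1d(d_model, 2 * d_model, kernel_size,
                              padding=padding, dilation=dilation)

    def forward(self, x):                      # x: (batch, d_model, seq_len)
        return x + F.glu(self.conv(x), dim=1)  # gated conv output plus residual
```

Such blocks are typically stacked with exponentially growing dilations (1, 2, 4, ...) so the receptive field widens quickly without adding parameters.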

Differences from the papers

  1. This repository is a combination of QANet and GLDR. I did this because QANet needs more memory due to its multi-head attention.
  2. I use beam search in the evaluation process with a beam size of 5 instead of traversing all span probabilities (a sketch is shown after this list).
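A minimal sketch of this beam-search span selection, assuming the model outputs per-position start and end probabilities for a single passage; the function and variable names (`best_span`, `p_start`, `p_end`) are illustrative, not this repository's actual API.

```python
import torch

def best_span(p_start: torch.Tensor, p_end: torch.Tensor, beam_size: int = 5):
    """Return the (start, end) pair maximizing p_start[i] * p_end[j],
    searching only the top-`beam_size` positions of each distribution
    rather than all O(n^2) pairs."""
    top_s = torch.topk(p_start, beam_size)        # candidate start positions
    top_e = torch.topk(p_end, beam_size)          # candidate end positions
    best_score, best = 0.0, (0, 0)
    for ps, i in zip(top_s.values.tolist(), top_s.indices.tolist()):
        for pe, j in zip(top_e.values.tolist(), top_e.indices.tolist()):
            if i <= j and ps * pe > best_score:   # a valid span ends at or after its start
                best_score, best = ps * pe, (i, j)
    return best
```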

Usage

  1. Run download.sh to download the SQuAD dataset and GloVe word embeddings.
  2. Run python config.py --mode preprocess to preprocess the data and start the first training run.
  3. Run python config.py --mode train to train the model, or python config.py --mode train --model modelname to fine-tune an existing model (e.g. python config.py --mode train --model mode_final.pkl).
  4. Run python config.py --mode dev --model modelname to evaluate the model; the answer file will be stored. Because this process is the same as testing, there is no separate test() function.

Performance

  1. The model trains fast and reaches a good result after about 3 hours on a Titan XP (12 GB memory).
  2. The best score I have measured is F1 74 on the dev set without any fine-tuning. Most hyperparameters are taken from other models, so there may still be room for improvement.

Model     | EM   | F1
----------|------|-----
QANet     | 73.6 | 82.7
GLDR      | 68.2 | 77.2
BiDAF     | 67.7 | 77.3
FastBiDAF | 63.7 | 74.3

Contributions

  1. You are welcome to test my code and report your performance. If you have enough time, fine-tuning the model (dropout, number of conv layers, etc.) is a good way to get better results.
