An implementation of Dynamic Memory Networks for Visual and Textual Question Answering on the bAbI question-answering tasks, using TensorFlow.
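As a rough, hypothetical illustration of the idea behind the episodic memory module (not the actual code in this repository), one pass computes attention gates from similarities between each fact, the question, and the current memory, then updates the memory with the attention-weighted episode. Everything below (function names, shapes, the simplified gate features, and the averaged memory update) is an assumption for exposition only:

```python
import numpy as np

# Rough sketch of one episodic-memory pass (simplified from the DMN paper;
# not the actual code in models/ in this repository).
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def episode(facts, question, memory, W, b):
    """facts: (num_facts, d); question, memory: (d,); W: (4*d,); b: scalar."""
    gates = []
    for f in facts:
        # Interaction features between a fact, the question and the current memory.
        z = np.concatenate([f * question, f * memory,
                            np.abs(f - question), np.abs(f - memory)])
        gates.append(W.dot(z) + b)
    attention = softmax(np.array(gates))
    # Soft attention: the episode is the attention-weighted sum of the facts.
    return attention.dot(facts)

d, num_facts = 4, 3
rng = np.random.RandomState(0)
facts = rng.randn(num_facts, d)
question = rng.randn(d)
memory = question.copy()             # the memory starts as the question vector
W, b = rng.randn(4 * d), 0.0
for _ in range(3):                   # several episodes refine the memory
    e = episode(facts, question, memory, W, b)
    memory = 0.5 * memory + 0.5 * e  # stand-in for the GRU/ReLU memory update
```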
- Python 3.x
- TensorFlow 0.8+
- NumPy
- tqdm (progress bar module)
First, you need to install the dependencies:
sudo pip install tqdm
git clone https://github.com/therne/dmn-tensorflow && cd dmn-tensorflow
Then download the dataset:
mkdir data
curl -O http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz
tar -xzf tasks_1-20_v1-2.tar.gz -C data/
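For reference, each extracted task file is plain text in the standard bAbI format: every line starts with a sentence number, and question lines contain a tab-separated question, answer, and supporting fact IDs. The snippet below is a hypothetical parsing sketch, not the data loader used in this repository, and the example path assumes the archive was extracted into data/ as above:

```python
# Hypothetical parsing sketch for the bAbI text format (not this repo's loader).
# A question line looks like: "3 Where is Mary?\tbathroom\t1"
def parse_babi(path):
    stories, story = [], []
    for line in open(path):
        idx, text = line.strip().split(' ', 1)
        if int(idx) == 1:            # numbering resets when a new story starts
            story = []
        if '\t' in text:             # question line: question \t answer \t support
            question, answer, support = text.split('\t')
            stories.append((list(story), question, answer, support.split()))
        else:                        # plain statement line
            story.append(text)
    return stories

# Example usage (the exact path is an assumption):
# tasks = parse_babi('data/tasks_1-20_v1-2/en/qa1_single-supporting-fact_train.txt')
```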
If you want to run the original DMN (models/old/dmn.py), you also need to download the GloVe word embedding data:
curl -O http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip -d data/glove/
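The zip contains plain-text files (e.g. glove.6B.100d.txt) in which each line is a word followed by its space-separated vector components. Below is a minimal, hypothetical loading sketch; it is not necessarily how models/old/dmn.py reads the embeddings, and the path assumes the unzip command above:

```python
import numpy as np

# Hypothetical sketch: build a word -> vector dict from a GloVe text file.
def load_glove(path='data/glove/glove.6B.100d.txt'):
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings

# vectors = load_glove()
# vectors['where'].shape  # (100,)
```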
To train the network on a specific bAbI task:
./main.py --task [bAbI task number]
To evaluate a trained model:
./main.py --test --task [bAbI task number]
The model was trained 20 times and the best results were picked, using the DMN+ model with the paper's settings (batch size 128, 3 episodes, 80 hidden units, L2 regularization) plus batch normalization. Tasks not listed below achieved 0% error.
| Task | Error Rate |
|---|---|
| Two supporting facts | 25.1% |
| Three supporting facts | (N/A) |
| Three argument relations | 1.1% |
| Compound coreference | 1.5% |
| Time reasoning | 0.8% |
| Basic induction | 52.3% |
| Positional reasoning | 13.1% |
| Size reasoning | 6.1% |
| Path finding | 3.5% |
| Average | 5.1% |
Overfitting occurs on some tasks, so the error rates are higher than the paper's results. I think some additional regularization is needed.
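For example, one common way to add regularization in TensorFlow 1.x-style code is dropout on a hidden layer plus an L2 penalty on the weights. This is a generic sketch under those assumptions, not a patch to this repository, and all names in it are illustrative:

```python
import tensorflow as tf

# Generic TF1-style regularization sketch (names are illustrative, not repo flags).
x = tf.placeholder(tf.float32, [None, 80])   # e.g. 80 hidden units
keep_prob = tf.placeholder(tf.float32)       # set < 1.0 during training only
w = tf.get_variable('w', [80, 80])
h = tf.nn.relu(tf.matmul(x, w))
h = tf.nn.dropout(h, keep_prob)              # dropout regularization

l2_lambda = 1e-3                             # weight-decay strength
l2_loss = l2_lambda * tf.nn.l2_loss(w)       # add this term to the task loss
```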
- Implementing Dynamic Memory Networks by YerevaNN - a great article that helped me a lot
- Dynamic-memory-networks-in-Theano
- More regularization and hyperparameter tuning
- Visual question answering
- Attention visualization
- Interactive mode?