Project 2: Neural Machine Translation models for English-French sentences from Multi30k Task 1 dataset
Train a model using the following command:
python train.py --english_train training/train.en --french_train training/train.fr --english_valid val/val.en --french_valid val/val.fr --enc_type avg --dec_type gru --attention dot --lr 0.0005 --tf_ratio 0.75 --batch_size 32 --epochs 10 --dim 400 --num_symbols 10000 --min_count 1 --max_length 74 [--lower] [--enable_cuda]
The corpus is truecased by default. Set the --lower flag to lowercase it instead. If a GPU is available, set the --enable_cuda flag.
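To make the role of the optional flags concrete, here is a minimal sketch of how such a command line could be parsed with argparse. The flag names and option values are taken from this README; the types, defaults, and the `build_parser` helper are assumptions for illustration and may not match what train.py actually does.

```python
import argparse


def build_parser():
    # Hypothetical parser: flag names come from the README command,
    # types and defaults are assumptions.
    parser = argparse.ArgumentParser(description="Train an NMT model (sketch)")
    parser.add_argument("--enc_type", choices=["avg", "transformer", "gru"], default="avg")
    parser.add_argument("--attention", choices=["dot", "bilinear", "multihead"], default="dot")
    parser.add_argument("--lr", type=float, default=0.0005)
    parser.add_argument("--tf_ratio", type=float, default=0.75)
    parser.add_argument("--batch_size", type=int, default=32)
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--lower", action="store_true")
    parser.add_argument("--enable_cuda", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--enc_type", "gru", "--lower"])
print(args.enc_type, args.lower, args.enable_cuda)
```

Because --lower and --enable_cuda are plain switches, they take no value: leaving them out keeps truecasing and CPU execution.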
For some arguments there are multiple options:
- Encoder: avg | transformer | gru
- Attention: dot | bilinear | multihead
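As a rough illustration of how the dot and bilinear attention options differ, the sketch below scores a decoder state against a list of encoder states using the textbook definitions. It uses plain Python lists rather than the tensors the project works with, the function names are hypothetical, and multihead attention is left out for brevity.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(query, key):
    # Dot attention: score = query . key
    return sum(q * k for q, k in zip(query, key))


def bilinear_score(query, key, W):
    # Bilinear attention: score = query^T W key, with W a learned matrix.
    Wk = [sum(w * k for w, k in zip(row, key)) for row in W]
    return dot_score(query, Wk)


def attend(query, keys, score_fn):
    # Normalise the scores and return the weighted average of the keys.
    weights = softmax([score_fn(query, k) for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return weights, context


decoder_state = [1.0, 0.0]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned W

dot_weights, dot_context = attend(decoder_state, encoder_states, dot_score)
bil_weights, bil_context = attend(
    decoder_state, encoder_states, lambda q, k: bilinear_score(q, k, identity)
)
print(dot_weights)
```

With W set to the identity matrix the bilinear score reduces to the dot score, which is a quick way to sanity-check an implementation.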
Test a saved model using the following command:
python3 test.py --english test/test_2017_flickr.en --french test/test_2017_flickr.fr --encoder encoder_type=gru.pt --decoder decoder_type=gru.pt --corpus corpus.pickle --max_length 74 [--enable_cuda] [--transformer]
If testing with different decoders or encoders, replace the path names. If a GPU is available, set the --enable_cuda flag. If testing a transformer encoder, set the --transformer flag, which is needed for visualization of the attention weights.
Examples of attention visualizations can be found in the Project2/visualization folder.
Our results on the Multi30k test data (En-Fr) are the following:
Encoder | Attention | BLEU | TER | METEOR |
---|---|---|---|---|
Averaging | Multihead | 30.08 | 47.89 | 30.29 |
GRU | Multihead | 33.81 | 44.60 | 32.10 |
Transformer | Multihead | 31.21 | 51.55 | 31.22 |
For the first project we implemented the IBM Model 1 and IBM Model 2 word alignment models, each trained with either Expectation Maximisation (EM) or Variational Bayes (VB).
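For readers unfamiliar with the model, here is a minimal sketch of IBM Model 1 trained with EM on a toy corpus. It follows the textbook algorithm, not the notebooks in this repository: the function names, the `<NULL>` token, the uniform initialisation, and the toy sentences are all assumptions for illustration. The VB variant is not shown.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical empty source word that unaligned target words attach to


def train_ibm1(pairs, iterations=10):
    """Estimate translation probabilities t[(f, e)] = p(f | e) with EM."""
    pairs = [([NULL] + src, tgt) for src, tgt in pairs]
    tgt_vocab = {f for _, tgt in pairs for f in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialisation
    for _ in range(iterations):
        counts = defaultdict(float)  # expected count of (f, e) links
        totals = defaultdict(float)  # expected count of e
        # E-step: distribute each target word over the source words.
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    delta = t[(f, e)] / norm
                    counts[(f, e)] += delta
                    totals[e] += delta
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(float, {(f, e): c / totals[e] for (f, e), c in counts.items()})
    return t


def align(src, tgt, t):
    """For each target word, return the index of its best source word (0 = NULL)."""
    src = [NULL] + src
    return [max(range(len(src)), key=lambda j: t[(f, src[j])]) for f in tgt]


toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm1(toy_corpus)
print(align(["the", "house"], ["das", "Haus"], t))
```

IBM Model 2 extends this by adding an alignment (position) distribution on top of the lexical probabilities; the E-step and M-step keep the same shape.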
The models are trained using data from a parallel corpus. Code to read in the data can be found in the file data.py.
Words occurring once are mapped to the token '-UNK-'.
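A minimal sketch of this preprocessing step is shown below. The `replace_singletons` helper is hypothetical and only illustrates the idea; the actual logic in data.py may differ, for example in how counts are shared between training and test data.

```python
from collections import Counter

UNK = "-UNK-"


def replace_singletons(sentences):
    # Count every token over the whole corpus, then replace tokens seen once.
    counts = Counter(tok for sent in sentences for tok in sent)
    return [[tok if counts[tok] > 1 else UNK for tok in sent] for sent in sentences]


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
print(replace_singletons(corpus))
# → [['the', '-UNK-', 'sat'], ['the', '-UNK-', 'sat']]
```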
Code to evaluate the alignments found is given in the file aer.py.
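For reference, the alignment error rate (AER) is commonly defined as 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the sure gold links, and P the possible gold links (with S contained in P). The sketch below implements that standard definition; the function name and signature are assumptions and need not match aer.py.

```python
def alignment_error_rate(predicted, sure, possible):
    """AER over sets of (source_index, target_index) links."""
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # sure links always count as possible
    if not a and not s:
        return 0.0  # nothing to predict and nothing predicted
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


sure_links = {(1, 1), (2, 2)}
possible_links = {(3, 2)}
predicted_links = {(1, 1), (3, 2), (2, 3)}

aer = alignment_error_rate(predicted_links, sure_links, possible_links)
print(round(aer, 4))
# → 0.4
```

Lower is better: predicting exactly the sure links gives an AER of 0.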
The models themselves are implemented in interactive Jupyter notebooks:
- IBM1-EM.ipynb
- IBM1-VB.ipynb
- IBM2-EM.ipynb
- IBM2-VB.ipynb
Run the notebooks one by one to retrain the models. Results on the test data are provided in NAACL format in the folder test_results.
The following image presents an example alignment for IBM1 VB:
The following image presents an example alignment for IBM1 EM:
The results on test data were the following:
Model | Training | Selection | AER |
---|---|---|---|
IBM 1 | MLE | AER | 0.2852 |
IBM 1 | VB | AER | 0.2866 |
IBM 1 | MLE | LL | 0.2856 |
IBM 1 | VB | ELBO | 0.2863 |
IBM 2 | MLE | AER | 0.2068 |
IBM 2 | VB | AER | 0.2054 |
IBM 2 | MLE | LL | 0.2047 |
IBM 2 | VB | ELBO | 0.2036 |