Code for Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation

ghchen18/leca

Code for the paper "Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation"

Installation and data preprocessing

The code is built on fairseq v0.6.1. Follow the same steps as for fairseq to install it and to prepare the processed fairseq dataset. The WMT preprocessing script is here.

Step 1: Install fairseq.

# You may want to create a conda environment first.
git clone https://github.com/ghchen18/leca.git
cd leca
pip install --editable .

Step 2: Process dataset

Follow the steps in the fairseq repo. More datasets are available from the WMT Translation Task. Because the dictionaries used here differ from those in the official fairseq, preprocess the data with the preprocess.py in this repo, not the one in the official fairseq repo.
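
As a rough illustration of this step, the sketch below calls this repo's preprocess.py with the standard fairseq preprocessing flags. The language pair, the data/ and data-bin/ paths, and the train/valid/test file prefixes are all assumptions made for the example, not values taken from this repo; the flags are those of upstream fairseq and the preprocess.py here may accept additional options.

```shell
# Hypothetical language pair and paths; replace them with your own.
SRC=en
TGT=de
DATA=data/wmt_en_de        # assumed: tokenized/BPE text, e.g. train.en, train.de
DEST=data-bin/wmt_en_de    # assumed: output directory for the binarized data

# Run from the root of this repo so that its own preprocess.py is used,
# not the one from the official fairseq repo.
if [ -f preprocess.py ]; then
  python preprocess.py \
    --source-lang "$SRC" --target-lang "$TGT" \
    --trainpref "$DATA/train" \
    --validpref "$DATA/valid" \
    --testpref "$DATA/test" \
    --destdir "$DEST"
else
  echo "preprocess.py not found; run this from the root of the leca repo" >&2
fi
```

The resulting directory ($DEST in this sketch) is what the data path variable in the training script would then point to.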

Run experiment

See scripts/run.sh. You may first need to adjust the variables in the shell script to match your setup.

Citation

@inproceedings{chen2020leca,
  title     = {Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation},
  author    = {Chen, Guanhua and Chen, Yun and Wang, Yong and Li, Victor O.K.},
  booktitle = {Proceedings of {IJCAI} 2020: Main track},
  pages     = {3587--3593},
  year      = {2020},
  month     = {7},
}
