This repository contains the source code for the models described in the following paper:
- "Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks", in Proceedings of EMNLP 2021.
The basic code structure was adapted from HuggingFace Transformers.

Dependencies:
- PyTorch 1.4.0/1.6.0/1.8.0
- PyTorch Lightning 0.7.6 (support for more recent versions of Lightning is coming soon)
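A minimal environment setup might look like this (a sketch: the repo does not specify exact install commands, and the right `torch` wheel depends on your CUDA setup):

```bash
# Sketch: install one supported PyTorch version plus PyTorch Lightning 0.7.6
# (version numbers taken from the requirements above).
pip install torch==1.8.0 pytorch-lightning==0.7.6
```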
- Download the original SCAN data from https://github.com/brendenlake/SCAN
- Download the SCAN MCD splits
- Organize the data into `data/scan` and make sure it follows this structure:
```
data
└── scan
    ├── tasks_test_mcd1.txt
    ├── tasks_train_mcd1.txt
    └── tasks_val_mcd1.txt
```
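For example, assuming the MCD1 split files have already been downloaded into the working directory (the exact download location is up to you), they can be arranged as follows:

```bash
# Sketch: place the downloaded MCD1 split files under data/scan/.
mkdir -p data/scan
mv tasks_train_mcd1.txt tasks_val_mcd1.txt tasks_test_mcd1.txt data/scan/
```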
- Train the model on the SCAN MCD1 splits by running:
```bash
./train_scan_scripts/train_auxseq_mcd1.sh
```
- By default, the top-5 best model checkpoints will be saved in `out/scan/auxseq-00`.
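To train on all three MCD splits, and assuming analogous scripts exist for MCD2 and MCD3 (an assumption; only the MCD1 script is referenced here), a simple loop would be:

```bash
# Hypothetical: the mcd2/mcd3 script names are inferred by analogy with
# train_auxseq_mcd1.sh and may not match the repo exactly.
for split in mcd1 mcd2 mcd3; do
  ./train_scan_scripts/train_auxseq_${split}.sh
done
```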
- Set the `EVAL_EPOCH` parameter in `eval_scan_scripts/eval_auxseq_mcd1.sh`.
- Evaluate the model on the SCAN MCD1 splits by running:
```bash
./eval_scan_scripts/eval_auxseq_mcd1.sh
```
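For instance, to evaluate the checkpoint saved at epoch 20, you would edit the variable along these lines (the epoch value is illustrative; check the script for how `EVAL_EPOCH` is actually consumed):

```bash
# Inside eval_scan_scripts/eval_auxseq_mcd1.sh (illustrative value):
EVAL_EPOCH=20  # presumably selects which checkpoint in out/scan/auxseq-00 to evaluate
```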
If you find this code useful, please cite:
```bibtex
@inproceedings{jiang-bansal-2021-inducing,
    title = "Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks",
    author = "Jiang, Yichen and Bansal, Mohit",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    year = "2021",
}
```