This repository contains the official code for the paper: "Improving and Simplifying Pattern Exploiting Training".
The model improves and simplifies PET with a decoupled label objective and label-conditioned MLM objective.
Figure panels: Decoupled Label Loss; Label Conditioned Masked Language Modelling
Set up the environment by running:
source bin/init.sh
This will:
- Download the FewGLUE and SuperGLUE datasets into data/fewglue/{task} and data/superglue/{task}, respectively.
- Install and set up the environment with the correct dependencies.
First, create a config JSON file with the necessary hyperparameters. For reference, see config/BoolQ.json.
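One way to create a new config is to copy the reference file and change only the values you need. The sketch below is a minimal illustration of that idea, not part of the repository: the helper name derive_config is hypothetical, and it assumes the config is a flat JSON object. It rejects keys that the reference file does not define, so a misspelled hyperparameter name fails loudly instead of being silently ignored.

```python
import json
from pathlib import Path


def derive_config(base_path, out_path, overrides):
    """Copy a reference config and overwrite selected top-level keys.

    Hypothetical helper: assumes the config file is a flat JSON object.
    """
    config = json.loads(Path(base_path).read_text())
    # Refuse keys the reference config does not define, to catch typos.
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"keys not present in {base_path}: {unknown}")
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=4) + "\n")
    return config
```

For example, calling derive_config("config/BoolQ.json", "config/BoolQ_custom.json", overrides) with a dictionary of changed values writes a new file that can then be passed to bin/train.sh. The output file name here is only an illustration.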
Then, to train the model, run the following commands:
sh bin/setup.sh
sh bin/train.sh {config_file}
The output will be written to the experiment directory exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/. Once the model has been trained, the following files can be found in that directory:
exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/
|
|__ best_model.pt
|__ dev_scores.json
|__ config.json
|__ dev_logits.npy
|__ src
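Because each run is written to its own {timestamp} directory, it can be handy to look up the most recent one before calling the evaluation scripts. The following is a minimal sketch under stated assumptions: the helper name latest_experiment_dir is hypothetical, and it assumes timestamps are zero-padded in the form 2021-08-10-13-34-34, so that sorting the directory names as strings also sorts them by time.

```python
from pathlib import Path


def latest_experiment_dir(task_name, model="albert-xxlarge-v2", root="exp_out/fewglue"):
    """Return the most recent run directory for a task, or None if there is none.

    Hypothetical helper: assumes run directories are named with zero-padded
    timestamps such as 2021-08-10-13-34-34.
    """
    task_dir = Path(root) / task_name / model
    if not task_dir.is_dir():
        return None
    # Zero-padded timestamps sort chronologically when compared as strings.
    runs = sorted((p for p in task_dir.iterdir() if p.is_dir()), key=lambda p: p.name)
    return runs[-1] if runs else None
```

The returned path can be passed directly to bin/dev.sh or bin/test.sh.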
To aid reproducibility, we provide the JSON files to replicate the paper's results at config/{task_name}.json.
To evaluate the model on the SuperGLUE dev set, run the following command:
sh bin/dev.sh exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/
The dev scores can be found in exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/dev_scores.json.
To evaluate the model on the SuperGLUE test set, run the following command:
sh bin/test.sh exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/
The generated predictions can be found in exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/test.json.
Our fine-tuned models can be found in this link.
To evaluate these fine-tuned models for different tasks, run the following command:
python src/run_pretrained.py -m {finetuned_model_dir}/{task_name} -c config/{task_name}.json -k pattern={best_pattern_for_task}
The scores can be found in exp_out/fewglue/{task_name}/albert-xxlarge-v2/{timestamp}/dev_scores.json.
Note: The value of best_pattern_for_task for each task can be found in Table 4 of the paper.
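When evaluating several tasks, the run_pretrained.py command can be assembled from a task-to-pattern mapping. The sketch below only builds and prints the commands. The function names and the patterns argument are hypothetical, and the mapping must be filled in by hand from Table 4 of the paper, since no pattern values are reproduced here.

```python
import shlex


def run_pretrained_command(task_name, pattern, finetuned_model_dir):
    """Build the argument list for src/run_pretrained.py for one task."""
    return [
        "python", "src/run_pretrained.py",
        "-m", f"{finetuned_model_dir}/{task_name}",
        "-c", f"config/{task_name}.json",
        "-k", f"pattern={pattern}",
    ]


def print_commands(patterns, finetuned_model_dir):
    """Print one shell-quoted command per task.

    `patterns` maps task name -> best pattern (to be taken from Table 4).
    """
    for task_name, pattern in patterns.items():
        args = run_pretrained_command(task_name, pattern, finetuned_model_dir)
        print(shlex.join(args))
```

Returning a list of arguments instead of a single string keeps the command safe to pass to subprocess.run without shell quoting issues.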
For questions about the work, please contact Derek (dtredsox@cs.unc.edu) or Rakesh (rrmenon@cs.unc.edu). For bugs or other issues with the code, feel free to open a GitHub issue or pull request.
Please cite us if ADAPET is useful in your work:
@article{tam2021improving,
title={Improving and Simplifying Pattern Exploiting Training},
author={Tam, Derek and Menon, Rakesh R and Bansal, Mohit and Srivastava, Shashank and Raffel, Colin},
journal={arXiv preprint arXiv:2103.11955},
year={2021}
}
source env/bin/activate.fish
export PET_ELECTRA_ROOT=(pwd)
python -m pdb -c continue -m src.train -c config/DO.json
python -m src.test -e exp_out/dosentencepairs/albert-base-v2/2021-08-10-13-34-34/
Results are then in exp_out/dosentencepairs/albert-base-v2/2021-08-10-13-34-34/test.json