Code for the EMNLP 2021 paper "CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization"
- Code for unlikelihood training and in-batch negatives has been added. Please check train_xsum_batch_neg.sh and train_xsum_single_neg_ull.sh. The related Fairseq code is in unlikelihood_translation.py and contrastive_translation_batch_neg.py.
- A cleaner implementation is available; it uses less system RAM and is compatible with the current version of Fairseq. Check here.
- We find that the newer version of QuestEval produces much lower scores than the version (commit 0e94a74) we used in our paper. Please do not directly take the QuestEval results from the paper if you are using the newer version.
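If you need to reproduce the QuestEval numbers reported in the paper, one option is to install QuestEval pinned to that commit. A minimal sketch, assuming the package is hosted at github.com/ThomasScialom/QuestEval (adjust the URL if you use a different fork):

# Install QuestEval pinned to the commit used in the paper (repository URL assumed)
pip install "git+https://github.com/ThomasScialom/QuestEval.git@0e94a74"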
For data construction, please refer to data_construction. Constructed datasets are also available in Google Drive.
The following scripts require that your $DATA folder is organized the same as the data folder in Google Drive.
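For reference, here is a partial sketch of the layout implied by the paths used in this README (the actual folder in Google Drive may contain additional files):

# Partial layout of $DATA, inferred only from the paths referenced below
#   $DATA/xsum_synthetic/negative_syslowcon
#   $DATA/xsum_synthetic/negative_swapent
#   $DATA/cnndm_synthetic/negative_syslowcon
#   $DATA/cnndm_synthetic/negative_swapent
#   $DATA/xsum_raw/test.source
#   $DATA/cnndm_raw/test.source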
Our experiments with BART use Fairseq at commit 0db28cd. Newer versions might also work.
Please download the pre-trained BART model here and set BART_PATH to the downloaded model:
export BART_PATH=/path/to/bart/model.pt
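The training and decoding commands below also assume that $DATA and $TRAINED_MODELS are set. For example (both paths are placeholders):

export DATA=/path/to/data
export TRAINED_MODELS=/path/to/trained_models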
The following commands train models with negative samples constructed by SysLowCon. They save the trained models in $TRAINED_MODELS/bart_xsum/syslowcon and $TRAINED_MODELS/bart_cnndm/syslowcon. Replace $DATA/xsum_synthetic/negative_syslowcon (or the CNN/DM equivalent) with another negative sample folder to train the corresponding model.
# XSum
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_single_neg.sh \
$DATA/xsum_synthetic/negative_syslowcon $TRAINED_MODELS/bart_xsum/syslowcon
# CNN/DM
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_cnndm_single_neg.sh \
$DATA/cnndm_synthetic/negative_syslowcon $TRAINED_MODELS/bart_cnndm/syslowcon
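For example, to train the XSum model on negatives constructed by SwapEnt instead of SysLowCon (the output folder name below is only a suggestion):

# XSum with SwapEnt negatives
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_single_neg.sh \
    $DATA/xsum_synthetic/negative_swapent $TRAINED_MODELS/bart_xsum/swapent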
The following commands train models with negative samples constructed by both SysLowCon and SwapEnt. They save the trained models in $TRAINED_MODELS/bart_xsum/syslowcon_swapent and $TRAINED_MODELS/bart_cnndm/syslowcon_swapent.
# XSum
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_multi_neg.sh \
"$DATA/xsum_synthetic/negative_syslowcon $DATA/xsum_synthetic/negative_swapent" \
$TRAINED_MODELS/bart_xsum/syslowcon_swapent
# CNN/DM
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_cnndm_multi_neg.sh \
"$DATA/cnndm_synthetic/negative_syslowcon $DATA/cnndm_synthetic/negative_swapent" \
$TRAINED_MODELS/bart_cnndm/syslowcon_swapent
Our experiments with Pegasus use Huggingface Transformers 4.5.1. Newer versions might also work.
# XSum
cd scripts/pegasus
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_single_neg.sh \
$DATA/xsum_synthetic/negative_syslowcon $TRAINED_MODELS/pegasus_xsum/syslowcon
# CNN/DM
cd scripts/pegasus
CUDA_VISIBLE_DEVICES=0,1 ./train_cnndm_single_neg.sh \
$DATA/cnndm_synthetic/negative_syslowcon $TRAINED_MODELS/pegasus_cnndm/syslowcon
The following examples show how to decode trained models. Model checkpoints are available in Google Drive.
# XSum
cd scripts/bart
./decode_xsum.sh $TRAINED_MODELS/bart_xsum/syslowcon/checkpoint_last.pt /path/to/save/dir
# CNN/DM
cd scripts/bart
./decode_cnndm.sh $TRAINED_MODELS/bart_cnndm/syslowcon/checkpoint_last.pt /path/to/save/dir
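To decode one of the released checkpoints from Google Drive instead of a model you trained yourself, pass the path of the downloaded checkpoint (placeholder below) as the first argument:

# XSum, using a downloaded checkpoint
cd scripts/bart
./decode_xsum.sh /path/to/downloaded/checkpoint.pt /path/to/save/dir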
# XSum
cd scripts/pegasus
python run_generation.py $DATA/xsum_raw/test.source $TRAINED_MODELS/pegasus_xsum/syslowcon /path/to/save/dir
# CNN/DM
cd scripts/pegasus
python run_generation.py $DATA/cnndm_raw/test.source $TRAINED_MODELS/pegasus_cnndm/syslowcon /path/to/save/dir