Please refer to the manuscript for a detailed description of the method and results.
We have released all models used in the manuscript, and highlighted the versions used in human evaluation.
Please don't hesitate to reach out if you are using Referee as a baseline and run into any issues running these models!
Model | Huggingface API URL |
---|---|
Referee-Distill Iteration 1 | msclar/referee-distill_iter-1 |
Referee-Distill Iteration 2 | msclar/referee-distill_iter-2 |
Referee-Distill Iteration 3 (Checkpoint used in human evaluation) | msclar/referee-distill_iter-3 |
Referee-Distill with context filter: iteratively generating shorter summaries (Information Bottleneck filter)
Model | Huggingface API URL |
---|---|
Referee-Distill (with context filter) Iteration 1 | msclar/referee-distill-with-context-filter_iter-1 |
Referee-Distill (with context filter) Iteration 2 | msclar/referee-distill-with-context-filter_iter-2 |
Referee-Distill (with context filter) Iteration 3 | msclar/referee-distill-with-context-filter_iter-3 |
Model | Huggingface API URL |
---|---|
Referee-Control Iteration 1 | msclar/referee-control_iter-1 |
Referee-Control Iteration 2 | msclar/referee-control_iter-2 |
Referee-Control Iteration 3 (Checkpoint used in human evaluation) | msclar/referee-control_iter-3 |
Referee-Control Iteration 4 | msclar/referee-control_iter-4 |
Referee-Control Iteration 5 | msclar/referee-control_iter-5 |
Referee-Control Iteration 6 | msclar/referee-control_iter-6 |
Referee-Control Iteration 7 | msclar/referee-control_iter-7 |
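The model IDs in the tables above can be loaded through the Huggingface `transformers` library. The following is a minimal sketch, not the authors' official usage code. It assumes the checkpoints load as causal language models via `AutoModelForCausalLM`, and the helper names `referee_model_id` and `generate` are hypothetical. The exact prompt format each model expects is not specified here, so the prompt is left entirely to the caller; consult the manuscript for it.

```python
# Model ID templates and available iterations, copied from the tables above.
REFEREE_FAMILIES = {
    "distill": ("msclar/referee-distill_iter-{}", range(1, 4)),
    "distill-with-context-filter": (
        "msclar/referee-distill-with-context-filter_iter-{}",
        range(1, 4),
    ),
    "control": ("msclar/referee-control_iter-{}", range(1, 8)),
}


def referee_model_id(family: str, iteration: int) -> str:
    """Build the Huggingface model ID for a model family and iteration (hypothetical helper)."""
    if family not in REFEREE_FAMILIES:
        raise ValueError(f"unknown family: {family!r}")
    template, iterations = REFEREE_FAMILIES[family]
    if iteration not in iterations:
        raise ValueError(f"no iteration {iteration} listed for family {family!r}")
    return template.format(iteration)


def generate(model_id: str, prompt: str, max_new_tokens: int = 30) -> str:
    """Generate a continuation for `prompt` with the given checkpoint.

    Assumption: the checkpoint is a causal LM. The prompt format is NOT
    defined here and must be supplied by the caller.
    """
    # Imported lazily so the ID helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `referee_model_id("distill", 3)` yields the ID of the Referee-Distill checkpoint used in human evaluation, which can then be passed to `generate` together with a prompt in the format the model was trained on.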
We released all the generated data from the models released above. Refer to data/README.md.
- The data released for Referee-Distill step i is exactly the data used in training Referee-Distill step i+1.
- The data released for Referee-Control step i is exactly the data used in training Referee-Control step i+1.
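The indexing rule above can be sketched as a small lookup. This is only an illustration of the stated rule; the function name `training_data_step` is hypothetical, and the rule as written says nothing about which data trained iteration 1, so that case is rejected rather than guessed.

```python
def training_data_step(model_iteration: int) -> int:
    """Return the released data step used to train the given model iteration.

    Implements the rule stated above: data from step i trains step i+1,
    within the same model family (Referee-Distill or Referee-Control).
    """
    if model_iteration < 2:
        # The rule above does not cover iteration 1, so we do not guess.
        raise ValueError("the step i -> step i+1 rule only applies to iterations >= 2")
    return model_iteration - 1
```

For instance, under this rule Referee-Control iteration 3 was trained on the data released for Referee-Control step 2.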
We are very thankful to OpenAI for the access to their API. Following the API usage guidelines, GPT-Instruct Curie generations are not released publicly.
Warning: all generated summaries are of RealNews sentences. Some RealNews articles may reflect biased or discriminatory views with which the authors do not agree.
This section is only relevant if you are trying to train your own Referee model. Otherwise, please use the Huggingface API directly (see links above).
All training scripts may be found in src/*.sh. Finetuning and generation scripts are kept separate so that you can recover quickly in case of server issues. Modify them according to your use case if needed!
Metrics scripts: to be uploaded soon!
If you used this code for your experiments or found it helpful, consider citing the following paper:
@inproceedings{sclar2022reference,
title = "Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation",
author = "Sclar, Melanie and
West, Peter and
Kumar, Sachin and
Tsvetkov, Yulia and
Choi, Yejin",
booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2022",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
}