Skip to content

Latest commit

 

History

History
11 lines (7 loc) · 603 Bytes

README.md

File metadata and controls

11 lines (7 loc) · 603 Bytes

Code for the EACL 2021 paper How to Evaluate a Summarizer: Study Design and Statistical Analysis for Manual Linguistic Quality Evaluation

Most experiments can be reproduced by running the accompanying jupyter notebook. For significance analysis run scrips/r/analyse-ordinal.r anonymized_judgements/<data_file> crossed

Power analysis is not included in the steps, as it is computationally expensive. To reproduce one step, run

python -m summaryanalysis.design_power -b <batch count> -d <docs per batch> -a <annotators per doc> <model_file> out.csv