Human ratings for the paper "Event knowledge in large language models: the gap between the impossible and the unlikely".
The main analysis repo: https://github.com/carina-kauf/lm-event-knowledge
Dataset 1 - EventsAdapt (based on Fedorenko et al, 2020)
Dataset 2 - DTFit (based on Vassallo et al, 2018)
Dataset 3 - EventsRev (based on Ivanova et al, 2021)
Human ratings for dataset 2 had been collected previously by Vassallo et al; human ratings for datasets 1 and 3 were collected specifically for this study. The final human ratings spreadsheet for these datasets can be found at Events***/analyses/longform_data.csv
Turkolizer (Gibson et al, 2011) scripts were used to sort sentences into lists, such that each participant saw no more than one variant of each sentence item (plausible/implausible, active/passive).
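The list constraint described above can be illustrated with a Latin-square rotation. This is a minimal sketch and not the Turkolizer code itself: the function name `assign_to_lists`, the item IDs, and the condition labels are hypothetical, and it assumes every item has all four variants. Under that assumption each list contains exactly one variant of each item, which satisfies the "no more than one" requirement.

```python
from collections import Counter


def assign_to_lists(items, conditions):
    """Rotate conditions across items (Latin square).

    Returns one list per condition; each list holds exactly one
    (item, condition) pair for every item.
    """
    n = len(conditions)
    lists = [[] for _ in range(n)]
    for item_idx, item_id in enumerate(items):
        for list_idx in range(n):
            # Shift the condition by the list index so that the same item
            # appears in a different variant on every list.
            condition = conditions[(item_idx + list_idx) % n]
            lists[list_idx].append((item_id, condition))
    return lists


# Hypothetical labels for the plausible/implausible x active/passive variants.
conditions = [
    "plausible-active",
    "plausible-passive",
    "implausible-active",
    "implausible-passive",
]
items = [f"item{i}" for i in range(1, 9)]  # 8 made-up item IDs

lists = assign_to_lists(items, conditions)

for list_idx, trials in enumerate(lists):
    per_condition = Counter(condition for _, condition in trials)
    print(f"list {list_idx}: {len(trials)} trials, {dict(per_condition)}")
```

When the number of items is a multiple of the number of conditions, as in this example, each list is also balanced across conditions.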