CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge

This repository contains the data and code for the baseline described in the following paper:

CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Yasumasa Onoe, Michael J.Q. Zhang, Eunsol Choi, Greg Durrett
NeurIPS 2021 Datasets and Benchmarks Track

@article{onoe2021creak,
  title={CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge},
  author={Onoe, Yasumasa and Zhang, Michael J.Q. and Choi, Eunsol and Durrett, Greg},
  journal={OpenReview},
  year={2021}
}

***** [New] November 8th, 2021: The contrast set has been updated. *****

We have increased the size of the contrast set to 500 examples. Please see the paper for the updated numbers.

Datasets

Examples

Data Files

CREAK data files are located under data/creak.

  • train.json contains 10,176 training examples.
  • dev.json contains 1,371 development examples.
  • test_without_labels.json contains 1,371 test examples (labels are not included).
  • contrast_set.json contains 500 contrastive examples.

The data files are formatted as jsonlines. Here is a single training example:

{
    "ex_id": "train_1423",
    "sentence": "Lauryn Hill separates two valleys as it is located between them.",
    "explanation": "Lauren Hill is actually a person and not a mountain.",
    "label": "false",
    "entity": "Lauryn Hill",
    "en_wiki_pageid": "162864",
    "entity_mention_loc": [[0, 11]]
}
| Field | Description |
| --- | --- |
| ex_id | Example ID |
| sentence | Claim |
| explanation | Annotator's explanation of why the claim is TRUE/FALSE |
| label | Label: 'true' or 'false' |
| entity | Seed entity |
| en_wiki_pageid | English Wikipedia page ID for the seed entity |
| entity_mention_loc | Location(s) of the seed entity mention in the claim |
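Because the files are jsonlines (one JSON object per line), they can be loaded with the Python standard library alone. A minimal sketch — the loader below and the commented-out path are illustrative, not part of the released code:

```python
import json

def load_jsonl(path):
    """Read a jsonlines file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# examples = load_jsonl("data/creak/train.json")

# Parsing a single record with the same shape as the training example above;
# entity_mention_loc gives [start, end) character offsets into the claim.
record = json.loads(
    '{"ex_id": "train_1423",'
    ' "sentence": "Lauryn Hill separates two valleys as it is located between them.",'
    ' "label": "false", "entity": "Lauryn Hill",'
    ' "entity_mention_loc": [[0, 11]]}'
)
start, end = record["entity_mention_loc"][0]
print(record["sentence"][start:end])  # → Lauryn Hill
```

Slicing the claim with the mention offsets recovers the seed entity string, which is a quick sanity check when iterating over the data.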

Baselines

See this README

Leaderboards

https://www.cs.utexas.edu/~yasumasa/creak/leaderboard.html

We host results only for Closed-Book methods fine-tuned only on In-Domain data.

To submit your results, please send your system name and prediction files for the dev, test, and contrast sets to yasumasa@utexas.edu.

Contact

Please contact us at yasumasa@utexas.edu if you have any questions.