This repository contains the datasets used in the following paper:
@inproceedings{gan-ng-2019-improving,
    title = "Improving the Robustness of Question Answering Systems to Question Paraphrasing",
    author = "Gan, Wee Chung and
      Ng, Hwee Tou",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
}
The datasets are organised according to the original .json format of SQuAD v1.1, so models can be evaluated on them in exactly the same manner as the original development set (see the usage sketch after the file list). There are a total of 4 .json files in this repository:
Non-adversarial paraphrased dataset, used to evaluate models' over-sensitivity to small paraphrases of the questions:
- `dev_para.json`: Dataset containing paraphrased SQuAD questions.
- `dev_orig.json`: Dataset containing the corresponding original SQuAD questions for performance comparison.

Adversarial paraphrased dataset, used to evaluate models' over-reliance on string matching to obtain the answer:
- `adv_para.json`: Dataset containing paraphrased SQuAD questions.
- `adv_orig.json`: Dataset containing the corresponding original SQuAD questions for performance comparison.
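
Because every file follows the SQuAD v1.1 JSON schema, they can be loaded and evaluated with the same tooling as the original dev set. The sketch below is a minimal example, assuming the files sit in the working directory and that `predictions.json` (a dict mapping question ids to answer strings) is produced by your own model; the official `evaluate-v1.1.py` script shown in the comment is the standard SQuAD v1.1 evaluator, not part of this repository.

```python
import json

# Each file follows the SQuAD v1.1 schema:
# {"version": ..., "data": [{"title": ..., "paragraphs": [{"context": ...,
#   "qas": [{"id": ..., "question": ..., "answers": [...]}]}]}]}
with open("dev_para.json") as f:
    dataset = json.load(f)["data"]

num_questions = sum(
    len(paragraph["qas"])
    for article in dataset
    for paragraph in article["paragraphs"]
)
print(f"dev_para.json contains {num_questions} questions")

# Evaluation works exactly as for the original SQuAD v1.1 dev set, e.g.:
#   python evaluate-v1.1.py dev_para.json predictions.json
```

To measure over-sensitivity or over-reliance, evaluate the same model on the paraphrased file and its corresponding original file (e.g. `dev_para.json` vs. `dev_orig.json`) and compare the scores.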