This repository contains the data and code for the paper AR-LSAT: Investigating Analytical Reasoning of Text, and the complete data for the paper From LSAT: The Progress and Challenges of Complex Reasoning.
If you find these papers or this code useful, please cite them:
@misc{zhong2021arlsat,
title={AR-LSAT: Investigating Analytical Reasoning of Text},
author={Wanjun Zhong and Siyuan Wang and Duyu Tang and Zenan Xu and Daya Guo and Jiahai Wang and Jian Yin and Ming Zhou and Nan Duan},
year={2021},
eprint={2104.06598},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{wang2022lsat,
title={From {LSAT}: The Progress and Challenges of Complex Reasoning},
author={Wang, Siyuan and Liu, Zhongkun and Zhong, Wanjun and Zhou, Ming and Wei, Zhongyu and Chen, Zhumin and Duan, Nan},
journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
year={2022},
publisher={IEEE}
}
The data is stored in the following JSON format:
[
  {
    "id": ...,
    "passage": ...,
    "questions": [
      {
        "id": ...,
        "fatherId": ...,
        "question": ...,
        "options": ...,
        "answer": ...
      }
    ]
  }
]
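As a rough illustration of how this layout can be consumed, the sketch below flattens the nested records into one example per question. The helper names `iter_questions` and `load_examples` are hypothetical and not part of the released code, and the path passed to `load_examples` is whatever data file you are working with. The sketch only assumes the field names shown above; it treats `options` and `answer` as opaque values because their exact types are not specified here.

```python
import json
from typing import Any, Dict, Iterator, List


def iter_questions(records: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one flat example per question, copying in its parent passage."""
    for record in records:
        for question in record["questions"]:
            yield {
                "passage_id": record["id"],
                "question_id": question["id"],
                "passage": record["passage"],
                "question": question["question"],
                "options": question["options"],
                "answer": question["answer"],
            }


def load_examples(path: str) -> List[Dict[str, Any]]:
    """Read a JSON file with the layout above and flatten it (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return list(iter_questions(json.load(handle)))
```

Flattening up front keeps each example self-contained, which is convenient for multiple-choice models that score one passage, question, and option set at a time.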
cd "Transformer-based Model"
bash run_roberta_large.sh
Note:
- You need to modify the file name in utils_multiple_choice.py before running the script.
- You can switch to a different backbone by changing the --model_name_or_path argument in the run_roberta_large.sh script.
- To run the LSTM-based baseline, follow the same steps.
- Step 1: Extract named entity recognition (NER), constituency parsing (CP) and dependency parsing (DP) results from the original files.
- Step 2: Extract participants and positions from the context.
- Step 3: Run the pipeline on the development and test sets.
1.
cd data_analysis
python extract_cp_ner_dp_results.py
2.
cd data_analysis
python extract_participant_modify_context_preprocessed.py
3.
cd pipeline
python nl2fact_fule.py
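
The three steps above can also be chained from a single script. The following is only a convenience sketch, not something shipped with the code: `PIPELINE_STEPS`, `build_commands`, and `run_pipeline` are hypothetical names, and the sketch assumes the scripts take no command-line arguments and must be started from inside their own directory, as the commands above suggest.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Tuple

# (working directory, script) pairs copied from the three steps above.
PIPELINE_STEPS: List[Tuple[str, str]] = [
    ("data_analysis", "extract_cp_ner_dp_results.py"),
    ("data_analysis", "extract_participant_modify_context_preprocessed.py"),
    ("pipeline", "nl2fact_fule.py"),
]


def build_commands(repo_root: str) -> List[Tuple[Path, List[str]]]:
    """Return (cwd, argv) for each step, using the current Python interpreter."""
    root = Path(repo_root)
    return [
        (root / directory, [sys.executable, script])
        for directory, script in PIPELINE_STEPS
    ]


def run_pipeline(repo_root: str) -> None:
    """Run the steps in order and stop at the first one that fails."""
    for cwd, argv in build_commands(repo_root):
        print(f"[{cwd.name}] {argv[1]}")
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_pipeline(".")` from the repository root would then run the steps in sequence; stopping on the first failure avoids running a later step on incomplete intermediate files.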
This pipeline typically achieves the following accuracy:
| Data | Accuracy (%) |
|---|---|
| Development | 34.2 |
| Test | 30.9 |
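
For reference, accuracy on a multiple-choice task is usually the percentage of questions whose predicted option matches the gold answer. The sketch below shows that standard definition; the function name `accuracy` and the use of string labels are assumptions made for illustration, and the evaluation code in this repository may compute the figure differently.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Return the percentage of predictions that equal the gold answers."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)
```

For example, three correct predictions out of four questions give `accuracy(["A", "B", "C", "D"], ["A", "B", "C", "E"])`, which evaluates to `75.0`.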