This repository contains the code for the ACL 2022 paper "A Feasibility Study of Answer-Agnostic Question Generation for Education". In our paper, we show that running question generation (QG) on summarized text results in higher-quality questions.
Conda:
conda create -n sumqg_env python=3.9.7
conda activate sumqg_env
pip install -r requirements.txt
python -m nltk.downloader punkt
venv:
python -m venv env
source env/bin/activate
pip install -r requirements.txt
python -m nltk.downloader punkt
To run QG on user input or a file, use run_qg.py. Add the -s flag to include automatic summarization in the pipeline before running QG (for use on longer inputs only). Add the -f flag to use the smaller and faster distilled versions of the models. The full options are listed below.
$ python run_qg.py -h
-s, --use_summary Include summarization pre-processing
-f, --fast Use the smaller and faster versions of the models
-i, --infile The name of the text file to generate questions from.
If no file is given, questions are generated on user input
Example (User Input):
$ python run_qg.py
>The answer to life is 42. The answer to most other questions is unknowable.
{'answer': '42', 'question': 'What is the answer to life?'}
{'answer': 'unknowable', 'question': 'What is the answer to most other questions?'}
Example (File Input):
$ python run_qg.py -s -i data/text/slp_ch2.txt
Summary: The dialogue above is from ELIZA, an early natural language <...>
{'answer': 'Eliza', 'question': "Who's mimicry of human conversation was remarkably successful?"}
{'answer': 'restaurants', 'question': 'Modern conversational agents can answer questions, book flights, or find what?'}
{'answer': 'Regular expressions', 'question': 'What can be used to specify strings we might want to extract from a document?'}
...
These scripts will default to using a GPU if one is available. Access to a CUDA-capable GPU is highly recommended (but not required) when running these models: they are quite large and take a long time to run on CPU.
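As a quick sanity check, the snippet below (a minimal sketch, not part of the repository's scripts) verifies whether PyTorch can see a CUDA device:

```python
import torch

# Report which device the models will end up running on.
if torch.cuda.is_available():
    print(f"CUDA available: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA device found; the models will run (slowly) on CPU.")
```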
To reproduce the results from the paper, use reproduction/run_experiments.py. This script will generate a file named out.csv that contains questions from all three sources (Automatic Summary, Original Text, Human Summary), separated by chapter subsection (a quick way to inspect this file is sketched after the example command below). If using the full-size models, this should take about 5-10 minutes on GPU.
$ python run_experiments.py -h
-s, --use_summary Run automatic summarization rather than reading in
automatic summary data from a file
-f, --fast Use the smaller and faster versions of the models
For example, this command will run the full QG model on all sources:
$ cd reproduction
$ python run_experiments.py -s
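Once out.csv has been generated, a quick way to inspect it is with pandas. This is only an illustrative sketch: the column names used below (source, section) are assumptions, so check the header of your out.csv before relying on them.

```python
import pandas as pd

# Load the generated questions produced by run_experiments.py.
df = pd.read_csv("out.csv")

# NOTE: the column names below are assumptions; adjust to match the actual header.
print(df["source"].value_counts())   # number of questions per textual source
print(df.groupby("section").size())  # number of questions per chapter subsection
```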
To reproduce the coverage analysis, use reproduction/coverage.py. This script prints the percentage of bolded key terms from the textbook that appear in the question-answer pairs of a given input CSV file, broken down by textual source (a rough sketch of this computation follows the example command below).
$ python coverage.py <keyword_file> <data_file>
For example, this command will run a coverage analysis on the data included in the paper. You may also choose to set data_file to the out.csv file to verify the coverage of your generated questions.
$ python coverage.py ../data/keywords/keywords.csv ../data/questions/questions.csv
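For intuition, the coverage metric amounts to checking, for each textual source, what fraction of the bolded key terms appears somewhere in that source's question-answer pairs. The sketch below illustrates this; the column names (keyword, source, question, answer) are assumptions, and reproduction/coverage.py remains the authoritative implementation.

```python
import pandas as pd

# NOTE: column names are assumptions; see reproduction/coverage.py for the real logic.
keywords = pd.read_csv("../data/keywords/keywords.csv")["keyword"].str.lower()
questions = pd.read_csv("../data/questions/questions.csv")

for source, group in questions.groupby("source"):
    qa_text = " ".join(group["question"].str.lower()) + " " + " ".join(group["answer"].str.lower())
    covered = sum(kw in qa_text for kw in keywords)
    print(f"{source}: {covered / len(keywords):.1%} of key terms covered")
```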
Finally, to reproduce our analysis of the collected annotations, use reproduction/analyze_annotations.py. This script prints pairwise IAA and per-annotator statistics (Table 3) for each annotation question, as well as a breakdown across chapters (Table 5). It also outputs the plot used in Figure 3 as summaries.pdf.
$ python analyze_annotations.py
The QG models used and the inference code to run them come from Suraj Patil's amazing question_generation repository. Many thanks to him for sharing his great work with the academic community. Please see our paper for more details about model training and inference.
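For reference, the underlying multitask QA/QG pipeline from that repository can also be called directly from Python. The sketch below is an assumption about how the bundled pipelines module is invoked (here with the valhalla/t5-base-qa-qg-hl checkpoint); run_qg.py is the supported entry point and wraps this, plus the optional summarization step.

```python
# Minimal sketch, assuming the pipelines module from question_generation is importable.
from pipelines import pipeline

nlp = pipeline("multitask-qa-qg", model="valhalla/t5-base-qa-qg-hl")
print(nlp("The answer to life is 42. The answer to most other questions is unknowable."))
# -> [{'answer': '42', 'question': 'What is the answer to life?'}, ...]
```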
Below are the evaluation results for the t5-base and t5-small models on the SQuAD 1.0 dev set. For decoding, beam search with num_beams set to 4 and a maximum decoding length of 32 was used. The nlg-eval package was used to calculate the metrics.
Name | BLEU-4 | METEOR | ROUGE-L | QA-EM | QA-F1 |
---|---|---|---|---|---|
t5-base-qa-qg-hl | 21.0141 | 26.9113 | 43.2484 | 82.46 | 90.272 |
t5-small-qa-qg-hl | 18.9872 | 25.2217 | 40.7893 | 76.121 | 84.904 |
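To make the decoding setup above concrete, generation with Hugging Face transformers looks roughly like the sketch below. The highlight-formatted input and the valhalla/t5-base-qa-qg-hl checkpoint name follow the conventions of the question_generation repository; this is illustrative rather than the exact evaluation script.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("valhalla/t5-base-qa-qg-hl")
model = AutoModelForSeq2SeqLM.from_pretrained("valhalla/t5-base-qa-qg-hl")

# Highlight-style QG input: the target answer span is wrapped in <hl> tokens.
text = "generate question: <hl> 42 <hl> is the answer to life, the universe and everything."
inputs = tokenizer(text, return_tensors="pt")

# Beam search with num_beams=4 and max decoding length 32, as in the evaluation above.
outputs = model.generate(**inputs, num_beams=4, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```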
Below are the evaluation results for the bart-large and distilbart models on the CNN/DailyMail test set.
Name | ROUGE-2 | ROUGE-L |
---|---|---|
facebook/bart-large-cnn | 21.06 | 30.63 |
sshleifer/distilbart-cnn-6-6 | 20.17 | 29.70 |
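Since both summarizers are standard Hugging Face checkpoints, a rough stand-alone equivalent of the summarization step is sketched below; run_qg.py -s wraps this as part of its pipeline, and the max_length/min_length values shown are illustrative assumptions.

```python
from transformers import pipeline

# facebook/bart-large-cnn is the full-size summarizer; the -f flag swaps in
# sshleifer/distilbart-cnn-6-6, which is smaller and faster.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

long_text = "..."  # a passage of a few hundred words to be summarized
print(summarizer(long_text, max_length=150, min_length=40)[0]["summary_text"])
```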
If you use our code or findings in your research, please cite us as:
@inproceedings{dugan-etal-2022-feasibility,
title = "A Feasibility Study of Answer-Agnostic Question Generation for Education",
author = "Dugan, Liam and
Miltsakaki, Eleni and
Upadhyay, Shriyash and
Ginsberg, Etan and
Gonzalez, Hannah and
Choi, DaHyeon and
Yuan, Chuning and
Callison-Burch, Chris",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-acl.151",
doi = "10.18653/v1/2022.findings-acl.151",
pages = "1919--1926",
}