GitHub - skgabriel/paracomet: Code and data for the paper: "Paragraph-level Commonsense Transformers with Recurrent Memory"

Paragraph-level Commonsense Transformers with Recurrent Memory

This repository contains the code used in the paper:

Paragraph-level Commonsense Transformers with Recurrent Memory. Saadia Gabriel, Chandra Bhagavatula, Vered Shwartz, Ronan Le Bras, Maxwell Forbes, Yejin Choi. AAAI 2021. [ArXiv]

This is a general-purpose framework for aligning commonsense knowledge with narrative text. The repo contains:

A framework for distantly supervised paragraph-level commonsense knowledge alignment; and
Modeling code for finetuning pretrained transformers to generate paragraph-level commonsense inferences.

Examples

Our models infer the social commonsense underlying narratives, i.e., actions or changes in the emotional states of characters that have likely already happened or would likely happen at a particular point in a narrative:

Story	Model Prediction
Almost a month ago now, the radio station got struck by lightning. It fried the router and the cable modem...	[Past] PersonX wanted to have better internet, PersonX wanted to not be bothered
I posted a moment ago about a girl I asked out...she said she would like to do something, but work made it difficult. That was a couple of weeks back...	[Future] PersonX will be asked out, PersonX will be rejected

Instructions

python >= 3.6 (I suggest using a virtual environment)

Note: For now, the code assumes that stories contain at most 5 sentences, and the models generate inferences only for stories of up to 5 sentences.

Setup

pip install -r requirements.txt 
cd data
wget https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz
tar -xvzf atomic_data.tgz

Note: torch and torchvision may need to be installed for a specific CUDA version to run on GPUs. (For example, if the CUDA version is 10.1, the version specifier for torch should be torch=={torch_version}+cu101)

Prep for Distant Supervision

Data for distant supervision should be a file "train-processed.jsonl" in the data folder. The file should contain the following keys:

dict_keys(['storyid', 'storytitle', 'sentence1', 'sentence2', 'sentence3', 'sentence4', 'sentence5', 'sentence1_tokens', 'sentence1_noun_phrases', 'sentence1_verb_phrases', 'sentence2_tokens', 'sentence2_noun_phrases', 'sentence2_verb_phrases', 'sentence3_tokens', 'sentence3_noun_phrases', 'sentence3_verb_phrases', 'sentence4_tokens', 'sentence4_noun_phrases', 'sentence4_verb_phrases', 'sentence5_tokens', 'sentence5_noun_phrases', 'sentence5_verb_phrases'])
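Before running distant supervision, it can help to confirm that every story record carries all of the keys listed above. The following is a minimal sketch, not part of the repository: it assumes the file holds one JSON object per line (as the .jsonl extension suggests), and the helper names `missing_keys` and `check_jsonl` are hypothetical.

```python
import json

# Keys listed in this README: two story-level keys plus four keys
# for each of the five sentences.
SENTENCE_COUNT = 5
REQUIRED_KEYS = {"storyid", "storytitle"}
for i in range(1, SENTENCE_COUNT + 1):
    REQUIRED_KEYS |= {
        f"sentence{i}",
        f"sentence{i}_tokens",
        f"sentence{i}_noun_phrases",
        f"sentence{i}_verb_phrases",
    }


def missing_keys(record):
    """Return the sorted list of required keys absent from one story record."""
    return sorted(REQUIRED_KEYS - set(record))


def check_jsonl(path):
    """Yield (line_number, missing_keys) for each record that lacks a key.

    Assumes one JSON object per line; blank lines are skipped.
    """
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue
            missing = missing_keys(json.loads(line))
            if missing:
                yield line_number, missing
```

For example, `list(check_jsonl("data/train-processed.jsonl"))` returns an empty list when every record is complete.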

Get the preprocessed data from link and place it in the data folder.

Distant Supervision (Heuristic)

cd src/ds
python distant_supervision.py --target_dir ../../data/atomic

Distant Supervision (COMET)

Get the pretrained COMET model from link and place it in the data folder.

cd src/ds
python distant_supervision.py --comet --comet_location ../../data --target_dir ../../data/atomic

Processing Data for Training Models

Combine the distantly supervised data into a single file, "all_data.jsonl", by running combine_files.py in the ds folder.

Split the data using split.py in the ds folder.

Switch between the COMET and heuristic data formats by setting comet = True or comet = False in split.py.

For comet data, files are in the format "c_atomic_{split}.txt"

For heuristic data, files are in the format "h_atomic_{split}.txt"
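The naming convention above can be sketched as a small helper. This is only an illustration of the stated pattern: the function `split_filename` is hypothetical, and the split names a caller passes in (such as "test") are assumptions rather than values confirmed by split.py.

```python
from pathlib import Path


def split_filename(split, comet, data_dir="."):
    """Build the path of a split file following the README's naming pattern.

    comet=True  -> c_atomic_{split}.txt
    comet=False -> h_atomic_{split}.txt
    """
    prefix = "c" if comet else "h"
    return Path(data_dir) / f"{prefix}_atomic_{split}.txt"
```

Keeping the prefix choice in one place mirrors the single comet flag in split.py, so the same setting selects both the data format and the file name.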

Train (w/o Memory)

cd src/gpt (or src/gpt2) 
python finetune_model.py --log_dir ./log --model_dir ./models --data_dir ../../data --use_mem False --comet True

Train (Memory)

cd src/gpt (or src/gpt2) 
python finetune_model.py --log_dir ./mem_log --model_dir ./mem_models --data_dir ../../data --use_mem True --comet True

Decode (w/o Memory)

cd src/gpt (or src/gpt2) 
python decode.py --model_type ./models/model --original_file '../../data/c_atomic_test.jsonl' --data_dir ../../data --save_dir ../../data/gen_data --save_filename 'outputs.jsonl' --load_epoch 8

Decode (Memory)

cd src/gpt (or src/gpt2) 
python decode.py --model_type ./mem_models/model --original_file '../../data/c_atomic_test.jsonl' --data_dir ../../data --save_dir ../../data/gen_data --save_filename 'outputs.jsonl' --load_epoch 9 --use_mem True

Note: Make sure to decode with a single GPU when using the memory model. (In a multi-GPU setting, set CUDA_VISIBLE_DEVICES={device_id} before the python command.)
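When decoding is launched from another Python script rather than a shell, the same single-GPU restriction can be applied through the child process environment. The sketch below is an assumption about how one might wrap the command; `single_gpu_env` and `run_decode` are hypothetical helpers, and `run_decode` expects to be called from src/gpt or src/gpt2 with the decode.py arguments shown above.

```python
import os
import subprocess
import sys


def single_gpu_env(device_id, base_env=None):
    """Return a copy of the environment that exposes only one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(device_id)
    return env


def run_decode(decode_args, device_id=0):
    """Run decode.py in the current directory with a single visible GPU.

    decode_args is the list of command-line arguments for decode.py,
    for example ["--use_mem", "True", "--load_epoch", "9", ...].
    """
    command = [sys.executable, "decode.py", *decode_args]
    return subprocess.run(command, env=single_gpu_env(device_id), check=True)
```

Setting the variable on the child process only leaves the parent environment untouched, which is the Python counterpart of prefixing the shell command with CUDA_VISIBLE_DEVICES={device_id}.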

Evaluation Data

Data

Evaluation scripts are included under src/eval, including scripts that reformat generated inferences for NLI evaluation. To run the full NLI evaluation, refer to the original GitHub repo for SemBERT.

Pretrained Models

Model Name	Model Type	Link
mem	Para-M (w Memory)	link
nomem	Para-M (w/o Memory)	link

Note: The models were trained in a multi-GPU setting, so ensure args.use_multigpu is set to True when decoding from pretrained models.

Interactive Mode

The easiest way to run pretrained models on your own data is to use the narrative inference demo repo here. Demo code will be integrated into the main GitHub repo at a later point.

References

Please cite this repository using the following reference:

@inproceedings{Gabriel2021ParagraphLevelCT,
title={Paragraph-level Commonsense Transformers with Recurrent Memory},
author={Gabriel, Saadia and Bhagavatula, Chandra and Shwartz, Vered and Le Bras, Ronan and Forbes, Maxwell and Choi, Yejin},
booktitle={AAAI},
year={2021},
}
