Data, code, and model checkpoints for the ACL 2021 paper "ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining".
The data can be accessed from this Google Drive link.
The data-non-processed folder contains the original, non-processed data and is 27 MB, while the data-processed folder contains the data for the vanilla, -arg-filtered, and -arg-graph experiments, as well as model outputs, and is 611 MB.
Using the gdrive CLI, download the folders with the following command:
gdrive download --recursive 1HfyCMa1fQ5DkzME9RQZkytZQfyDjE1EK
The data can also be downloaded from this S3 bucket:
aws s3 cp --recursive s3://convosumm/data/ ./data
Please see this README for code details.
Model checkpoints can be downloaded from the S3 bucket (~80GB):
aws s3 cp --recursive s3://convosumm/checkpoints/ ./checkpoints
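Since the checkpoints are roughly 80 GB, it may be worth confirming that the target disk has enough room before starting the download. The following is a minimal sketch of such a check; the helper name has_free_space_gb is hypothetical and not part of this repository, and it treats 1 GB as 1024 x 1024 KB.

```shell
# Hypothetical helper (not part of the ConvoSumm repo): succeeds only if the
# filesystem holding directory $1 has at least $2 GB available.
has_free_space_gb() {
    dir=$1
    need_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks; the fourth
    # column of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    need_kb=$((need_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}
```

With the function defined, the download could be guarded as `has_free_space_gb . 80 && aws s3 cp --recursive s3://convosumm/checkpoints/ ./checkpoints`, so the copy only starts when about 80 GB are free in the current directory's filesystem.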