VT-SSum is a benchmark dataset of spoken-language video transcripts for segmentation and summarization.
Source | train | dev | test |
---|---|---|---|
Video | 7,692 | 962 | 962 |
Evaluation results on the test data of VT-SSum with models fine-tuned on different datasets
Models | Top3-Precision | Top3-Recall | Top3-F1 | Top5-Precision | Top5-Recall | Top5-F1 |
---|---|---|---|---|---|---|
CNN/DM | 31.49 | 58.41 | 40.92 | 27.10 | 74.46 | 39.74 |
VT-SSum | 37.86 | 69.04 | 48.90 | 29.79 | 80.79 | 43.53 |
CNN/DM→VT-SSum | 38.10 | 69.48 | 49.22 | 29.88 | 81.02 | 43.66 |
Evaluation results on the test data of AMI with models fine-tuned on different datasets
Models | Top3-Precision | Top3-Recall | Top3-F1 | Top5-Precision | Top5-Recall | Top5-F1 |
---|---|---|---|---|---|---|
CNN/DM | 45.30 | 59.03 | 51.26 | 39.03 | 72.61 | 50.77 |
VT-SSum | 51.80 | 67.96 | 58.79 | 42.72 | 79.26 | 55.51 |
CNN/DM→VT-SSum | 52.66 | 68.72 | 59.62 | 42.99 | 79.78 | 55.87 |
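The Top3/Top5 columns above are not defined in this README. One plausible reading, stated here as an assumption, is that the k highest-scoring sentences of a clip are selected and compared against the sentences labeled `1`. The reported F1 values are consistent with the harmonic mean of the reported precision and recall (for example, 31.49 and 58.41 give 40.92). The sketch below computes these three numbers for a single clip. The function name `top_k_prf` and the `scores` input are hypothetical, and how per-clip scores are aggregated over the test set is not specified here.

```python
def top_k_prf(scores, labels, k):
    """Precision, recall and F1 of the k highest-scoring sentences of one clip.

    scores: model scores, one per sentence (hypothetical input).
    labels: gold 0/1 labels, one per sentence.
    This is an assumed definition of the Top-k metrics, not the official one.
    """
    # Indices of the k sentences with the highest scores.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    hits = sum(labels[i] for i in ranked)
    n_gold = sum(labels)

    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Calling `top_k_prf(scores, labels, 3)` and `top_k_prf(scores, labels, 5)` on each clip would give per-clip values corresponding to the two column groups under this assumption.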
Evaluation results for segmentation on the test data of VT-SSum
Models | Accuracy |
---|---|
LSTM | 90.33 |
UniLMv2-base | 92.14 |
UniLMv2-large | 93.00 |
Each file (*.json) consists of 6 fields:
id
: The id of the video.
title
: The title of the current video.
info
: Some information about the current video, such as when it was published/recorded.
url
: The link to the current video.
segmentation
: The segmentation part of VT-SSum. This field consists of a list: `[ [sent_0 in seg_0, sent_1 in seg_0, ..., sent_n in seg_0], ..., [sent_0 in seg_k, sent_1 in seg_k, ..., sent_m in seg_k] ]`, where k is the number of segments in the current video, and n/m is the number of sentences in a segment.
summarization
: The summarization part of VT-SSum. This field consists of a dict: `{ "clip_0": { "is_summarization_sample": true/false, "summarization_data": [ { "sent": "sent_0", "label": 0/1 }, ..., { "sent": "sent_n", "label": 0/1 } ] }, ..., "clip_k": { ... } }`, where is_summarization_sample indicates that the current segment has a summary and is used in the training/evaluation of the summarization task.
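As an illustration of the file layout described above, here is a minimal Python sketch that reads one file and walks both parts. It relies only on the field names listed in this README. The helper names (`load_record`, `iter_sentences`, `summarization_samples`) are hypothetical and are not part of any released tooling.

```python
import json


def load_record(path):
    """Load one VT-SSum JSON file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_sentences(record):
    """Yield (segment_index, sentence) pairs from the segmentation field."""
    for seg_idx, segment in enumerate(record["segmentation"]):
        for sent in segment:
            yield seg_idx, sent


def summarization_samples(record):
    """Yield (clip_name, sentences, labels) for clips usable for summarization.

    Clips whose is_summarization_sample flag is false are skipped.
    """
    for clip_name, clip in record["summarization"].items():
        if not clip["is_summarization_sample"]:
            continue
        data = clip["summarization_data"]
        sents = [item["sent"] for item in data]
        labels = [item["label"] for item in data]
        yield clip_name, sents, labels
```

For example, `record = load_record("example.json")` (a placeholder path) followed by `for name, sents, labels in summarization_samples(record): ...` iterates over the clips that carry summary labels.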
VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization [Preprint]
If you find VT-SSum useful in your research, please cite the following paper:
@article{lv2021vt,
title={VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization},
author={Lv, Tengchao and Cui, Lei and Vasilijevic, Momcilo and Wei, Furu},
journal={arXiv preprint arXiv:2106.05606},
year={2021}
}
This project is licensed under CC BY-NC-ND 4.0