Haopeng Li1, Andong Deng2, Qiuhong Ke*3, Jun Liu*4, Hossein Rahmani5, Yulan Guo6, Bernt Schiele7, Chen Chen2
1University of Melbourne, 2University of Central Florida, 3Monash University, 4Singapore University of Technology and Design, 5Lancaster University, 6Sun Yat-sen University, 7Max Planck Institute for Informatics
Reasoning over sports videos for question answering is an important task with numerous applications, such as player training and information retrieval. However, this task has not been explored, both because relevant datasets are lacking and because the task itself is challenging. Most existing datasets for video question answering (VideoQA) focus on general, coarse-grained understanding of daily-life videos and are therefore not applicable to sports scenarios, which require professional action understanding and fine-grained motion analysis. In this paper, we introduce Sports-QA, the first dataset specifically designed for the sports VideoQA task. The Sports-QA dataset includes various types of questions, such as descriptions, chronologies, causalities, and counterfactual conditions, covering multiple sports.
The numbers of QA pairs for different types and different sports.
Sport | Descriptive | Temporal | Causal | Counterfactual | Total |
---|---|---|---|---|---|
Basketball | 5,629 | 22 | 785 | 278 | 6,714 |
Football | 6,659 | 1,355 | 1,949 | 523 | 10,486 |
Volleyball | 6,120 | 360 | 1,942 | 685 | 9,107 |
Gym | 6,382 | 1,997 | 0 | 0 | 8,379 |
Floor Exercise | 6,046 | 11,012 | 0 | 0 | 17,058 |
Balance Beam | 7,477 | 12,773 | 0 | 0 | 20,250 |
Uneven Bars | 7,294 | 12,124 | 0 | 0 | 19,418 |
Vault | 2,661 | 0 | 0 | 0 | 2,661 |
Total | 48,268 | 39,643 | 4,676 | 1,486 | 94,073 |
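The breakdown by question type can be recomputed from the per-sport counts. The following is a minimal sketch, not part of the released code: it hard-codes the four question-type counts from the table above, and the names `COUNTS`, `totals_by_type`, and `shares_by_type` are hypothetical, introduced only for illustration.

```python
# Per-sport QA-pair counts copied from the table above, in the order
# (descriptive, temporal, causal, counterfactual).
# All names in this snippet are illustrative, not part of the Sports-QA release.
QUESTION_TYPES = ("Descriptive", "Temporal", "Causal", "Counterfactual")

COUNTS = {
    "Basketball": (5629, 22, 785, 278),
    "Football": (6659, 1355, 1949, 523),
    "Volleyball": (6120, 360, 1942, 685),
    "Gym": (6382, 1997, 0, 0),
    "Floor Exercise": (6046, 11012, 0, 0),
    "Balance Beam": (7477, 12773, 0, 0),
    "Uneven Bars": (7294, 12124, 0, 0),
    "Vault": (2661, 0, 0, 0),
}


def totals_by_type(counts):
    """Sum each question-type column across all sports."""
    return {
        qtype: sum(row[i] for row in counts.values())
        for i, qtype in enumerate(QUESTION_TYPES)
    }


def shares_by_type(counts):
    """Return each question type's fraction of all QA pairs."""
    per_type = totals_by_type(counts)
    grand_total = sum(per_type.values())
    return {qtype: n / grand_total for qtype, n in per_type.items()}


if __name__ == "__main__":
    per_type = totals_by_type(COUNTS)
    for qtype, share in shares_by_type(COUNTS).items():
        # Print the column total and its share of the whole dataset.
        print(f"{qtype:<15}{per_type[qtype]:>7,}  {share:6.1%}")
```

The column sums reproduce the Total row (48,268, 39,643, 4,676, and 1,486, adding up to 94,073). Descriptive and temporal questions make up roughly 51% and 42% of the dataset, while causal and counterfactual questions together account for under 7% and appear only for basketball, football, and volleyball.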
The distributions of answer classes broken down by question types.
Examples from Sports-QA.
Benchmark on Sports-QA
The Sports-QA dataset and usage instructions are available here.
If you find our work useful in your research, please consider starring the repository and citing our paper.
@article{li2024sports,
title={Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports},
author={Li, Haopeng and Deng, Andong and Ke, Qiuhong and Liu, Jun and Rahmani, Hossein and Guo, Yulan and Schiele, Bernt and Chen, Chen},
journal={arXiv preprint arXiv:2401.01505},
year={2024}
}
This work builds on many excellent research works and open-source projects. Many thanks to all the authors for sharing!
- MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
- FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding
- IGV: Invariant Grounding for Video Question Answering
For further questions, please contact Haopeng Li (haopeng.li@student.unimelb.edu.au).