UniSoccer: Towards Universal Soccer Video Understanding

This repository contains the official PyTorch implementation of the paper "Towards Universal Soccer Video Understanding": https://arxiv.org/abs/2412.01820

Project Page · Paper · Dataset (Soon) · Checkpoints

News

[2025.01] We open-sourced our code and checkpoints for UniSoccer.
[2024.12] Our pre-print paper is released on arXiv.

Requirements

Python >= 3.8 (Anaconda or Miniconda is recommended)
PyTorch >= 2.0.0 (if using an A100)
transformers >= 4.42.3
pycocoevalcap >= 1.2

A suitable conda environment named UniSoccer can be created and activated with:

conda env create -f environment.yaml
conda activate UniSoccer
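
After activating the environment, a quick check of the installed versions against the list above can catch mismatches early. The following is a minimal sketch, not part of the repository: it assumes the packages are installed under their usual PyPI names (torch, transformers, pycocoevalcap) and uses a deliberately simple version parser.

```python
import sys
from importlib import metadata

# Minimum versions copied from the Requirements list above.
# Keys are PyPI distribution names (PyTorch is published as "torch").
MINIMUMS = {
    "torch": "2.0.0",
    "transformers": "4.42.3",
    "pycocoevalcap": "1.2",
}


def parse_version(text):
    """Turn '2.0.0+cu118' into (2, 0, 0).

    Parsing stops at the first non-numeric component and the result is
    padded with zeros to three components.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_environment(minimums=MINIMUMS):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if parse_version(installed) < parse_version(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["All listed requirements are satisfied."]:
        print(line)
```

An empty list from check_environment() means every listed requirement is met; otherwise each entry names the package to install or upgrade.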

Train

Pretrain MatchVision Encoder

As described in the paper, the MatchVision backbone can be pretrained in two ways: supervised classification and contrastive commentary retrieval. Both can be run as shown below.

First, prepare the textual data in the format used in train_data/json, and preprocess the soccer videos into 30-second clips (15 s before and 15 s after each timestamp) for pretraining.
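
The clip boundaries follow directly from each event timestamp. The sketch below is one possible way to compute them and to build a matching ffmpeg command; it is not the repository's preprocessing script. The function names are illustrative, and clamping early events to the start of the video is an assumption.

```python
HALF_WINDOW_S = 15.0  # 15 s before and 15 s after the event timestamp


def clip_window(timestamp_s, half_window_s=HALF_WINDOW_S):
    """Return (start, end) in seconds for the clip around an event timestamp.

    Assumption: a window that would start before the video does is clamped
    to 0, which yields a shorter clip for very early events.
    """
    start = max(0.0, timestamp_s - half_window_s)
    end = timestamp_s + half_window_s
    return start, end


def ffmpeg_clip_command(video_path, timestamp_s, output_path):
    """Build an ffmpeg argument list that cuts the clip around timestamp_s."""
    start, end = clip_window(timestamp_s)
    return [
        "ffmpeg", "-y",               # overwrite the output if it exists
        "-ss", f"{start:.3f}",        # seek to the clip start in the input
        "-i", str(video_path),
        "-t", f"{end - start:.3f}",   # clip duration in seconds
        str(output_path),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True), provided ffmpeg is installed and on the PATH.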

Supervised Classification

python task/pretrain_MatchVoice_Classifier.py config/pretrain_classification.py

Contrastive Commentary Retrieval

python task/pretrain_contrastive.py config/pretrain_contrastive.py

You can also fine-tune MatchVision with:

python task/finetune_contrastive.py config/finetune_contrastive.py

Note that you need to replace the folder paths in the task and config files with your own.

Train Downstream Tasks

You can train the commentary task in two different ways:

Use mp4 files

python task/downstream_commentary_new_benchmark.py

With this method, you can train the commentary model MatchVoice with the visual encoder or the language decoder unfrozen (trainable), so you need to crop the videos into 30-second clips named as specified in the JSON files.

Use .npy files

python task/downstream_commentary.py

With this method, the visual encoder stays frozen, so you can extract the features of all video clips in advance and replace ".mp4" with ".npy" in the file names.
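
The ".mp4" to ".npy" naming convention can be sketched as follows. This is an illustrative helper, not code from the repository: it assumes each feature file sits next to its clip with the same stem, and the function names are hypothetical.

```python
from pathlib import Path


def feature_path_for(clip_path):
    """Map 'clips/a.mp4' to 'clips/a.npy' (same folder, same stem)."""
    return Path(clip_path).with_suffix(".npy")


def clips_missing_features(clip_dir):
    """List the .mp4 clips under clip_dir that have no matching .npy file yet."""
    return sorted(
        clip for clip in Path(clip_dir).rglob("*.mp4")
        if not feature_path_for(clip).exists()
    )
```

For each clip returned by clips_missing_features, the features produced by the frozen encoder could then be written with numpy.save(feature_path_for(clip), features).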

Note that the words_world folder records the LLaMA-3 (8B) tokenizer token IDs of all words that appear in each dataset:

match_time.pkl: MatchTime dataset
soccerreplay-1988.pkl: SoccerReplay-1988 dataset (not released yet)
merge.pkl: union of MatchTime and SoccerReplay-1988
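
To illustrate how a merged vocabulary relates to the per-dataset files, here is a minimal sketch. It assumes each pickle holds an iterable of integer token IDs, which the README does not confirm, and the function names are hypothetical. Only unpickle files from a source you trust.

```python
import pickle


def load_token_ids(pkl_path):
    """Load one vocabulary file as a set of integer token IDs.

    Assumption: the pickle holds an iterable of ints; adjust this if the
    real files use another layout.
    """
    with open(pkl_path, "rb") as handle:
        return set(pickle.load(handle))


def merge_token_ids(*pkl_paths):
    """Return the union of the token IDs found in all given files."""
    merged = set()
    for path in pkl_paths:
        merged |= load_token_ids(path)
    return merged
```

Under that assumption, merge_token_ids("words_world/match_time.pkl", "words_world/soccerreplay-1988.pkl") would contain the same IDs as merge.pkl.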

Inference

For inference, use the following command. Make sure the video clips are cropped correctly, in the same format as before.

python inference/inference.py

Then you can compute the metrics for the output file sample.csv with:

python inference/score_single.py --csv_path inference/sample.csv
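
If you want to score the CSV yourself, the captioning scorers in pycocoevalcap (for example Bleu(4).compute_score(gts, res)) expect two dictionaries that map each sample ID to a list of strings, with exactly one prediction per ID. The sketch below builds those dictionaries with the standard library only. It is not how inference/score_single.py is necessarily implemented, and the column names "reference" and "prediction" are hypothetical placeholders: check the header of your own sample.csv.

```python
import csv


def load_for_scoring(csv_path, reference_col="reference", prediction_col="prediction"):
    """Read a results CSV into (gts, res) dicts keyed by row index.

    The default column names are hypothetical placeholders; pass the real
    header names of your CSV file.
    """
    gts, res = {}, {}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for idx, row in enumerate(csv.DictReader(handle)):
            gts[idx] = [row[reference_col].strip()]   # ground-truth commentary
            res[idx] = [row[prediction_col].strip()]  # one generated commentary
    return gts, res
```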

Citation

If you use this code and data for your research or project, please cite:

@misc{rao2024unisoccer,
    title   = {Towards Universal Soccer Video Understanding},
    author  = {Rao, Jiayuan and Wu, Haoning and Jiang, Hao and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
    journal = {arXiv preprint arXiv:2412.01820},
    year    = {2024},
}

TODO

Acknowledgements

Many thanks to the code bases from Video-LLaMA and MatchTime, and source data from SoccerNet-Caption and MatchTime.

Contact

If you have any questions, please feel free to contact jy_rao@sjtu.edu.cn or haoningwu3639@gmail.com.

