perception_test_iccv2023

Champion solutions repository for the Perception Test challenges at the ICCV 2023 workshop.

Introduction

We achieved the best performance in the Temporal Sound Localisation task and were runner-up in the Temporal Action Localisation task. In this repository, we provide the pretrained video and audio features, checkpoints, and code for feature extraction, training, and inference.

Get Started

Please refer to INSTALL.md to install the prerequisite packages.

Feature Extraction

TAL

For the video features, we use the UMT large model pre-trained on Something-Something V2 and the VideoMAE model pre-trained on the Ego4D-Verb dataset. The Ego4D weights can be found here. The two features are concatenated before being fed into the ActionFormer model during both training and inference.
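The concatenation step can be illustrated with a minimal sketch. It assumes each feature is stored as a NumPy array of shape (T, C), with time along the first axis, and that the two extractors may produce slightly different numbers of time steps. The function name, the feature dimensions (1024 and 768), and the trimming strategy are illustrative assumptions, not taken from this repository.

```python
import numpy as np


def concat_features(feat_a, feat_b):
    """Concatenate two (T, C) feature arrays along the channel axis."""
    # Assumption: the two extractors may differ by a few time steps,
    # so both arrays are trimmed to the shorter temporal length first.
    num_steps = min(feat_a.shape[0], feat_b.shape[0])
    return np.concatenate([feat_a[:num_steps], feat_b[:num_steps]], axis=1)


# Dummy stand-ins for the UMT and VideoMAE features of one video
# (shapes are hypothetical).
umt_feat = np.zeros((100, 1024), dtype=np.float32)
videomae_feat = np.ones((98, 768), dtype=np.float32)

fused = concat_features(umt_feat, videomae_feat)
print(fused.shape)  # (98, 1792)
```

If the features in your setup are stored channel-first, the concatenation axis would need to change accordingly.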

For the audio features, we use the BEATs model as the feature extractor and adopt its iter3+ checkpoint pre-trained on the AudioSet-2M dataset. We provide scripts to extract both BEATs and CAV-MAE features (CAV-MAE is not used in our solution). To extract audio features, run python audio_feat_extract.py.

TSL

For the video features, we use the UMT large model pre-trained on Something-Something V2 and fine-tuned on the Perception Test temporal action localisation training set.

For the audio features, we use the BEATs model as the feature extractor and adopt its iter3+ checkpoint pre-trained on the AudioSet-2M dataset. We provide scripts to extract both BEATs and CAV-MAE features (CAV-MAE is not used in our solution). To extract audio features, run python audio_feat_extract.py.

Download

Features	Modality	Task	Download Link
BEATs_iter2	Audio	TAL&TSL	Download
Ego4d_verb	Video	TAL	Download
UMT-L Sth Sth-V2	Video	TAL	Download
UMT-L Sth Sth-V2 ft	Video	TSL	Download

Temporal Sound Localisation

Training

cd ./tsl/

python train.py configs/perception_tsl_multi_train.yaml

Inference

Inference on the validation set:

cd ./tsl/

python eval.py configs/perception_tsl_multi_valid.yaml ./ckpt/XXX -epoch=XX

Inference on the test set:

cd ./tsl/

python eval.py configs/perception_tsl_multi_test.yaml ./ckpt/XXX -epoch=XX --saveonly

Temporal Action Localisation

Training

cd ./tal/

python train.py configs/perception_tal_multi_train.yaml

Inference

Inference on the validation set:

cd ./tal/

python eval.py configs/perception_tal_multi_valid.yaml ./ckpt/XXX -epoch=XX

Inference on the test set:

cd ./tal/

python eval.py configs/perception_tal_multi_test.yaml ./ckpt/XXX -epoch=XX --saveonly
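The inference commands for both tasks follow one pattern: the config name encodes the task and split, and test-set runs add --saveonly. A small helper that assembles the command, written as a sketch, might look like the following. The helper name and the epoch value 30 are hypothetical; the checkpoint path ./ckpt/XXX is the same placeholder used above and must be replaced with a real directory. The command is expected to be run from the matching ./tsl/ or ./tal/ directory.

```python
import shlex


def build_eval_command(task, split, ckpt_dir, epoch):
    """Assemble the eval.py command line for a task ('tsl' or 'tal') and split."""
    if task not in ("tsl", "tal"):
        raise ValueError(f"unknown task: {task}")
    if split not in ("valid", "test"):
        raise ValueError(f"unknown split: {split}")
    config = f"configs/perception_{task}_multi_{split}.yaml"
    cmd = ["python", "eval.py", config, ckpt_dir, f"-epoch={epoch}"]
    # Test-set runs pass --saveonly, as in the commands above.
    if split == "test":
        cmd.append("--saveonly")
    return cmd


# Hypothetical epoch number; ./ckpt/XXX is a placeholder path.
cmd = build_eval_command("tal", "test", "./ckpt/XXX", 30)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run with cwd set to the task directory if you prefer to launch evaluation from a script.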

Checkpoints

We release the checkpoints in the table below.

Method	Task	mAP (Valid)	Download
BEATs + UMT	TSL	26.70	ckpt
BEATs + UMT ft	TSL	39.25	ckpt
BEATs + UMT	TAL	44.14	ckpt
BEATs + UMT&VideoMAE	TAL	46.75	ckpt
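To make the gains in the table explicit, the following sketch computes the validation mAP difference between the two variants of each task. The mAP values are copied from the table; the dictionary layout and variable names are illustrative only.

```python
# Validation mAP values copied from the checkpoint table above.
valid_map = {
    "tsl": {"BEATs + UMT": 26.70, "BEATs + UMT ft": 39.25},
    "tal": {"BEATs + UMT": 44.14, "BEATs + UMT&VideoMAE": 46.75},
}

gains = {}
for task, results in valid_map.items():
    baseline = results["BEATs + UMT"]
    best = max(results.values())
    # Round to two decimals to avoid floating-point noise in the difference.
    gains[task] = round(best - baseline, 2)

print(gains)
```

On the validation set, fine-tuning UMT accounts for the larger of the two differences (TSL), while adding VideoMAE features gives a smaller difference on TAL.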

Contact

If you have any questions, please contact Jiashuo Yu or Guo Chen.
