DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
Le Yang, Ziwei Zheng, Yizeng Han, Hao Cheng, Shiji Song, Gao Huang, Fan Li
Recently proposed neural network-based Temporal Action Detection (TAD) models are inherently limited in extracting discriminative representations and modeling action instances of various lengths from complex scenes when relying on shared-weight detection heads. Inspired by the successes of dynamic neural networks, in this paper we build a novel Dynamic Feature Aggregation (DFA) module that can simultaneously adapt kernel weights and receptive fields at different timestamps. Based on DFA, the proposed dynamic encoder layer aggregates the temporal features within the action time ranges and guarantees the discriminability of the extracted representations. Moreover, DFA helps develop a Dynamic TAD head (DyHead), which adaptively aggregates multi-scale features with adjusted parameters and learned receptive fields to better detect action instances of diverse ranges in videos. With the proposed encoder layer and DyHead, the new dynamic TAD model, DyFADet, achieves promising performance on a series of challenging TAD benchmarks, including HACS-Segment, THUMOS14, ActivityNet-1.3, EPIC-Kitchens 100, Ego4D-Moment Queries v1.0, and FineAction.
ActivityNet-1.3
Features | Classifier | mAP@0.5 | mAP@0.75 | mAP@0.95 | ave. mAP | Config | Download |
---|---|---|---|---|---|---|---|
TSP | InternVideo1 | 58.19 | 39.30 | 8.63 | 38.62 | config | model \| log |
THUMOS-14
Features | mAP@0.3 | mAP@0.4 | mAP@0.5 | mAP@0.6 | mAP@0.7 | ave. mAP | Config | Download |
---|---|---|---|---|---|---|---|---|
VideoMAEv2-g | 85.99 | 81.66 | 76.32 | 64.46 | 50.08 | 71.70 | config | model \| log |
FineAction
Features | Classifier | mAP@0.5 | mAP@0.75 | mAP@0.95 | ave. mAP | Config | Download |
---|---|---|---|---|---|---|---|
VideoMAEv2_g_K710 | InternVideo1 | 37.06 | 23.49 | 5.92 | 23.70 | config | model \| log |
HACS
Features | Classifier | mAP@0.5 | mAP@0.75 | mAP@0.95 | ave. mAP | Config | Download |
---|---|---|---|---|---|---|---|
SlowFast | TCANet | 58.09 | 40.03 | 11.96 | 39.45 | config | model \| log |
You can use the following command to train a model.
```shell
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/train.py ${CONFIG_FILE} [optional arguments]
```
Example: train DyFADet on the THUMOS dataset.
```shell
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/train.py configs/dyfadet/thumos_videomaev2_g.py
```
For more details, refer to the Training section of the Usage documentation.
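Since training is launched through torchrun, it can also be scaled across GPUs with the launcher's standard flags. Below is a minimal sketch assuming a single node with 2 GPUs; the GPU count is an assumption, so adjust `--nproc_per_node` to your hardware.

```shell
# Hypothetical 2-GPU run on a single node; only --nproc_per_node changes
# relative to the single-GPU command above.
torchrun --nnodes=1 --nproc_per_node=2 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 \
    tools/train.py configs/dyfadet/thumos_videomaev2_g.py
```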
You can use the following command to test a model.
```shell
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/test.py ${CONFIG_FILE} --checkpoint ${CHECKPOINT_FILE} [optional arguments]
```
Example: test DyFADet on the THUMOS dataset.
```shell
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/test.py configs/dyfadet/thumos_videomaev2_g.py --checkpoint exps/thumos/dyfadet_videomaev2_g/gpu1_id0/checkpoint/epoch_37.pth
```
For more details, refer to the Testing section of the Usage documentation.
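The same command evaluates checkpoints downloaded from the tables above. A sketch, assuming the released THUMOS model was saved to `pretrained/dyfadet_thumos.pth` (a hypothetical path; substitute wherever you stored the file):

```shell
# pretrained/dyfadet_thumos.pth is an assumed local path for the downloaded model,
# not a file shipped with the repository.
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 \
    tools/test.py configs/dyfadet/thumos_videomaev2_g.py --checkpoint pretrained/dyfadet_thumos.pth
```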
```BibTeX
@inproceedings{yang2024dyfadet,
  title={DyFADet: Dynamic Feature Aggregation for Temporal Action Detection},
  author={Yang, Le and Zheng, Ziwei and Han, Yizeng and Cheng, Hao and Song, Shiji and Huang, Gao and Li, Fan},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2024}
}
```