BMN: Boundary-Matching Network for Temporal Action Proposal Generation
Tianwei Lin, Xiao Liu, Xin Li, Errui Ding, Shilei Wen
Temporal action proposal generation is a challenging and promising task that aims to locate temporal regions in real-world videos where actions or events may occur. Current bottom-up proposal generation methods can generate proposals with precise boundaries, but cannot efficiently generate adequately reliable confidence scores for retrieving proposals. To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map. Based on the BM mechanism, we propose an effective, efficient, and end-to-end proposal generation method, named Boundary-Matching Network (BMN), which generates proposals with precise temporal boundaries and reliable confidence scores simultaneously. The two branches of BMN are jointly trained in a unified framework. We conduct experiments on two challenging datasets, THUMOS-14 and ActivityNet-1.3, where BMN shows significant performance improvement with remarkable efficiency and generalizability. Further, combined with an existing action classifier, BMN can achieve state-of-the-art temporal action detection performance.
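To make the BM confidence map more concrete, here is a minimal sketch of how a `D x T` map can index densely distributed proposals. It assumes that entry `(i, j)` stands for the proposal starting at temporal location `j` with a duration of `i + 1` locations; the function name `bm_proposals` and this zero-based indexing are illustrative assumptions, not the repository's implementation.

```python
import numpy as np


def bm_proposals(T, D):
    """Enumerate the proposals indexed by a D x T BM confidence map.

    Assumed convention: entry (i, j) is the proposal that starts at
    location j and lasts i + 1 locations, so it ends at j + i + 1.
    """
    starts = np.arange(T)[None, :]             # shape (1, T)
    durations = np.arange(1, D + 1)[:, None]   # shape (D, 1)
    ends = starts + durations                  # shape (D, T) via broadcasting
    valid = ends <= T                          # proposals that stay inside the video
    return np.broadcast_to(starts, (D, T)), ends, valid


starts, ends, valid = bm_proposals(T=5, D=3)

# A toy confidence map: random scores, with out-of-range proposals zeroed out.
rng = np.random.default_rng(0)
confidence = rng.random((3, 5)) * valid

# The best-scoring entry maps back to a (start, end) pair.
i, j = np.unravel_index(np.argmax(confidence), confidence.shape)
print("best proposal:", (int(starts[i, j]), int(ends[i, j])))
```

Because every (start, duration) pair has a fixed position in the map, a single forward pass can score all candidate proposals at once, and the mask removes pairs whose end would fall beyond the video.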
ActivityNet-1.3 with CUHK classifier.
Features | mAP@0.5 | mAP@0.75 | mAP@0.95 | ave. mAP | Config | Model | Log |
---|---|---|---|---|---|---|---|
TSN | 50.97 | 34.98 | 8.35 | 34.21 | config | model | log |
TSP | 52.90 | 37.30 | 9.67 | 36.40 | config | model | log |
Use the above checkpoints to evaluate recall performance:
Features | AR@1 | AR@5 | AR@10 | AR@100 | AUC | Config | Model | Log |
---|---|---|---|---|---|---|---|---|
TSN | 33.58 | 49.16 | 56.53 | 75.34 | 67.23 | config | model | log |
TSP | 34.14 | 51.35 | 58.44 | 76.24 | 68.47 | config | model | log |
THUMOS-14 with UntrimmedNet classifier.
Features | mAP@0.3 | mAP@0.4 | mAP@0.5 | mAP@0.6 | mAP@0.7 | ave. mAP | Config | Model | Log |
---|---|---|---|---|---|---|---|---|---|
TSN | 60.51 | 56.03 | 47.56 | 38.23 | 28.64 | 46.19 | config | model | log |
I3D | 64.99 | 60.70 | 54.54 | 44.11 | 34.16 | 51.70 | config | model | log |
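As a quick sanity check, the "ave. mAP" column in the THUMOS-14 table matches the plain mean of the five listed tIoU thresholds (0.3 to 0.7). The sketch below recomputes it from the table values; the helper name `average_map` is illustrative. The ActivityNet-1.3 and HACS averages cannot be recomputed this way, since by the usual convention they average over ten thresholds (0.5 to 0.95 in steps of 0.05) and only three are listed.

```python
# mAP values copied from the THUMOS-14 table above (tIoU 0.3, 0.4, 0.5, 0.6, 0.7).
thumos_map = {
    "TSN": [60.51, 56.03, 47.56, 38.23, 28.64],
    "I3D": [64.99, 60.70, 54.54, 44.11, 34.16],
}


def average_map(values):
    """Mean of the per-threshold mAP values, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


for features, values in thumos_map.items():
    print(features, average_map(values))
```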
HACS with TCANet classifier.
Features | mAP@0.5 | mAP@0.75 | mAP@0.95 | ave. mAP | Config | Model | Log |
---|---|---|---|---|---|---|---|
SlowFast | 52.64 | 36.18 | 11.46 | 35.78 | config | model | log |
You can use the following command to train a model.
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/train.py ${CONFIG_FILE} [optional arguments]
Example: train BMN on ActivityNet dataset.
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/train.py configs/bmn/anet_tsp.py
For more details, you can refer to the Training part in the Usage.
You can use the following command to test a model.
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/test.py ${CONFIG_FILE} --checkpoint ${CHECKPOINT_FILE} [optional arguments]
Example: test BMN on ActivityNet dataset.
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/test.py configs/bmn/anet_tsp.py --checkpoint exps/anet/bmn_tsp_128/gpu1_id0/checkpoint/epoch_9.pth
To test the recall performance:
torchrun --nnodes=1 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 tools/test.py configs/bmn/anet_tsp_recall.py --checkpoint exps/anet/bmn_tsp_128/gpu1_id0/checkpoint/epoch_9.pth
For more details, you can refer to the Test part in the Usage.
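If you want to script several training or test runs, the commands above can be assembled programmatically. The following is a minimal sketch: `torchrun_command` is a hypothetical helper (not part of the repository) that reproduces the argument layout shown above. Setting `nproc_per_node` above 1 relies on the standard `torchrun` flag for the number of processes per node; whether a given config needs other changes for multi-GPU runs is not covered here.

```python
import shlex


def torchrun_command(script, config, nproc_per_node=1, checkpoint=None):
    """Build the torchrun argument list used in the commands above.

    Hypothetical helper for illustration; it only assembles arguments
    and does not launch anything.
    """
    cmd = [
        "torchrun",
        "--nnodes=1",
        f"--nproc_per_node={nproc_per_node}",
        "--rdzv_backend=c10d",
        "--rdzv_endpoint=localhost:0",
        script,
        config,
    ]
    if checkpoint is not None:
        # tools/test.py takes the checkpoint path through --checkpoint.
        cmd += ["--checkpoint", checkpoint]
    return cmd


train_cmd = torchrun_command("tools/train.py", "configs/bmn/anet_tsp.py")
test_cmd = torchrun_command(
    "tools/test.py",
    "configs/bmn/anet_tsp.py",
    checkpoint="exps/anet/bmn_tsp_128/gpu1_id0/checkpoint/epoch_9.pth",
)

# Print shell-ready strings; pass the lists to subprocess.run to execute them.
print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```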
@inproceedings{lin2019bmn,
title={{BMN}: Boundary-Matching Network for Temporal Action Proposal Generation},
author={Lin, Tianwei and Liu, Xiao and Li, Xin and Ding, Errui and Wen, Shilei},
booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
pages={3889--3898},
year={2019}
}