🔥🔥🔥Update 2023.02.19🔥🔥🔥
This is the code for the CVPR 2022 paper "Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation".
-
Download the A2D-Sentences and JHMDB-Sentences datasets, then convert the raw videos into image frames.
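A minimal sketch of the frame-extraction step, assuming `ffmpeg` is installed and the raw clips sit under a hypothetical `A2D/clips320H/` directory (adjust the paths to your own download layout):

```python
import subprocess
from pathlib import Path

def ffmpeg_cmd(video_path, out_dir):
    """Build the ffmpeg command that dumps every frame of `video_path`
    as zero-padded JPEGs into `out_dir`."""
    return ["ffmpeg", "-i", str(video_path),
            "-qscale:v", "2",                     # high-quality JPEG output
            str(Path(out_dir) / "%05d.jpg")]

if __name__ == "__main__":
    # Hypothetical layout: one video clip per sequence id.
    for video in Path("A2D/clips320H").glob("*.mp4"):
        out = Path("A2D/allframes") / video.stem   # one folder per sequence
        out.mkdir(parents=True, exist_ok=True)
        subprocess.run(ffmpeg_cmd(video, out), check=True)
```

The JHMDB videos can be processed the same way, writing into `J-HMDB/allframes/`.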
-
Please use RAFT to generate the optical flow map (visualized in RGB format) from frame t to frame t+1. Since only a few frames are annotated in A2D and JHMDB, optical flow maps are needed only for those frames.
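The RGB visualization maps flow direction to hue and flow magnitude to saturation, so zero motion appears white. A minimal NumPy sketch of this conversion (a simplified stand-in for RAFT's own `flow_viz` utility, with the value channel fixed at 1):

```python
import numpy as np

def flow_to_rgb(flow, max_mag=None):
    """Convert an (H, W, 2) optical-flow field to a uint8 RGB image:
    direction -> hue, magnitude -> saturation, value fixed at 1."""
    u, v = flow[..., 0], flow[..., 1]
    mag = np.sqrt(u ** 2 + v ** 2)
    hue = (np.arctan2(-v, -u) / np.pi + 1.0) / 2.0   # angle mapped to [0, 1]
    if max_mag is None:
        max_mag = mag.max() + 1e-8                   # normalize per image
    sat = np.clip(mag / max_mag, 0.0, 1.0)

    # Minimal HSV -> RGB conversion with V = 1.
    h6 = hue * 6.0
    i = np.floor(h6).astype(int) % 6
    f = h6 - np.floor(h6)
    p = 1.0 - sat
    q = 1.0 - sat * f
    t = 1.0 - sat * (1.0 - f)
    one = np.ones_like(sat)
    rgb = np.stack([
        np.choose(i, [one, q, p, p, t, one]),
        np.choose(i, [t, one, one, q, p, p]),
        np.choose(i, [p, p, t, one, one, q]),
    ], axis=-1)
    return (rgb * 255).astype(np.uint8)
```

In practice you would run the official RAFT model on each annotated frame pair, pass the predicted flow through a conversion like this, and save the result into `allframes_flow/`.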
-
Put them as follows:

your_dataset_dir/
├── A2D/
│   ├── allframes/
│   ├── allframes_flow/
│   ├── Annotations_visualize/
│   └── a2d_txt/
│       ├── train.txt
│       └── test.txt
└── J-HMDB/
    ├── allframes/
    ├── allframes_flow/
    ├── Annotations_visualize/
    └── jhmdb_txt/
        ├── train.txt
        └── test.txt
"Annotations_visualize" contains the GT masks for each target object. We have uploaded them to BaiduPan (extraction code: lo50) for convenience.
Please consider citing our work in your publications if you are interested in our research:
@inproceedings{zhao2022modeling,
  title={Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation},
  author={Zhao, Wangbo and Wang, Kai and Chu, Xiangxiang and Xue, Fuzhao and Wang, Xinchao and You, Yang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11737--11746},
  year={2022}
}