[ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing

JustinYuu/MM_Pyramid

MM-Pyramid

[ACM MM 2022] MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing

Jiashuo Yu, Ying Cheng, Rui-Wei Zhao, Rui Feng, Yuejie Zhang

Paper

Requirements

python==3.6.9  
torch==1.8.1  
torchvision==0.9.0
cuda==11.1  
numpy==1.19.5  
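
A quick way to confirm that an environment matches the pins above is to compare installed versions against them. The following is a minimal sketch, not part of the repository: the helper names `base_version`, `find_mismatches`, and `collect_installed` are hypothetical, and it assumes that PyTorch wheels may report a local build suffix (for example `1.8.1+cu111`), which is stripped before comparison.

```python
import importlib
import sys

# Pins copied from the Requirements list above.
REQUIRED = {
    "python": "3.6.9",
    "torch": "1.8.1",
    "torchvision": "0.9.0",
    "cuda": "11.1",
    "numpy": "1.19.5",
}


def base_version(version):
    """Drop a local build suffix, e.g. '1.8.1+cu111' becomes '1.8.1'."""
    return version.split("+", 1)[0]


def find_mismatches(installed, required=REQUIRED):
    """Return {name: (installed, required)} for every pin that is not met.

    A package missing from `installed` is reported with None as its version.
    """
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def collect_installed():
    """Gather versions from the running interpreter (hypothetical helper)."""
    installed = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in ("torch", "torchvision", "numpy"):
        try:
            installed[name] = importlib.import_module(name).__version__
        except ImportError:
            continue  # left out, so find_mismatches reports it as missing
    torch = sys.modules.get("torch")
    # torch.version.cuda is None on CPU-only builds of PyTorch.
    cuda = getattr(getattr(torch, "version", None), "cuda", None)
    if cuda is not None:
        installed["cuda"] = cuda
    return installed
```

Calling `find_mismatches(collect_installed())` returns an empty dictionary when every pin is met. Other versions may well work; the list above only records the versions named in this README.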

Data

Please refer to the LLP (Look, Listen, and Parse) and AVE (Audio-Visual Event) dataset repositories for instructions on obtaining the required datasets.

Training

python main_avvp.py --mode=train

Testing

python main_avvp.py --mode=test
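
The training and testing commands differ only in the `--mode` flag. As an illustration of how a single entry point can serve both, here is a minimal `argparse` sketch. It is an assumption about the general pattern, not the actual contents of `main_avvp.py`; `build_parser`, `run_train`, `run_test`, and `dispatch` are hypothetical names, and the script may accept further options not shown here.

```python
import argparse


def build_parser():
    """Parser with a single --mode flag (hypothetical; the real script may differ)."""
    parser = argparse.ArgumentParser(description="Illustrative --mode dispatcher")
    parser.add_argument(
        "--mode",
        choices=["train", "test"],
        required=True,
        help="train: fit the model; test: evaluate a trained model",
    )
    return parser


def run_train():
    # Placeholder for a training loop.
    return "training"


def run_test():
    # Placeholder for an evaluation loop.
    return "testing"


def dispatch(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and call the matching handler."""
    args = build_parser().parse_args(argv)
    handlers = {"train": run_train, "test": run_test}
    return handlers[args.mode]()
```

With this layout, `dispatch(["--mode=train"])` selects the training handler, and any value other than `train` or `test` is rejected by `argparse` before a handler runs.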

Citation

If you find our work interesting and useful, please consider citing it.

@article{yu2022mmp,
  title={MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing},
  author={Yu, Jiashuo and Cheng, Ying and Zhao, Rui-Wei and Feng, Rui and Zhang, Yuejie},
  journal={arXiv preprint arXiv:2111.12374},
  year={2022}
}  

License

This project is released under the MIT License.
