This repo is a PyTorch implementation applying VAN (Visual Attention Network) to 2D Object Detection. Our implementation is mainly based on VAN-Segmentation and MMDetection. Thanks to the authors.
More details about VAN can be found in the paper Visual Attention Network.
@article{guo2022visual,
title={Visual Attention Network},
author={Guo, Meng-Hao and Lu, Cheng-Ze and Liu, Zheng-Ning and Cheng, Ming-Ming and Hu, Shi-Min},
journal={arXiv preprint arXiv:2202.09741},
year={2022}
}
Install MMDetection and download COCO according to the guidelines in MMDetection.
We recommend following the official instructions for installing the OpenMMLab libraries using mim; otherwise, version mismatches are likely.
pip install wandb timm pycocotools openmim
mim install mmcv-full==1.7.1 mmdet==2.27.0
Because getting a compatible set of versions can be tricky, we provide the exact environments used in our tests in conda-freeze.txt and pip-freeze.txt, respectively.
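Since version mismatches are the most likely failure mode, it can help to check the installed packages before training. The following is a minimal sketch, not part of this repo: `check_pins` and `PINNED` are hypothetical names, and the pins are simply the two versions from the `mim install` command above. It uses only the Python standard library.

```python
from importlib import metadata

# Pins copied from the `mim install` command above; extend as needed.
PINNED = {"mmcv-full": "1.7.1", "mmdet": "2.27.0"}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

An empty result means both pins match. For a full comparison, the provided freeze files remain the authoritative record of the tested environments.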
We used our own fork of MMDetection (commit 9e62e9b4f05aedf5b0b28e5c7619ef8e89097cc1) for our adapted copy-paste mechanism and our evaluation.
We use 3 GPUs for training by default. Run:
./dist_train.sh /path/to/config 3
To evaluate the model, run:
./dist_test.sh /path/to/config /path/to/checkpoint_file 3 --eval bbox
Install torchprofile using:
pip install torchprofile
To calculate FLOPs for a model, run:
bash tools/flops.sh /path/to/config --shape 1333 800
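To make the reported numbers easier to interpret, here is a rough sketch of how the cost of a single convolution is counted in multiply-accumulate operations (MACs), the unit torchprofile reports. This is not what `tools/flops.sh` runs; `conv2d_macs` is a hypothetical helper, the example layer (3 to 64 channels, 7x7 kernel, stride 4, padding 3) is an assumption chosen only for illustration, and the two values passed to `--shape` are treated simply as the two spatial dimensions of the input.

```python
def conv_out_size(size, kernel, stride, padding):
    """Output length along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_macs(c_in, c_out, kernel, h_in, w_in, stride=1, padding=0):
    """MACs of a dense (groups=1) square-kernel 2D convolution, bias ignored."""
    h_out = conv_out_size(h_in, kernel, stride, padding)
    w_out = conv_out_size(w_in, kernel, stride, padding)
    # Each output element needs c_in * kernel * kernel multiply-accumulates.
    return c_in * kernel * kernel * c_out * h_out * w_out


if __name__ == "__main__":
    # Hypothetical stem layer on an input with spatial size 1333 x 800.
    macs = conv2d_macs(c_in=3, c_out=64, kernel=7, h_in=1333, w_in=800,
                       stride=4, padding=3)
    print(f"{macs / 1e9:.3f} GMACs")
```

A whole-model count sums such terms over all layers, which is why the result depends directly on the input shape.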
In our evaluation, we used the updated analysis tools of our MMDetection fork.
This repo is under the Apache-2.0 license. For commercial use, please contact the authors.