- Apr. 16th, 2024: Release checkpoints.
- Mar. 29th, 2024: Release training code for detection; working on re-training with the accelerate version.
- Mar. 20th, 2024: Release training code for classification; working on updating to the accelerate version.
- Mar. 20th, 2024: Code will be published in several days.
Prior efforts in lightweight model development have mainly centered on CNN- and Transformer-based designs, yet both face persistent challenges: CNNs are adept at local feature extraction but compromise resolution, while Transformers offer a global receptive field but escalate computational demands.
We will release all the pre-trained models/logs in a few days.
- Classification on ImageNet-1K

| name | pretrain | resolution | acc@1 | #params | FLOPs | checkpoints/logs |
| --- | --- | --- | --- | --- | --- | --- |
| EfficientVMamba-T | ImageNet-1K | 224x224 | 76.5 | 6M | 0.8G | [ckpt]/[log] |
| EfficientVMamba-S | ImageNet-1K | 224x224 | 78.7 | 11M | 1.3G | [ckpt]/[log] |
| EfficientVMamba-B | ImageNet-1K | 224x224 | 81.8 | 33M | 4.0G | [ckpt]/[log] |
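To evaluate a released checkpoint on the ImageNet validation set, a minimal sketch follows, assuming main.py keeps the VMamba/Swin-Transformer argument conventions (in particular a --pretrained flag for loading weights); the checkpoint path is a placeholder:

# Evaluate EfficientVMamba-T with a downloaded checkpoint (path is a placeholder)
python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=1 --master_addr="127.0.0.1" --master_port=29501 main.py --cfg configs/vssm/vssm_efficient_tiny.yaml --batch-size 128 --data-path /dataset/ImageNet2012 --output /tmp --pretrained /path/to/checkpoint.pth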
The installation steps are the same as for VMamba.
Step 1: Clone the repository. To get started, clone the EfficientVMamba repository and navigate to the project directory:
git clone https://github.com/TerryPei/EfficientVMamba.git
cd EfficientVMamba
Step 2: Environment setup. VMamba recommends setting up a conda environment and installing dependencies via pip. Use the following commands to set up your environment:
conda create -n vmamba
conda activate vmamba
pip install -r requirements.txt
# Install selective_scan and its dependencies
cd selective_scan && pip install . && pytest
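Before launching any training, a quick sanity check (our suggestion, not part of the upstream instructions) can confirm that the PyTorch build works and a CUDA device is visible:

# Verify the PyTorch version and that a CUDA device is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"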
Optional dependencies for detection and segmentation:
pip install mmengine==0.10.1 mmcv==2.1.0 opencv-python-headless ftfy
pip install mmdet==3.3.0 mmsegmentation==1.2.2 mmpretrain==1.2.0
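If you install these, you can optionally confirm that the pinned versions imported correctly (a convenience check we suggest; the packages import as mmdet, mmseg, and mmpretrain):

# Print the installed OpenMMLab package versions
python -c "import mmdet, mmseg, mmpretrain; print(mmdet.__version__, mmseg.__version__, mmpretrain.__version__)"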
Classification:
To train EfficientVMamba models for classification on ImageNet, use the following commands for the different configurations:
# For Tiny
python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=8 --master_addr="127.0.0.1" --master_port=29501 main.py --cfg configs/vssm/vssm_efficient_tiny.yaml --batch-size 128 --data-path /dataset/ImageNet2012 --output /tmp
# For Small
python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=8 --master_addr="127.0.0.1" --master_port=29501 main.py --cfg configs/vssm/vssm_efficient_small.yaml --batch-size 128 --data-path /dataset/ImageNet2012 --output /tmp
# For Base
python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=8 --master_addr="127.0.0.1" --master_port=29501 main.py --cfg configs/vssm/vssm_efficient_base.yaml --batch-size 128 --data-path /dataset/ImageNet2012 --output /tmp
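Note that torch.distributed.launch is deprecated in recent PyTorch releases; if your installation provides torchrun, an equivalent single-node launch (shown here for the Tiny config, with the same main.py arguments) would be:

# torchrun equivalent of the launch above (single node, 8 GPUs)
torchrun --nnodes=1 --nproc_per_node=8 --master_addr="127.0.0.1" --master_port=29501 main.py --cfg configs/vssm/vssm_efficient_tiny.yaml --batch-size 128 --data-path /dataset/ImageNet2012 --output /tmp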
Detection and Segmentation:
For detection and segmentation tasks, follow similar steps using the appropriate config files from the configs/vssm directory. Adjust the --cfg, --data-path, and --output parameters according to your dataset and desired output location.
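For instance, assuming the detection code follows the standard MMDetection 3.x layout (a tools/train.py entry point), training could look like the sketch below; the config filename is hypothetical, so substitute an actual file from configs/vssm:

# Hypothetical example: train a detector with an EfficientVMamba backbone via MMDetection
cd detection
python tools/train.py configs/vssm/mask_rcnn_efficient_vssm_tiny.py --work-dir /tmp/detection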
If this paper helps your research, please consider citing our work:
@article{pei2024efficientvmamba,
title={EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba},
author={Pei, Xiaohuan and Huang, Tao and Xu, Chang},
journal={arXiv preprint arXiv:2403.09977},
year={2024}
}
This project is based on VMamba (paper, code), Mamba (paper, code), Swin-Transformer (paper, code), ConvNeXt (paper, code), and OpenMMLab; the analyze/get_erf.py script is adapted from RepLKNet. Thanks for their excellent works.
This project is released under the Apache 2.0 license.