This is the source code of our ACM MM 2022 paper "SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization". Please cite the following paper if you use our code.
Hongbo Sun, Xiangteng He and Yuxin Peng, "SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization", 30th ACM Multimedia Conference (ACM MM), 2022.
Python 3.7.7
PyTorch 1.5.0
Torchvision 0.6.0
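To compare a local environment against the versions listed above, a small helper like the following may be useful. It is only a sketch and is not part of the released code: the function names `find_mismatches` and `installed_versions` are hypothetical, and treating a build tag such as `1.5.0+cu101` as a match for `1.5.0` is an assumption.

```python
import platform

# Versions copied from the requirement list above.
EXPECTED = {"python": "3.7.7", "torch": "1.5.0", "torchvision": "0.6.0"}


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (installed, expected)} for every entry whose version differs."""
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        # Assumption: a local build suffix ("1.5.0+cu101") still counts as a match.
        if have is None or not (have == want or have.startswith(want + "+")):
            mismatches[name] = (have, want)
    return mismatches


def installed_versions():
    """Collect versions from the running interpreter; missing packages map to None."""
    versions = {"python": platform.python_version()}
    for name in ("torch", "torchvision"):
        try:
            versions[name] = __import__(name).__version__
        except ImportError:
            versions[name] = None
    return versions
```

Calling `find_mismatches(installed_versions())` returns an empty dictionary when the environment matches the list, and otherwise maps each differing package to the pair (found version, listed version).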
Download the CUB-200-2011 and iNaturalist 2017 datasets from their official websites and put them in the corresponding folders.
Start training by executing the following command, which trains the model on the CUB-200-2011 dataset using four GPUs.
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --master_port 10715 --nproc_per_node=4 train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --eval_every 1000 --fp16 --name sample_run --train_batch_size 5
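The launch command above starts one process per GPU through `torch.distributed.launch`, which by default appends a `--local_rank=<index>` argument to each process. The sketch below illustrates how a script launched this way can read that argument, and how the total batch size follows from the flags. It is not the repository's `train.py`: `parse_launch_args` and `effective_batch_size` are hypothetical names, and it assumes that `--train_batch_size` is counted per process and that no gradient accumulation is used.

```python
import argparse


def parse_launch_args(argv=None):
    """Parse only the flags relevant to this sketch (illustrative, not the real train.py)."""
    parser = argparse.ArgumentParser()
    # torch.distributed.launch appends --local_rank=<index> for each spawned process.
    parser.add_argument("--local_rank", type=int, default=-1)
    parser.add_argument("--train_batch_size", type=int, default=5)
    # The remaining flags (--dataset, --split, ...) are ignored here.
    args, _unknown = parser.parse_known_args(argv)
    return args


def effective_batch_size(per_process_batch, nproc_per_node, nnodes=1):
    """Total samples per step, assuming the batch size is counted per process."""
    return per_process_batch * nproc_per_node * nnodes
```

Under these assumptions, `--train_batch_size 5` with `--nproc_per_node=4` on a single node gives `effective_batch_size(5, 4)`, that is, 20 images per step. Changing the number of GPUs therefore changes the total batch size unless `--train_batch_size` is adjusted to compensate.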
For any questions, feel free to contact us (sunhongbo@pku.edu.cn).
Visit our laboratory homepage for more information about our papers, source code, and datasets.