This repo is the official implementation of our NeurIPS 2022 paper "Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning" (arXiv).
- Clone this repo:
git clone https://github.com/dongzelian/SSF.git
cd SSF
- Create a conda virtual environment and activate it:
conda create -n ssf python=3.7 -y
conda activate ssf
- Install CUDA==10.1 with cudnn7 following the official installation instructions
- Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch
- Install timm==0.6.5:
pip install timm==0.6.5
- Install other requirements:
pip install -r requirements.txt
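After the installation steps above, it can help to confirm that the pinned versions are the ones actually installed. The following is a minimal sketch, not part of this repo: it assumes the packages are registered under the distribution names torch, torchvision, and timm, and the helper names (installed_version, matches_pin, check_environment) are hypothetical.

```python
# Versions pinned in the installation steps above.
PINNED = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.6.5"}


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        from importlib import metadata  # standard library on Python >= 3.8
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(package)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # setuptools fallback, needed on Python 3.7
    try:
        return pkg_resources.get_distribution(package).version
    except pkg_resources.DistributionNotFound:
        return None


def matches_pin(installed, pinned):
    """Compare versions, ignoring a local build suffix such as '+cu101'."""
    return installed is not None and installed.split("+")[0] == pinned


def check_environment(pinned=PINNED):
    """Map each package name to (installed version, whether it matches the pin)."""
    report = {}
    for name, wanted in pinned.items():
        found = installed_version(name)
        report[name] = (found, matches_pin(found, wanted))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        print(f"{name}: installed={found} pinned={PINNED[name]} ok={ok}")
```

Run it inside the activated ssf environment; any line ending in ok=False points at a package whose version differs from the one listed above.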
- FGVC & vtab-1k
You can follow VPT to download them.
Because the original vtab dataset is processed with TensorFlow scripts and some of the datasets are tricky to process, we have also uploaded the extracted vtab-1k dataset to OneDrive for your convenience. You can download it from here and use it directly with our vtab.py. (Note that the license is in the vtab dataset.)
- CIFAR-100
wget https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
- For ImageNet-1K, download it from http://image-net.org/, and move validation images to labeled sub-folders. The file structure should look like:
$ tree data
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
└── val
    ├── class1
    │   ├── img4.jpeg
    │   ├── img5.jpeg
    │   └── ...
    ├── class2
    │   ├── img6.jpeg
    │   └── ...
    └── ...
- Robustness & OOD datasets
Prepare ImageNet-A, ImageNet-R and ImageNet-C for evaluation.
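For the ImageNet-1K step above, moving validation images into labeled sub-folders can be scripted. The sketch below is only an illustration under stated assumptions: the validation images start out in one flat val folder, and you have a plain-text mapping file with one "<image filename> <class folder name>" pair per line. That mapping file format and the function names are hypothetical and not provided by this repo.

```python
import shutil
from pathlib import Path


def read_mapping(mapping_file):
    """Parse lines of the form '<image filename> <class folder name>'."""
    mapping = {}
    for line in Path(mapping_file).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        filename, class_name = line.split()
        mapping[filename] = class_name
    return mapping


def sort_val_images(val_dir, mapping):
    """Move each listed image into val_dir/<class>/ and return how many were moved."""
    val_dir = Path(val_dir)
    moved = 0
    for filename, class_name in mapping.items():
        src = val_dir / filename
        if not src.is_file():
            continue  # already moved on a previous run, or not present
        class_dir = val_dir / class_name
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / filename))
        moved += 1
    return moved
```

A call such as sort_val_images("data/imagenet/val", read_mapping("val_map.txt")) would produce the val layout shown in the tree, where val_map.txt is a placeholder name. Running it a second time moves nothing, because files that are no longer in the flat folder are skipped.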
- For pre-trained ViT-B/16, Swin-B, and ConvNext-B models on ImageNet-21K, the model weights will be automatically downloaded when you fine-tune a pre-trained model via SSF. You can also manually download them from ViT, Swin Transformer, and ConvNext.
- For the pre-trained AS-MLP-B model on ImageNet-1K, you can manually download it from AS-MLP.
To fine-tune a pre-trained ViT model via SSF on CIFAR-100 or ImageNet-1K, run:
bash train_scripts/vit/cifar_100/train_ssf.sh
or
bash train_scripts/vit/imagenet_1k/train_ssf.sh
You can also find similar scripts for the Swin, ConvNext, and AS-MLP models. You can easily reproduce our results. Enjoy!
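To evaluate the performance of a model fine-tuned via SSF on the Robustness & OOD datasets, run the script below, where imagenet_a(r, c) is shorthand for imagenet_a, imagenet_r, or imagenet_c: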
bash train_scripts/vit/imagenet_a(r, c)/eval_ssf.sh
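If you want to run the evaluation on all three robustness and OOD datasets in one go, a small driver along these lines may help. It is a sketch, not part of this repo, and it rests on one assumption: that the shorthand imagenet_a(r, c) expands to the directories imagenet_a, imagenet_r, and imagenet_c, each holding an eval_ssf.sh. Scripts that are not found are skipped rather than guessed at, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: "imagenet_a(r, c)" stands for these three directory names.
EVAL_DATASETS = ("imagenet_a", "imagenet_r", "imagenet_c")


def eval_script_paths(model="vit", root="train_scripts"):
    """Build the expected eval script path for each robustness/OOD dataset."""
    return [Path(root) / model / name / "eval_ssf.sh" for name in EVAL_DATASETS]


def run_eval_scripts(paths):
    """Run every script that exists with bash; return the ones that were missing."""
    missing = []
    for script in paths:
        if not script.is_file():
            missing.append(script)
            continue
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(["bash", str(script)], check=True)
    return missing


if __name__ == "__main__":
    for script in run_eval_scripts(eval_script_paths()):
        print(f"skipped (not found): {script}")
```

Run it from the repo root so the relative train_scripts paths resolve.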
If this project is helpful for you, you can cite our paper:
@InProceedings{Lian_2022_SSF,
title={Scaling \& Shifting Your Features: A New Baseline for Efficient Model Tuning},
author={Lian, Dongze and Zhou, Daquan and Feng, Jiashi and Wang, Xinchao},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2022}
}
The code is built upon timm. The processing of the vtab-1k dataset follows VPT, the vtab GitHub repo, and NOAH.