Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation


Figure: (Left) Overview of FreqFit. (Right) Performance gains with ImageNet-21K (left) and MoCo (right).

This repository is heavily based on the official PyTorch implementation of Visual Prompt Tuning (ECCV 2022).

Environment settings

See env_setup.sh or assets/freqfit.yml

Experiments

Datasets preparation

Please follow the VPT dataset preparation instructions and VTAB_SETUP.md

Pre-trained model preparation

Download the pre-trained Transformer-based backbones and place them in the pretrained folder or in the path specified by MODEL.MODEL_ROOT.

Note that for MoCo v3, unlike VPT, we use the self-supervised pre-trained weights.

Once downloaded, update the pre-trained backbone names in MODEL_ZOO in src/build_vit_backbone.py accordingly.
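For illustration only, the mapping has roughly the shape sketched below; the key sup_vitb16_imagenet21k is the encoder name used with run.sh later in this README, while the other key and both checkpoint filenames are placeholders rather than the repo's actual entries.

# Illustrative sketch only; the real MODEL_ZOO in src/build_vit_backbone.py may differ.
# Keys are encoder names passed to run.sh; values are checkpoint files expected
# under MODEL.MODEL_ROOT (the pretrained folder). Filenames here are placeholders.
MODEL_ZOO = {
    "sup_vitb16_imagenet21k": "<supervised ImageNet-21k ViT-B/16 checkpoint>",
    "mocov3_vitb16": "<MoCo v3 ViT-B/16 checkpoint>",  # hypothetical key
}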

Pre-trained Backbone | Pre-trained Objective | Link
ViT-B/16 | Supervised | link
ViT-B/16 | MoCo v3 | link
ViT-B/16 | MAE | link
ViT-B/16 | CLIP | link

Key Configs

Configs related to each supported PEFT method are listed in src/config/configs.py. They can also be overridden in run.sh.

This repo supports the FreqFit and Scale-Shift (SSF) fine-tuning methods presented in the paper. To switch between them, go to run.sh and set FREQFIT "freqfit" or FREQFIT "ssf".

FreqFit code

  • The code for the FreqFit method is in src/models/gfn.py

  • The code for integrating FreqFit into a PEFT method can be found in the Vision Transformer backbone file (vit.py) of each method, such as src/models/vit_backbones/vit.py; a minimal sketch of the idea follows below.
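For orientation, here is a minimal, self-contained sketch of the kind of frequency-domain filtering FreqFit applies to the token sequence. It is not the code in src/models/gfn.py; the class name, shapes, and initialization are illustrative assumptions.

import torch
import torch.nn as nn


class FreqFilterSketch(nn.Module):
    """Illustrative FFT-based token filter; not the exact implementation in src/models/gfn.py."""

    def __init__(self, num_tokens: int = 197, dim: int = 768):
        super().__init__()
        # Learnable complex filter over the token-frequency axis, stored as (real, imag).
        self.filter = nn.Parameter(torch.randn(num_tokens // 2 + 1, dim, 2) * 0.02)
        # Scale and shift so the module can start close to an identity mapping.
        self.scale = nn.Parameter(torch.ones(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_tokens, dim)
        residual = x
        freq = torch.fft.rfft(x, dim=1, norm="ortho")                 # tokens -> frequency domain
        freq = freq * torch.view_as_complex(self.filter)              # element-wise spectral filtering
        out = torch.fft.irfft(freq, n=residual.shape[1], dim=1, norm="ortho")  # back to token domain
        return out * self.scale + self.shift + residual               # residual path preserves the PEFT features

A module like this would be inserted between frozen transformer blocks, with only its filter, scale, and shift parameters trained alongside the PEFT parameters (compare the filter_layer / ssf_scale / ssf_shift names in the VERA example below).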

Adding new PEFT method

  • To add a new PEFT method that is available in HuggingFace PEFT, simply edit src/models/vit_models.py:
...
# add VeRA (https://huggingface.co/docs/peft/en/package_reference/vera)
elif transfer_type == "vera":
    from peft import VeraConfig, get_peft_model

    config = VeraConfig(
        r=cfg.MODEL.VERA.R,
        target_modules=["attn.query", "attn.value", "attn.key", "attn.out", "ffn.fc1", "ffn.fc2"],
        vera_dropout=0.1,
        bias="vera_only",
        modules_to_save=["classifier"],
    )

    self.enc = get_peft_model(self.enc, config)
    # keep the FreqFit / SSF parameters trainable alongside the VeRA adapter
    for k, p in self.enc.named_parameters():
        if "ssf_scale" in k or "ssf_shift" in k or "filter_layer" in k:
            p.requires_grad = True
...

In run.sh, set MODEL.TRANSFER_TYPE "vera". Refer to the HuggingFace PEFT documentation for config details.

  • To add a custom PEFT method, implement the method and register it in src/models/build_vit_backbone.py and src/models/vit_models.py. Refer to the LoRA implementation at src/models/vit_lora/vit_lora.py as an example, and see the sketch below.
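As a rough starting point, a custom PEFT method usually boils down to a small trainable module that wraps or sits beside frozen backbone layers. The sketch below is a hypothetical LoRA-style linear wrapper, not code from this repo; the class name and the way it would be wired into vit_models.py are assumptions.

import torch
import torch.nn as nn


class CustomAdapterLinear(nn.Module):
    """Hypothetical LoRA-style wrapper: frozen base linear plus a low-rank trainable update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # keep the pre-trained weight frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)     # start as an identity of the frozen layer
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.lora_b(self.lora_a(x)) * self.scaling

In vit_models.py, one would then add a new transfer_type branch that swaps the target nn.Linear layers for this wrapper and keeps everything else frozen, mirroring the LoRA and VeRA examples above.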

Run experiments

Modify run.sh to match your setup. Then run:

bash run.sh [data_name] [encoder] [batch_size] [base_lr] [wd_lr] [num_tokens] [adapter_ratio] [freqfit/ssf]

For example, to run the CIFAR-100 dataset with an ImageNet-21k supervised backbone using LoRA combined with FreqFit, first make sure MODEL.TRANSFER_TYPE and the other LoRA configs are set in run.sh:

--config-file configs/finetune/cub.yaml \
MODEL.TRANSFER_TYPE "lora" \
MODEL.LORA.RANK "8" \
MODEL.LORA.ALPHA "8" \

Then, execute:

bash run.sh cifar100 sup_vitb16_imagenet21k 64 0.1 0.01 0 0 freqfit

License

The majority of FreqFiT is licensed under the CC-BY-NC 4.0 license (see LICENSE for details). Portions of the project are available under separate license terms: google-research/task_adaptation and huggingface/transformers are licensed under the Apache 2.0 license; Swin-Transformer, ConvNeXt and ViT-pytorch are licensed under the MIT license; and MoCo-v3 and MAE are licensed under the Attribution-NonCommercial 4.0 International license.
