The official implementation of SFTransT. arXiv, IEEE T-CSVT
SFTransT follows the Siamese matching framework, taking a template frame and a search frame as input. Swin-Tiny serves as the backbone, and its cross-scale features are fused into the embedded features. A Multi-Head Cross-Attention (MHCA) module then strengthens the interaction between the template and search features. Its output is fed into the core component, the Spatial-Frequency Transformer, which jointly models a Gaussian spatial prior and low-/high-frequency feature information. In more detail, the GGN predicts a Gaussian spatial attention map that is added to the self-attention matrix. The GPHA then decomposes the resulting representation into low- and high-pass branches to achieve all-pass information propagation. Finally, the enhanced features are fed into the classification and regression heads for target object tracking.
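The two ideas in the paragraph above, adding a Gaussian spatial prior to the attention scores and splitting a signal into low- and high-pass parts, can be illustrated with a small toy example. This is only a sketch under stated assumptions, not the repository's implementation: the prior here is a fixed 1-D Gaussian rather than one predicted by the GGN, the low-pass filter is a plain moving average, and all function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_prior(n, center, sigma):
    """Assumption: a fixed 1-D Gaussian over token positions.
    In SFTransT this prior is predicted by the GGN instead."""
    return [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(n)]


def attention_weights(queries, keys, prior):
    """Scaled dot-product scores with the prior added before the softmax."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) + p
            for k, p in zip(keys, prior)
        ]
        weights.append(softmax(scores))
    return weights


def split_low_high(signal, radius=1):
    """Assumption: low-pass = moving average, high-pass = residual,
    so that low + high reproduces the input (all-pass)."""
    n = len(signal)
    low = []
    for i in range(n):
        window = signal[max(0, i - radius): min(n, i + radius + 1)]
        low.append(sum(window) / len(window))
    high = [s - l for s, l in zip(signal, low)]
    return low, high
```

With identical keys the similarity scores are uniform, so the attention weights are shaped by the prior alone and peak at its center. The residual construction in `split_low_high` guarantees that the two branches sum back to the original signal.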
| Tracker | GOT-10K (AO) | LaSOT (AUC) | TrackingNet (AUC) | UAV123 (AUC) | LaSOT-ext (AUC) | TNL2k (AUC) | WebUAV-3M |
|---|---|---|---|---|---|---|---|
| SFTransT | 72.7 | 69.0 | 82.9 | 71.3 | 46.4 | 54.6 | 58.2 |
- Create and activate a conda environment
conda create -n sftranst python=3.7
conda activate sftranst
- Install the necessary packages. Run the commands one at a time so that each installation completes successfully.
conda install -c pytorch pytorch=1.5 torchvision=0.6.1 cudatoolkit=10.2
conda install matplotlib pandas tqdm
pip install opencv-python tb-nightly visdom scikit-image tikzplotlib gdown
conda install cython scipy
sudo apt-get install libturbojpeg
pip install pycocotools jpeg4py
pip install wget yacs
pip install shapely==1.6.4.post2
pip install timm
pip install einops
- Add soft links to the datasets under './dataset/'
- Set up the environment.
# Environment settings for ltr. Saved at ltr/admin/
cd SFTransT
python -c "from ltr.admin.environment import create_default_local_file; create_default_local_file()"
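Before training, it can help to confirm that the packages installed above are importable and that the dataset links exist. The following is a minimal sketch, not part of the repository: the helpers `missing_packages` and `link_dataset` are hypothetical, and the list holds the import names under which the installed packages are normally exposed (for example, `opencv-python` imports as `cv2`).

```python
import importlib.util
import os

# Import names for the packages installed in the steps above.
REQUIRED = [
    "torch", "torchvision", "matplotlib", "pandas", "tqdm", "cv2",
    "tensorboard", "visdom", "skimage", "tikzplotlib", "gdown",
    "scipy", "pycocotools", "jpeg4py", "wget", "yacs", "shapely",
    "timm", "einops",
]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def link_dataset(source_dir, link_root="./dataset"):
    """Create link_root/<dataset name> as a soft link to source_dir.

    Does nothing if a file or link with that name already exists.
    """
    os.makedirs(link_root, exist_ok=True)
    name = os.path.basename(os.path.normpath(source_dir))
    link_path = os.path.join(link_root, name)
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(source_dir), link_path)
    return link_path


if __name__ == "__main__":
    print("Missing packages:", missing_packages(REQUIRED))
```

Calling `link_dataset` with the directory that holds a downloaded dataset creates the corresponding entry under `./dataset/`; the source location is whatever path the data lives at on your machine.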
Download the pretrained Swin-Tiny model and put it into the
Run the training command:
cd SFTransT/ltr
conda activate sftranst
python --train_module sftranst --train_name sftranst_cfa_gpha_mlp
- For UAV, OTB, and GOT10k:
cd SFTransT/pysot_toolkit
conda activate sftranst
python --cuda 0 --begin 99 --end 100 --interval 1 --folds sftranst_cfa_gpha_mlp --subset test
- For other datasets, like LaSOT:
python --dataset LaSOT --cuda 5 --epoch 300 --win 0.50
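For readers unfamiliar with the AO and AUC scores reported in the results table, the sketch below shows how such overlap-based metrics are commonly computed from predicted and ground-truth boxes. It is an illustration only, not the evaluation toolkit's code: boxes are assumed to be `(x, y, w, h)` tuples, the 21 evenly spaced thresholds are an assumption, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(preds, gts):
    """AO: mean IoU over all frames."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(overlaps) / len(overlaps)


def success_auc(preds, gts, num_thresholds=21):
    """Area under the success plot: the success rate (fraction of frames
    with IoU above a threshold) averaged over thresholds in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)
```

Benchmark toolkits differ in details such as the threshold grid and how frames without a visible target are handled, so numbers from this sketch are not guaranteed to match official evaluation results.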
This code combines the Python tracking framework PyTracking
with the PySOT-Toolkit.
Thanks to TransT, which first introduced the Transformer into visual tracking.
author={Tang, Chuanming and Wang, Xiao and Bai, Yuanchao and Wu, Zhe and Zhang, Jianlin and Huang, Yongmei},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
title={Learning Spatial-Frequency Transformer for Visual Object Tracking},
title={Learning Spatial-Frequency Transformer for Visual Object Tracking},
author={Tang, Chuanming and Wang, Xiao and Bai, Yuanchao and Wu, Zhe and Zhang, Jianlin and Huang, Yongmei},
journal={arXiv preprint arXiv:2208.08829},