Visual object tracking has been widely deployed on aerial platforms, where it faces extremely complex conditions. To address the inefficient long-range modeling of traditional fully convolutional networks, the Transformer has been introduced into state-of-the-art tracking frameworks. Benefiting from the full receptive field of global attention, these Transformer trackers can efficiently model long-range information. However, the vanilla Transformer structure lacks sufficient inductive bias, and directly adopting global attention leads to over-focusing on global information, which harms the modeling of local details. This work proposes a local perception-aware Transformer for aerial tracking, i.e., LPAT. Specifically, this novel tracker is constructed with modified local-recognition attention and a local element correction network, which process information via a local-modeling-to-global-search mechanism in order to grab local details and strengthen the local inductive bias of the Transformer structure. The Transformer encoder with local-recognition attention fuses local features for accurate feature modeling, and the local element correction network strengthens the capability of both the Transformer encoder and decoder to distinguish local details. The proposed method achieves competitive accuracy and robustness on several benchmarks with 316 sequences in total. The proposed tracker's practicability and efficiency have been validated by real-world tests on a typical aerial platform.
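The local-modeling-to-global-search idea can be illustrated with a short sketch. The block below is not the released implementation; it is a minimal PyTorch sketch assuming local attention is realized as non-overlapping windowed self-attention followed by standard global self-attention, and the names (`LocalThenGlobalBlock`, `window_size`) are hypothetical.

```python
import torch
import torch.nn as nn


class LocalThenGlobalBlock(nn.Module):
    """Windowed (local) self-attention followed by global self-attention."""

    def __init__(self, dim=128, heads=8, window_size=16):
        super().__init__()
        self.window_size = window_size
        self.local_attn = nn.MultiheadAttention(dim, heads)
        self.global_attn = nn.MultiheadAttention(dim, heads)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):  # x: (B, N, C), N divisible by window_size
        B, N, C = x.shape
        w = self.window_size
        # Local step: attend only within non-overlapping windows of w tokens,
        # which injects a locality bias that plain global attention lacks.
        xw = x.reshape(B * (N // w), w, C).transpose(0, 1)  # (w, B*N/w, C)
        local, _ = self.local_attn(xw, xw, xw)
        x = self.norm1(x + local.transpose(0, 1).reshape(B, N, C))
        # Global step: full attention over the locally refined tokens.
        xg = x.transpose(0, 1)                              # (N, B, C)
        glob, _ = self.global_attn(xg, xg, xg)
        return self.norm2(x + glob.transpose(0, 1))


tokens = torch.randn(2, 256, 128)             # e.g. a flattened 16x16 feature map
print(LocalThenGlobalBlock()(tokens).shape)   # torch.Size([2, 256, 128])
```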
This figure shows the workflow of our tracker.
- 📹 Some qualitative evaluations and real-world tests are reported in the demo, demonstrating the practicality of the LPAT tracker.
This code has been tested on Ubuntu 18.04, Python 3.8.3, PyTorch 1.6.0 (torchvision 0.7.0), and CUDA 10.2. Please install the related libraries before running this code:
```bash
pip install -r requirements.txt
```
Download the pretrained model general_model from BaiduNetdisk (code: o91m) or GoogleDrive and put it into the tools/snapshot directory.
Download the testing datasets and put them into the test_dataset directory. If you want to test the tracker on a new dataset, please refer to pysot-toolkit to set up test_dataset.
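As a quick sanity check that a dataset is wired up correctly, the snippet below loads it through the pysot-style toolkit this codebase builds on; the import path and the dataset_root shown are assumptions following pysot conventions and may differ slightly in this repository.

```python
# Hedged sketch: load a dataset via the pysot-style toolkit and inspect
# one sequence. The import path and dataset_root are assumed conventions.
from toolkit.datasets import DatasetFactory

dataset = DatasetFactory.create_dataset(name='UAV10fps',
                                        dataset_root='test_dataset/UAV123_10fps',
                                        load_img=False)
for video in dataset:
    # each video exposes frame paths and ground-truth boxes (x, y, w, h)
    print(video.name, len(video.gt_traj))
    break
```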
```bash
python test.py \
    --dataset UAV10fps \                      # dataset_name
    --snapshot snapshot/general_model.pth     # tracker_name
```
The testing result will be saved in the results/dataset_name/tracker_name directory.
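In pysot-style toolkits, each sequence is saved as one text file with one predicted box per line in x,y,w,h format. A minimal way to inspect a saved result (the sequence name and paths below are illustrative):

```python
import numpy as np

# One result file per sequence; each line holds a predicted box "x,y,w,h".
# The sequence name 'bike1' and the paths are illustrative.
boxes = np.loadtxt('results/UAV10fps/general_model/bike1.txt', delimiter=',')
print(boxes.shape)   # (num_frames, 4)
print(boxes[0])      # predicted box for the first frame
```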
Download the training datasets:
Note: train_dataset/dataset_name/readme.md lists detailed instructions on how to generate the training datasets; the sketch below illustrates the kind of crop these steps typically produce.
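For orientation, pysot-based data generation usually produces SiamFC-style patches: a square region centered on the target with added context, resized to a fixed size. The sketch below follows common pysot defaults (out_size, context amount) and is an assumption, not this repository's exact script.

```python
import cv2
import numpy as np

def crop_around_target(image, box, out_size=511, context=0.5):
    """SiamFC-style crop sketch: square context region around the target,
    mapped to an out_size x out_size patch (assumed defaults, color image)."""
    x, y, w, h = box                              # target box (x, y, w, h)
    cx, cy = x + w / 2, y + h / 2                 # target center
    wc = w + context * (w + h)                    # width with added context
    hc = h + context * (w + h)                    # height with added context
    s = np.sqrt(wc * hc)                          # side of the context square
    scale = out_size / s
    # Affine map that puts the target center at the patch center.
    mapping = np.array([[scale, 0, out_size / 2 - cx * scale],
                        [0, scale, out_size / 2 - cy * scale]], np.float32)
    return cv2.warpAffine(image, mapping, (out_size, out_size),
                          borderMode=cv2.BORDER_CONSTANT,
                          borderValue=tuple(map(int, image.mean(axis=(0, 1)))))
```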
To train the LPAT model, run train.py with the desired configs:
```bash
cd tools
python train.py
```
We provide the tracking results (code: 0311) on UAV123@10fps, UAV123, and DTB70. If you want to evaluate the tracker, please put those results into the results directory.
```bash
python eval.py \
    --tracker_path ./results \            # result path
    --dataset UAV10fps \                  # dataset_name
    --tracker_prefix 'general_model'      # tracker_name
```
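For reference, the one-pass-evaluation numbers eval.py reports correspond to the standard success and precision definitions. The snippet below reimplements them for illustration only, assuming predictions and ground truth as aligned (N, 4) arrays of x, y, w, h boxes.

```python
import numpy as np

def iou(a, b):
    """IoU between aligned (N, 4) arrays of boxes in x, y, w, h format."""
    lt = np.maximum(a[:, :2], b[:, :2])                        # intersection top-left
    rb = np.minimum(a[:, :2] + a[:, 2:], b[:, :2] + b[:, 2:])  # bottom-right
    inter = np.prod(np.clip(rb - lt, 0, None), axis=1)
    union = np.prod(a[:, 2:], axis=1) + np.prod(b[:, 2:], axis=1) - inter
    return inter / np.maximum(union, 1e-12)

def success_auc(pred, gt, thresholds=np.linspace(0, 1, 21)):
    """Success: area under the overlap-threshold curve."""
    overlaps = iou(pred, gt)
    return np.mean([np.mean(overlaps > t) for t in thresholds])

def precision_at_20(pred, gt):
    """Precision: fraction of frames with center error <= 20 pixels."""
    err = np.linalg.norm((pred[:, :2] + pred[:, 2:] / 2)
                         - (gt[:, :2] + gt[:, 2:] / 2), axis=1)
    return np.mean(err <= 20)
```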
The code is implemented based on pysot (Bo Li et al.), and the tracking procedure is borrowed from HiFT (Ziang Cao et al.). We would like to express our sincere thanks to the contributors.