PyTorch training code for Contextual Similarity Distillation for Asymmetric Image Retrieval. We propose a flexible contextual similarity distillation framework that enhances the small query model and keeps its output features compatible with those of the large gallery model, which is crucial for asymmetric retrieval.
What it is. A well trained gallery model
About the code. SSP is simple to implement and experiment with. The training code reflects this: it is not a library, just a single script, Ours_training.py, that imports the model and criterion definitions and uses standard training loops.
- Python 3
- PyTorch tested on 1.7.1+, torchvision 0.8.2+
- numpy
- matplotlib
- timm==0.4.12
- faiss>=1.6.3
There are no extra compiled components in SSP and package dependencies are minimal, so the code is simple to use. We provide instructions for installing the dependencies via conda. Install PyTorch 1.7.1+ and torchvision 0.8.2+:
conda install -c pytorch pytorch torchvision
Before going further, please check out Filip Radenovic's great repository on image retrieval. We use his code and model to extract features for training images. If you use this code in your research, please also cite his work! (link to license)
You should also check out the Google landmarkv2 GitHub repository and the SfM120k website. We use their training images. If you use these images in your research, please also cite their work!
Download and extract SfM120k train and val images with annotations from http://cmp.felk.cvut.cz/cnnimageretrieval/.
Download and extract Google landmarkv2 train and val images with annotations from https://github.com/cvdfoundation/google-landmark.
Download the ROxf and RPar datasets with annotations. We expect the directory structure to be the following:
/data/
├─ R101-DELG.pth # large gallery model
├─ oldclassifier.pkl # classifier from large gallery model
├─ train # training datasets
|   ├─ GLDv2
|   |   ├─ train.csv
|   |   ├─ GLDv2_Triplet.pkl
|   |   ├─ train_clean.csv
|   |   ├─ GLDv2-clean-train-split.pkl
|   |   ├─ GLDv2-clean-val-split.pkl
|   |   └─ train
|   └─ retrieval-SfM-120k
|       ├─ ims
|       └─ retrieval-SfM-120k.pkl
├─ PQ_centroids # anchor points
|   ├─ R1M_DELG-R101-Paris-M-PQ_32_256_centroids.pkl
|   ├─ R1M_DELG-R101-Paris-M-PQ_16_256_centroids.pkl
|   ├─ R1M_GeM-R101-PQ_32_256_centroids.pkl
|   └─ R1M_GeM-R101-PQ_32_256_centroids.pkl
├─ train_features # training features
|   ├─ SFM_R101_DELG.pkl
|   ├─ GLDv2_R101_DELG.pkl
|   ├─ R1M_R101_DELG.pkl
|   ├─ R1M_R101_GeM.pkl
|   ├─ SFM_R101_GeM.pkl
|   └─ GLDv2_R101_GeM.pkl
├─ test_features # testing features
|   ├─ R101-DELG-rparis6k.pkl
|   ├─ R101-DELG-roxford5k.pkl
|   ├─ R101-GeM-rparis6k.pkl
|   └─ R101-GeM-roxford5k.pkl
└─ test # testing images
    ├─ roxford5k
    |   ├─ jpg
    |   └─ gnd_roxford5k.pkl
    └─ rparis6k
        ├─ jpg
        └─ gnd_rparis6k.pkl
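Before launching the scripts, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` and the choice of which entries to check are assumptions made for illustration, and the list covers only a subset of the tree.

```python
from pathlib import Path

# A subset of the entries from the directory tree above, relative to the data root.
EXPECTED = [
    "R101-DELG.pth",
    "oldclassifier.pkl",
    "train/GLDv2/train.csv",
    "train/GLDv2/train_clean.csv",
    "train/retrieval-SfM-120k/ims",
    "train/retrieval-SfM-120k/retrieval-SfM-120k.pkl",
    "PQ_centroids",
    "train_features",
    "test_features",
    "test/roxford5k/jpg",
    "test/roxford5k/gnd_roxford5k.pkl",
    "test/rparis6k/jpg",
    "test/rparis6k/gnd_rparis6k.pkl",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected entries (files or directories) that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # "/data" matches the root shown in the tree; change it if your data lives elsewhere.
    for rel in missing_paths("/data"):
        print("missing:", rel)
```

An empty result means every listed entry was found; anything printed should be downloaded or generated before training.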
Extract features of the training datasets with the large gallery model:
sh ./scripts/extract_feature_R101_DELG.sh or sh ./scripts/extract_feature_R101_GeM.sh
Generate anchor points from another training dataset:
sh ./scripts/anchor_points_generation.sh
When using GLDv2 as the training dataset, to train SSP on a single node with 2 GPUs for 5 epochs, run:
sh ./scripts/experiment_ours_GLDv2.sh
When using SfM120k as the training dataset, to train SSP on a single node with 1 GPU for 10 epochs, run:
sh ./scripts/experiment_ours_SFM.sh
We also provide implementations of the other methods we compare against. All models are trained with SGD at a learning rate of 0.001, and a linearly decaying scheduler reduces the learning rate to 0 by the time the desired number of steps is reached.
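The schedule described above can be sketched as follows. This is an illustration under stated assumptions, not the code in the provided scripts: the function names are hypothetical, SGD options other than the learning rate (momentum, weight decay) are left at PyTorch defaults because the text does not specify them, and the scheduler is assumed to be stepped once per training step.

```python
def linear_decay_factor(step, total_steps):
    """Multiplier applied to the base learning rate: 1 at step 0, 0 at total_steps."""
    return max(0.0, 1.0 - step / total_steps)


def build_sgd_with_linear_decay(params, total_steps, base_lr=0.001):
    """Create an SGD optimizer whose learning rate decays linearly to 0.

    Momentum and weight decay are not given in the text, so PyTorch defaults
    are used here (assumption).
    """
    import torch  # imported here so the schedule itself can be inspected without PyTorch

    optimizer = torch.optim.SGD(params, lr=base_lr)
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda step: linear_decay_factor(step, total_steps)
    )
    return optimizer, scheduler


if __name__ == "__main__":
    total = 1000  # hypothetical number of training steps
    for step in (0, 250, 500, 1000):
        print(step, 0.001 * linear_decay_factor(step, total))
```

In a training loop, `scheduler.step()` would be called after each `optimizer.step()`, so the learning rate reaches 0 exactly when `total_steps` updates have been made.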
HVS and LCE need to use the classifier of the large gallery model and the corresponding training dataset, so they can only be trained on GLDv2.
For HVS:
sh ./scripts/experiment_HVS_GLDv2.sh
For LCE:
sh ./scripts/experiment_LCE_GLDv2.sh
AML requires triplet annotations. To train it on the SfM120k dataset, run:
sh ./scripts/experiment_AML_SFM.sh
When using GLDv2 as the training set, it is necessary to generate the triplet annotations first. For a training image, we consider the images of the same category as its positive samples and the images of other categories as its negative samples. The code used to generate the annotations is in GLDv2_build_contrastive_dataset.
Then run:
sh ./scripts/experiment_AML_GLDv2.sh
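The positive/negative rule used for the GLDv2 triplet annotations (same category is positive, any other category is negative) can be illustrated with a small sketch. This is not the code in GLDv2_build_contrastive_dataset, which may differ: the function name `build_triplets`, the `num_negatives` parameter, and the random sampling strategy are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_triplets(labels, num_negatives=5, seed=0):
    """Build (anchor, positive, negatives) index tuples from per-image category labels.

    labels[i] is the category of image i. Images with no other image in their
    category, or with no image from a different category, are skipped.
    """
    rng = random.Random(seed)

    # Group image indices by category so positives can be looked up directly.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    triplets = []
    for anchor, label in enumerate(labels):
        positives = [i for i in by_label[label] if i != anchor]
        # Quadratic in the number of images; acceptable for a sketch only.
        negatives = [i for i, other in enumerate(labels) if other != label]
        if not positives or not negatives:
            continue  # no valid triplet for this image
        positive = rng.choice(positives)
        chosen = rng.sample(negatives, min(num_negatives, len(negatives)))
        triplets.append((anchor, positive, chosen))
    return triplets


if __name__ == "__main__":
    toy_labels = ["a", "a", "b", "b", "c"]  # toy category labels, not real GLDv2 data
    for triplet in build_triplets(toy_labels, num_negatives=2):
        print(triplet)
```

In the toy example, the single image of category "c" produces no triplet because it has no positive sample.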