This is the code for our paper "Solving Inefficiency of Self-supervised Representation Learning" (https://arxiv.org/abs/2104.08760).

Authors: Guangrun Wang, Keze Wang, Guangcong Wang, Philip H.S. Torr, and Liang Lin
A few of the pretrained models are listed below:
Model | Top-1 Acc | Download |
---|---|---|
Triplet, 200 epochs | 73.8% | ⬇️ |
Triplet, 700-900 epochs | 75.9% | ⬇️ |
An example SSL training script on ImageNet:

```shell
bash tools/dist_train.sh configs/selfsup/triplet/r50_bs4096_accumulate4_ep1000_fp16_triplet_gpu3090.py 8
```
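The config name `bs4096_accumulate4` suggests (our reading, an assumption about this repo's naming) that the effective batch size of 4096 is reached by accumulating gradients over 4 smaller forward/backward passes before each optimizer step. A minimal stdlib sketch of that accumulation pattern, with plain numbers standing in for gradient tensors:

```python
ACCUMULATE_STEPS = 4

def accumulated_updates(micro_batch_grads, accumulate=ACCUMULATE_STEPS):
    """Average each run of `accumulate` micro-batch gradients into one update."""
    updates, running = [], 0.0
    for i, grad in enumerate(micro_batch_grads, start=1):
        running += grad              # accumulate instead of stepping immediately
        if i % accumulate == 0:      # one optimizer step per `accumulate` micro-batches
            updates.append(running / accumulate)
            running = 0.0
    return updates

print(accumulated_updates([1.5, 1.0, 1.0, 1.0]))  # [1.125]
```

In a real training loop the division by `accumulate` is usually folded into the loss before `backward()`, but the averaging effect is the same.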
An example linear-evaluation script on ImageNet:

```shell
python tools/extract_backbone_weights.py xxxxxxxxx/ssl_ep940.pth xxxxxx/release_smooth_ep940.pth
bash benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/imagenet/r50_last_cos_for_MoreSslEpoch.py xxxxxx/release_smooth_ep940.pth
```
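The extraction step typically keeps only the backbone parameters of the full SSL checkpoint, dropping the projection and prediction heads, so that a plain classifier backbone can load the result. A hedged sketch of that key-filtering logic (the actual `extract_backbone_weights.py` may differ in details; toy lists stand in for tensors):

```python
def extract_backbone_weights(state_dict, prefix="backbone."):
    """Return only the backbone weights, with the prefix stripped."""
    return {k[len(prefix):]: v for k, v in state_dict.items()
            if k.startswith(prefix)}

# Toy checkpoint standing in for the contents of ssl_ep940.pth.
checkpoint = {
    "backbone.conv1.weight": [0.1],
    "backbone.layer1.0.conv1.weight": [0.2],
    "neck.fc0.weight": [0.3],        # projection head: dropped
    "head.predictor.weight": [0.4],  # predictor head: dropped
}
print(sorted(extract_backbone_weights(checkpoint)))
# ['conv1.weight', 'layer1.0.conv1.weight']
```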
This repo achieves 73.8% top-1 accuracy with 200 epochs of SSL training and 75.9% top-1 accuracy with 700-900 epochs of SSL training on ImageNet.
Method | Top-1 Accuracy (%) | Epochs |
---|---|---|
supervised | 76.3 | 100 |
supervised | 78.4 | 270 |
supervised + linear eval | 74.1 | 100 |
Random | 4.4 | 0 |
Relative-Loc | 38.8 | 200 |
Rotation-Pred | 47.0 | 200 |
DeepCluster | 46.9 | 200 |
NPID | 56.6 | 200 |
ODC | 53.4 | 200 |
SimCLR | 60.6 | 200 |
SimCLR | 69.3 | 1000 |
MoCo | 61.9 | 200 |
MoCo v2 | 67.0 | 200 |
MoCo v2 | 71.1 | 800 |
SwAV (single crop) | 69.1 | 200 |
SwAV (multi crop) | 72.7 | 200 |
BYOL | 71.5 | 200 |
BYOL | 72.5 | 300 |
BYOL | 74.3 | 1000 |
SimSiam | 68.1 | 100 |
SimSiam | 70.0 | 200 |
SimSiam | 70.8 | 400 |
SimSiam | 71.3 | 800 |
Triplet | 73.8 | 200 |
Triplet | 75.9 | 700-900 |
For object detection and instance segmentation on COCO 2017, go to the `triplet/benchmarks/detection/` folder and run the relevant scripts.

Note: for the directory structure of the COCO 2017 dataset and the installation of the environment, please refer to the official Detectron2 documentation.
An example training script on COCO 2017:

```shell
cd benchmarks/detection/
python convert-pretrain-to-detectron2.py xxxxxx/release_ep800.pth xxxxxx/release_detection_ep800.pkl
bash run.sh configs/coco_R_50_C4_2x_moco.yaml xxxxxx/release_detection_ep800.pkl
```
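The conversion step rewrites the PyTorch checkpoint into the pickled dict layout that Detectron2 loads. A hedged sketch of what such a converter typically does (the exact key renames in this repo's `convert-pretrain-to-detectron2.py` may differ): map torchvision-style ResNet stage names to Detectron2's `res2`-`res5` scheme and wrap the weights in Detectron2's checkpoint dict:

```python
import pickle

RENAMES = [("layer1.", "res2."), ("layer2.", "res3."),
           ("layer3.", "res4."), ("layer4.", "res5.")]

def to_detectron2(state_dict):
    """Rename ResNet keys and wrap them in a Detectron2-style checkpoint dict."""
    renamed = {}
    for key, value in state_dict.items():
        for old, new in RENAMES:
            key = key.replace(old, new)
        renamed[key] = value
    # "matching_heuristics" asks Detectron2 to fuzzy-match any remaining names.
    return {"model": renamed, "matching_heuristics": True}

weights = to_detectron2({"conv1.weight": [0.1], "layer4.2.conv3.weight": [0.2]})
print(sorted(weights["model"]))  # ['conv1.weight', 'res5.2.conv3.weight']

# The result would then be pickled, e.g.:
# with open("release_detection_ep800.pkl", "wb") as f:
#     pickle.dump(weights, f)
```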
This repo achieves 41.7% AP(box) and 36.2% AP(mask) on COCO 2017.
Method | AP(box) | AP(mask) |
---|---|---|
supervised | 40.0 | 34.7 |
Random | 35.6 | 31.4 |
Relative-Loc | 40.0 | 35.0 |
Rotation-Pred | 40.0 | 34.9 |
NPID | 39.4 | 34.5 |
SimCLR | 39.6 | 34.6 |
MoCo | 40.9 | 35.5 |
MoCo v2 | 40.9 | 35.5 |
BYOL | 40.3 | 35.1 |
Triplet | 41.7 | 36.2 |
For PASCAL VOC07+12 object detection, go to the `triplet/benchmarks/detection/` folder and run the relevant scripts.

Note: for the directory structure of the VOC07+12 dataset and the installation of the environment, please refer to the official Detectron2 documentation.

Note also that because the VOC dataset is much smaller than COCO 2017, results on it vary more between runs; run the experiment several times on VOC and report the average of the results.
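Averaging over runs can be as simple as the following (the AP50 values are hypothetical, just to illustrate the reporting):

```python
from statistics import mean, stdev

# Hypothetical AP50 values from five independent VOC07+12 fine-tuning runs.
ap50_runs = [82.4, 82.7, 82.6, 82.5, 82.8]

print(f"AP50: {mean(ap50_runs):.1f} +/- {stdev(ap50_runs):.2f} "
      f"over {len(ap50_runs)} runs")
# AP50: 82.6 +/- 0.16 over 5 runs
```

Reporting the spread alongside the mean makes the run-to-run variance on the small VOC benchmark visible.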
An example training script on PASCAL VOC07+12:

```shell
cd benchmarks/detection/
python convert-pretrain-to-detectron2.py xxxxxx/release_ep800.pth xxxxxx/release_detection_ep800.pkl
bash run.sh configs/pascal_voc_R_50_C4_24k_moco.yaml xxxxxx/release_detection_ep800.pkl
```
This repo achieves 82.6% AP50(box), 56.9% AP(box), and 63.8% AP75(box) on VOC07+12.
Method | AP50 | AP | AP75 |
---|---|---|---|
supervised | 81.6 | 54.2 | 59.8 |
Random | 59.0 | 32.8 | 31.6 |
Relative-Loc | 80.4 | 55.1 | 61.2 |
Rotation-Pred | 80.9 | 55.5 | 61.4 |
NPID | 80.0 | 54.1 | 59.5 |
SimCLR | 79.4 | 51.5 | 55.6 |
MoCo | 81.4 | 56.0 | 62.2 |
MoCo v2 | 82.0 | 56.6 | 62.9 |
BYOL | 81.0 | 51.9 | 56.5 |
Triplet | 82.6 | 56.9 | 63.8 |
We next verify the effectiveness of our method on a larger dataset, SYSU-30k, which is about 30 times larger than ImageNet in both the number of categories and the number of images.

SYSU-30k is currently available from both a Google Drive collection and a Baidu Pan collection (code: 1qzv).

According to our latest runs with the following scripts, the accuracy is higher than the results reported in our paper.
```shell
CUDA_VISIBLE_DEVICES=3,5,6,7 bash tools/dist_train.sh configs/selfsup/triplet/r50_bs4096_accumulate4_ep10_fp16_triplet_gpu3090_sysu30k.py 4 --pretrained /scratch/local/ssd/guangrun/tmp/release_ep940.pth
python tools/extract_backbone_weights.py work_dirs/selfsup/triplet/r50_bs4096_accumulate4_ep10_fp16_triplet_gpu3090_sysu30k/epoch_10.pth work_dirs/selfsup/triplet/extract/sysu_ep10.pth
cd Self-Supervised-ReID
python test_sysu_combine.py --gpu_ids 0 --name debug --test_dir /scratch/local/ssd/guangrun/sysu_test_resize --which_epoch 10 --batchsize 100
```
This repo has been tested in the following environment:

- PyTorch 1.9

More precisely, this repo is a modification of OpenSelfSup; installation and data preparation follow that repo. Please acknowledge the great work of the OpenSelfSup team.

For object detection and instance segmentation tasks, this repo follows OpenSelfSup and uses Detectron2. Thanks for their outstanding contributions.