Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector (CVPR'2020)

Abstract

Conventional methods for object detection typically require a substantial amount of training data and preparing such high-quality training data is very labor-intensive. In this paper, we propose a novel few-shot object detection network that aims at detecting objects of unseen categories with only a few annotated examples. Central to our method are our Attention-RPN, Multi-Relation Detector and Contrastive Training strategy, which exploit the similarity between the few shot support set and query set to detect novel objects while suppressing false detection in the background. To train our network, we contribute a new dataset that contains 1000 categories of various objects with high-quality annotations. To the best of our knowledge, this is one of the first datasets specifically designed for few-shot object detection. Once our few-shot network is trained, it can detect objects of unseen categories without further training or finetuning. Our method is general and has a wide range of potential applications. We produce a new state-of-the-art performance on different datasets in the few-shot setting. The dataset link is https://github.com/fanq15/Few-Shot-Object-Detection-Dataset.

Citation

@inproceedings{fan2020fsod,
    title={Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector},
    author={Fan, Qi and Zhuo, Wei and Tang, Chi-Keung and Tai, Yu-Wing},
    booktitle={CVPR},
    year={2020}
}

Note: All reported results use the data split released by the official TFA repo. Currently, each setting is evaluated with only one fixed few-shot dataset. Please refer to Data Preparation for more details about the dataset and data preparation.

How to reproduce Attention RPN

Following the original implementation, reproduction consists of two steps:

Step 1: Base training
- Use all the images and annotations of the base classes to train a base model.
Step 2: Few-shot fine-tuning
- Use the base model from step 1 as the model initialization and further fine-tune the model with the few-shot datasets.

An example of the VOC split1 1-shot setting with 8 GPUs

# step1: base training for voc split1
bash ./tools/detection/dist_train.sh \
    configs/detection/attention_rpn/voc/split1/attention-rpn_r50_c4_voc-split1_base-training.py 8

# step2: few shot fine-tuning
bash ./tools/detection/dist_train.sh \
    configs/detection/attention_rpn/voc/split1/attention-rpn_r50_c4_voc-split1_1shot-fine-tuning.py 8

Note:

The default output path of the base model in step 1 is work_dirs/{BASE TRAINING CONFIG}/latest.pth. If the model is saved to a different path, update the load_from argument in the step 2 few-shot fine-tuning config instead of using resume_from.
To use a pre-trained checkpoint, set load_from to the path of the downloaded checkpoint.
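
The default checkpoint path described above can be derived from the base training config path. The following is a minimal sketch, assuming that {BASE TRAINING CONFIG} stands for the config file name without its .py extension; the helper name default_base_checkpoint is hypothetical and not part of the repo.

```python
from pathlib import PurePosixPath


def default_base_checkpoint(base_config: str, work_root: str = "work_dirs") -> str:
    """Return work_dirs/{BASE TRAINING CONFIG}/latest.pth for a config path.

    Assumption: {BASE TRAINING CONFIG} is the config file name without `.py`.
    """
    config_name = PurePosixPath(base_config).stem
    return str(PurePosixPath(work_root) / config_name / "latest.pth")


base_config = (
    "configs/detection/attention_rpn/voc/split1/"
    "attention-rpn_r50_c4_voc-split1_base-training.py"
)
checkpoint = default_base_checkpoint(base_config)

# Line to place in the step 2 fine-tuning config when the checkpoint
# location needs to be stated explicitly (load_from, not resume_from).
load_from_line = f"load_from = '{checkpoint}'"
print(load_from_line)
```

If the base model was saved elsewhere, or a released checkpoint was downloaded, the same load_from line applies with that path substituted.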

Results on VOC dataset

Note:

The paper does not report experiments on the VOC dataset. Therefore, we use the VOC setting of TFA to evaluate the method.
Some implementation details to note:
- The training batch size is 8x2 for all the VOC experiments and 4x2 for all the COCO experiments (following the official repo).
- Only the RoI head is trained during few-shot fine-tuning for the VOC experiments.
- The number of iterations and the training strategy for the VOC experiments may not be optimal.
The performance of base training and few-shot fine-tuning can be unstable, even with the same random seed. To reproduce the reported few-shot results, it is highly recommended to use the released model for few-shot fine-tuning.
Difficult samples are not used in base training or few-shot fine-tuning.
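
To illustrate what excluding difficult samples means, here is a minimal sketch that drops objects flagged as difficult from a Pascal VOC style XML annotation. It is not the repo's data pipeline; the function name load_non_difficult_objects and the inline annotation are made up for illustration, and treating a missing difficult tag as 0 is an assumption.

```python
import xml.etree.ElementTree as ET

# Tiny inline annotation in Pascal VOC XML layout (made-up example data).
EXAMPLE_XML = """
<annotation>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def load_non_difficult_objects(xml_text: str) -> list:
    """Parse one VOC annotation and keep only objects with difficult == 0."""
    root = ET.fromstring(xml_text.strip())
    kept = []
    for obj in root.findall("object"):
        # Assumption: a missing <difficult> tag means "not difficult".
        difficult = int(obj.findtext("difficult", default="0"))
        if difficult:
            continue
        box = obj.find("bndbox")
        bbox = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        kept.append({"name": obj.findtext("name"), "bbox": bbox})
    return kept


objects = load_non_difficult_objects(EXAMPLE_XML)
print(objects)
```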

Base Training

Arch	Split	Base AP50	ckpt	log
r50 c4	1	71.9	ckpt	log
r50 c4	2	73.5	ckpt	log
r50 c4	3	73.4	ckpt	log

Few Shot Finetuning

Arch	Split	Shot	Novel AP50	ckpt	log
r50 c4	1	1	35.0	ckpt	log
r50 c4	1	2	36.0	ckpt	log
r50 c4	1	3	39.1	ckpt	log
r50 c4	1	5	51.7	ckpt	log
r50 c4	1	10	55.7	ckpt	log
r50 c4	2	1	20.8	ckpt	log
r50 c4	2	2	23.4	ckpt	log
r50 c4	2	3	35.9	ckpt	log
r50 c4	2	5	37.0	ckpt	log
r50 c4	2	10	43.3	ckpt	log
r50 c4	3	1	31.9	ckpt	log
r50 c4	3	2	30.8	ckpt	log
r50 c4	3	3	38.2	ckpt	log
r50 c4	3	5	48.9	ckpt	log
r50 c4	3	10	51.6	ckpt	log

Results on COCO dataset

Note:

Following the original implementation, the training batch size is 4x2 for all the COCO experiments.
The official implementation uses a different COCO data split from TFA, and we report the results of both settings. To reproduce the results with the official data split (coco 17), please refer to Data Preparation for more details.
The performance of base training and few-shot fine-tuning can be unstable, even with the same random seed. To reproduce the reported few-shot results, it is highly recommended to use the released model for few-shot fine-tuning.
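
As a quick check of the batch size notation, the sketch below reads a setting such as 4x2 as number of GPUs times samples per GPU and computes the total batch size per iteration. That reading is an assumption based on the common OpenMMLab convention, and the helper name effective_batch_size is hypothetical.

```python
def effective_batch_size(setting: str) -> int:
    """Parse a '<num_gpus>x<samples_per_gpu>' string and return the product."""
    num_gpus, samples_per_gpu = (int(part) for part in setting.lower().split("x"))
    return num_gpus * samples_per_gpu


# Settings quoted in this README: 8x2 for VOC and 4x2 for COCO.
voc_batch = effective_batch_size("8x2")
coco_batch = effective_batch_size("4x2")
print(voc_batch, coco_batch)
```

Under this reading, training with a different number of GPUs changes the total batch size unless the samples per GPU are adjusted to compensate.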

Base Training

Arch	data source	Base mAP	ckpt	log
r50 c4	TFA	23.6	ckpt	log
r50 c4	official repo	24.0	ckpt	log

Few Shot Finetuning

Arch	data source	Shot	Novel mAP	ckpt	log
r50 c4	TFA	10	9.2	ckpt	log
r50 c4	TFA	30	14.8	ckpt	log
r50 c4	official repo	10	11.6	ckpt	log
