Open World Object Detection (OWOD) is a novel and challenging computer vision task that bridges the gap between classic object detection (OD) benchmarks and real-world object detection. In addition to detecting and classifying seen/known objects, OWOD algorithms are expected to localize all potential unseen/unknown objects and learn them incrementally. Large pre-trained vision-language grounding models (VLMs, e.g., GLIP) have rich knowledge about the open world, but they are limited by text prompts and cannot localize indescribable objects. However, there are many detection scenarios in which pre-defined language descriptions are unavailable during inference. In this paper, we attempt to specialize the VLM for the OWOD task by distilling its open-world knowledge into a language-agnostic detector. Surprisingly, we observe that combining a simple knowledge distillation approach with the automatic pseudo-labeling mechanism in OWOD achieves better unknown object detection, even with a small amount of data. Unfortunately, knowledge distillation for unknown objects severely interferes with the learning of known objects in detectors with conventional structures, leading to catastrophic forgetting. To alleviate this problem, we propose a down-weight loss function for knowledge distillation from the vision-language modality to the single-vision modality. We also propose a cascade decoupled decoding structure that separates the learning of localization and recognition, reducing the impact of category interactions between known and unknown objects on localization learning. Ablation experiments demonstrate that both are effective in mitigating the impact of open-world knowledge distillation on the learning of known objects. Additionally, to alleviate the current lack of comprehensive benchmarks for evaluating an open-world detector's ability to detect unknown objects in the open world, we propose two benchmarks, named StandardSet and IntensiveSet according to the complexity of their testing scenarios. Comprehensive experiments on OWOD, MS-COCO, and our proposed benchmarks demonstrate the effectiveness of our methods.
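For intuition only, the sketch below illustrates one way a down-weighted distillation objective could be written in PyTorch: the supervised loss on annotated known objects is kept at full strength, while the localization loss on VLM-generated pseudo-boxes for unknowns is scaled down. The function name, the `alpha_unknown` weight, and the simple cross-entropy/L1 choices are illustrative assumptions, not the exact objective used by SKDF.

```python
import torch
import torch.nn.functional as F

def down_weight_distill_loss(pred_logits, pred_boxes,
                             known_targets, distill_targets,
                             alpha_unknown=0.1):
    """Illustrative sketch (not the paper's exact formulation):
    full-strength supervision on human-annotated known objects plus a
    down-weighted localization loss on pseudo-boxes produced by a
    vision-language model (e.g., GLIP) for potential unknown objects."""
    # Standard classification + box regression on known objects.
    loss_known = (
        F.cross_entropy(pred_logits[known_targets["indices"]],
                        known_targets["labels"])
        + F.l1_loss(pred_boxes[known_targets["indices"]],
                    known_targets["boxes"])
    )
    # Distillation on VLM pseudo-boxes: only localization is supervised,
    # and its contribution is scaled down so it does not overwhelm the
    # learning of known categories.
    loss_distill = F.l1_loss(pred_boxes[distill_targets["indices"]],
                             distill_targets["boxes"])
    return loss_known + alpha_unknown * loss_distill
```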
We have trained and tested our models on Ubuntu 16.04, CUDA 10.2, GCC 5.4, and Python 3.7.
```bash
conda create -n SKDF python=3.7 pip
conda activate SKDF
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
```
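Optionally, a quick sanity check (not part of the official setup steps) confirms that the installed PyTorch build matches the expected version and can see the GPU:

```python
import torch

# Expect 1.8.0 and True on a machine with a working CUDA 10.2 setup.
print(torch.__version__)
print(torch.cuda.is_available())
```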
Download the self-supervised backbone from here and add it to the `models` folder.
```bash
cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py
```
| Method | Task1 U-Recall | Task1 mAP | Task2 U-Recall | Task2 mAP | Task3 U-Recall | Task3 mAP | Task4 mAP |
|---|---|---|---|---|---|---|---|
| ORE-EBUI | 4.9 | 56.0 | 2.9 | 39.4 | 3.9 | 29.7 | 25.3 |
| OW-DETR | 7.5 | 59.2 | 6.2 | 42.9 | 5.7 | 30.8 | 27.8 |
| SKDF | 39.0 | 56.8 | 36.7 | 40.3 | 36.1 | 30.1 | 26.9 |

| Method | Task1 U-Recall | Task1 mAP | Task2 U-Recall | Task2 mAP | Task3 U-Recall | Task3 mAP | Task4 mAP |
|---|---|---|---|---|---|---|---|
| ORE-EBUI | 1.5 | 61.4 | 3.9 | 40.6 | 3.6 | 33.7 | 31.8 |
| OW-DETR | 5.7 | 71.5 | 6.2 | 43.8 | 6.9 | 38.5 | 33.1 |
| SKDF | 60.9 | 69.4 | 60.0 | 44.4 | 58.6 | 40.1 | 39.7 |
This repository is released under the Apache 2.0 license as found in the LICENSE file.
Acknowledgments:
SKDF builds on the code bases of previous works such as OW-DETR, Deformable DETR, and OWOD. If you found SKDF useful, please consider citing these works as well.