Hybrid Proposal Refiner

This is the official implementation of the paper "Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective".

Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective

Jinjing Zhao*, Fangyun Wei*, Chang Xu

The University of Sydney

TODO

Update the checkpoints

Introduction

With the transformative impact of the Transformer, DETR pioneered the application of the encoder-decoder architecture to object detection. A collection of follow-up research, e.g., Deformable DETR, aims to enhance DETR while adhering to the encoder-decoder design. In this work, we revisit the DETR series through the lens of Faster R-CNN. We find that DETR resonates with the underlying principles of Faster R-CNN's RPN-refiner design but benefits from end-to-end detection owing to the incorporation of Hungarian matching. We systematically adapt Faster R-CNN towards Deformable DETR by integrating or repurposing each component of Deformable DETR, and note that Deformable DETR's improved performance over Faster R-CNN is attributed to the adoption of advanced modules such as a superior proposal refiner (e.g., deformable attention rather than RoI Align). Viewing DETR through the RPN-refiner paradigm, we delve into various proposal refinement techniques such as deformable attention, cross attention, and dynamic convolution. These proposal refiners cooperate well with each other; thus, we synergistically combine them to establish a Hybrid Proposal Refiner (HPR). Our HPR is versatile and can be incorporated into various DETR detectors. For instance, by integrating HPR into a strong DETR detector, we achieve an AP of 54.9 on the COCO benchmark, utilizing a ResNet-50 backbone and a 36-epoch training schedule.

Main Results

Results on COCO with ResNet-50

Base Model	Epoch	w/LSJ	AP	Configs	Checkpoints
Deformable DETR	12		50.6	config	OneDrive | quark
Deformable DETR	24		51.9	config	OneDrive | quark
DINO	12		51.1	config	OneDrive | quark
DINO	24		51.9	config	OneDrive | quark
Align DETR	12		52.1	config	-
Align DETR	24		52.7	config	-
Align DETR	12	√	52.7*	config	OneDrive | quark
Align DETR	24	√	54.6*	config	OneDrive | quark
Align DETR	36	√	55.2*	config	OneDrive | quark
DDQ	12		52.6*	config	OneDrive | quark
DDQ	24		53.3*	config	OneDrive | quark
DDQ	12	√	53.0	config	OneDrive | quark
DDQ	24	√	54.8*	config	OneDrive | quark
DDQ	36	√	55.1*	config	OneDrive | quark

Results on COCO with Swin-Large

Base Model	Epoch	w/LSJ	AP	Configs	Checkpoints
DDQ	12		58.7	config	OneDrive | quark
DDQ	12	√	58.8*	config	OneDrive | quark
DDQ	24	√	59.7*	config	OneDrive | quark
Align DETR	12		58.6	config	OneDrive | quark
Align DETR	24		59.3	config	OneDrive | quark
Align DETR	12	√	58.8	config	OneDrive | quark
Align DETR	24	√	59.6	config	OneDrive | quark
Align DETR	36	√	60.0	config	OneDrive | quark

* We retrained this configuration; the result is slightly higher than the one reported in the paper.

Installation

We tested our models with python=3.10.10, pytorch=1.12.0, cuda=11.6. Other versions may work as well.

Install PyTorch and torchvision

Follow the instructions at https://pytorch.org/get-started/locally/.

# an example:
conda install -c pytorch pytorch torchvision

Install other needed packages

pip install -r requirements.txt
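
After installation, it can be useful to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repository: the function name `report_environment` is hypothetical, and the script only reports versions so you can compare them against the tested combination listed above. It does not enforce any requirement.

```python
import importlib
import importlib.util
import platform


def report_environment():
    """Collect the Python version and, if installed, PyTorch/torchvision versions."""
    info = {"python": platform.python_version()}
    for name in ("torch", "torchvision"):
        if importlib.util.find_spec(name) is None:
            info[name] = None  # package is not installed in this environment
            continue
        info[name] = importlib.import_module(name).__version__
    if info["torch"] is not None:
        import torch

        # torch.version.cuda is None for CPU-only builds of PyTorch.
        info["cuda"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in report_environment().items():
        print(f"{key}: {value}")
```

A value of `None` for `torch` or `torchvision` means the package could not be found and the installation step should be repeated.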

Data

Please download the COCO 2017 dataset and organize it as follows:

coco2017/
  ├── train2017/
  ├── val2017/
  └── annotations/
      ├── instances_train2017.json
      └── instances_val2017.json
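
To catch a misplaced folder before launching a long training run, the layout above can be checked with a short script. This is a sketch under assumptions: `missing_coco_entries` and `EXPECTED_ENTRIES` are hypothetical names introduced here, and the check only tests that the listed entries exist, not that the images or annotations are complete.

```python
import sys
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_ENTRIES = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # The dataset root defaults to ./coco2017 if no argument is given.
    dataset_root = sys.argv[1] if len(sys.argv) > 1 else "coco2017"
    missing = missing_coco_entries(dataset_root)
    if missing:
        print(f"Missing under {dataset_root}:")
        for entry in missing:
            print(f"  {entry}")
    else:
        print(f"{dataset_root} matches the expected layout.")
```

Run it with the dataset root as the only argument, e.g. `python check_coco.py /data/coco2017` (the script file name and path are placeholders).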

Run

Modify COCO path in config file

Before training or evaluation, you need to modify the dataset path in the following config files:

project/configs/_base_/datasets/data_re_aug_coco_detection.py
project/configs/_base_/datasets/lsj_data_re_aug_coco_detection.py

To train a model on a single node

To accelerate convergence, we initialize the ResNet-50 backbone with SoCo pre-trained weights (./backbone_pth/backbone.pth).

./dist_train.sh <Config Path> <GPU Number> <Work Dir>

To evaluate a model on a single node

./dist_test.sh <Config Path> <Checkpoint Path> <GPU Number>
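
Note that the two scripts take their positional arguments in a different order: for training the GPU number comes second, while for evaluation it comes last. If you launch runs from Python, a thin wrapper can keep the order straight. This is only a sketch: the helper names are hypothetical, the paths in the example are placeholders rather than files shipped with the repository, and the argument order is taken solely from the two usage lines above.

```python
import subprocess


def train_command(config, num_gpus, work_dir):
    """Build: ./dist_train.sh <Config Path> <GPU Number> <Work Dir>"""
    return ["./dist_train.sh", str(config), str(num_gpus), str(work_dir)]


def eval_command(config, checkpoint, num_gpus):
    """Build: ./dist_test.sh <Config Path> <Checkpoint Path> <GPU Number>"""
    return ["./dist_test.sh", str(config), str(checkpoint), str(num_gpus)]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(command))
    if not dry_run:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with a real config, checkpoint, and work dir.
    run(train_command("path/to/config.py", 8, "path/to/work_dir"))
    run(eval_command("path/to/config.py", "path/to/checkpoint.pth", 8))
```

With the default `dry_run=True` the wrapper only prints the command lines, which makes it easy to inspect them before starting a job.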

Multi-node training

You can refer to Deformable-DETR to enable training on multiple nodes.

Citation

If you use HPR in your research or wish to refer to the baseline results published here, please use the following BibTeX entry.

@inproceedings{zhao2024hybrid,
  title={Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective},
  author={Zhao, Jinjing and Wei, Fangyun and Xu, Chang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={17416--17426},
  year={2024}
}
