Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Abstract

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features: using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks.
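
To make the "300 proposals per image" step more concrete, the sketch below ranks candidate boxes by objectness score and greedily drops near-duplicates until a cap is reached. It is a minimal NumPy illustration, not the code used in this repository: the function names, the `(x1, y1, x2, y2)` box layout, and the 0.7 IoU threshold are assumptions made for the example.

```python
import numpy as np


def box_iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def select_proposals(boxes, scores, max_num=300, iou_thr=0.7):
    """Greedy NMS: keep the highest-scoring boxes, at most max_num of them.

    Assumes every box has positive area. Returns indices into `boxes`.
    """
    order = np.argsort(-scores)  # highest objectness first
    keep = []
    while order.size > 0 and len(keep) < max_num:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop remaining boxes that overlap the kept box too strongly.
        ious = box_iou(boxes[best], boxes[rest])
        order = rest[ious <= iou_thr]
    return keep


boxes = np.array([[0, 0, 10, 10], [0, 0, 10, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = select_proposals(boxes, scores)
```

Here the second box overlaps the first almost entirely, so only the first and third boxes survive. A real detector would run this on thousands of anchors per image and typically on the GPU.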

Results and Models

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-C4 | caffe | 1x | - | - | 35.6 | config | model \| log |
| R-50-DC5 | caffe | 1x | - | - | 37.2 | config | model \| log |
| R-50-FPN | caffe | 1x | 3.8 | | 37.8 | config | model \| log |
| R-50-FPN | pytorch | 1x | 4.0 | 21.4 | 37.4 | config | model \| log |
| R-50-FPN (FP16) | pytorch | 1x | 3.4 | 28.8 | 37.5 | config | model \| log |
| R-50-FPN | pytorch | 2x | - | - | 38.4 | config | model \| log |
| R-101-FPN | caffe | 1x | 5.7 | | 39.8 | config | model \| log |
| R-101-FPN | pytorch | 1x | 6.0 | 15.6 | 39.4 | config | model \| log |
| R-101-FPN | pytorch | 2x | - | - | 39.8 | config | model \| log |
| X-101-32x4d-FPN | pytorch | 1x | 7.2 | 13.8 | 41.2 | config | model \| log |
| X-101-32x4d-FPN | pytorch | 2x | - | - | 41.2 | config | model \| log |
| X-101-64x4d-FPN | pytorch | 1x | 10.3 | 9.4 | 42.1 | config | model \| log |
| X-101-64x4d-FPN | pytorch | 2x | - | - | 41.6 | config | model \| log |

Different regression loss

All models below use an R-50-FPN backbone (pytorch style) trained with the 1x schedule.

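| Backbone | Loss type | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | L1Loss | 4.0 | 21.4 | 37.4 | config | model \| log |
| R-50-FPN | IoULoss | | | 37.9 | config | model \| log |
| R-50-FPN | GIoULoss | | | 37.6 | config | model \| log |
| R-50-FPN | BoundedIoULoss | | | 37.4 | config | model \| log |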
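
For readers unfamiliar with the overlap-based losses in the table above, here is a minimal pure-Python sketch of IoU and GIoU for a single pair of boxes. It uses the linear forms `1 - IoU` and `1 - GIoU`; library implementations may use a different form (for example a logarithmic one) and operate on batched tensors. The function names and the `(x1, y1, x2, y2)` box layout are assumptions for illustration.

```python
def iou_and_giou(pred, target):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(px2, tx2) - min(px1, tx1)) * (max(py2, ty2) - min(py1, ty1))
    giou = iou - (hull - union) / hull
    return iou, giou


def iou_loss(pred, target):
    """Linear IoU loss: 0 for a perfect match, 1 for disjoint boxes."""
    return 1.0 - iou_and_giou(pred, target)[0]


def giou_loss(pred, target):
    """Linear GIoU loss: ranges from 0 up to (but not reaching) 2."""
    return 1.0 - iou_and_giou(pred, target)[1]


pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
example_iou_loss = iou_loss(pred, target)
example_giou_loss = giou_loss(pred, target)
```

Unlike the plain IoU loss, the GIoU loss keeps changing as disjoint boxes move farther apart, because the enclosing-box term still depends on their distance.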

Pre-trained Models

We also train some models with longer schedules and multi-scale training. Users can fine-tune them for downstream tasks.

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-C4 | caffe | 1x | - | | 35.9 | config | model \| log |
| R-50-DC5 | caffe | 1x | - | | 37.4 | config | model \| log |
| R-50-DC5 | caffe | 3x | - | | 38.7 | config | model \| log |
| R-50-FPN | caffe | 2x | 3.7 | | 39.7 | config | model \| log |
| R-50-FPN | caffe | 3x | 3.7 | | 39.9 | config | model \| log |
| R-50-FPN | pytorch | 3x | 3.9 | | 40.3 | config | model \| log |
| R-101-FPN | caffe | 3x | 5.6 | | 42.0 | config | model \| log |
| R-101-FPN | pytorch | 3x | 5.8 | | 41.8 | config | model \| log |
| X-101-32x4d-FPN | pytorch | 3x | 7.0 | | 42.5 | config | model \| log |
| X-101-32x8d-FPN | pytorch | 3x | 10.1 | | 42.4 | config | model \| log |
| X-101-64x4d-FPN | pytorch | 3x | 10.0 | | 43.1 | config | model \| log |

We further fine-tune some pre-trained models on COCO subsets, which contain only a few of the 80 categories.

| Backbone | Style | Class name | Pre-trained model | Mem (GB) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | caffe | person | R-50-FPN-Caffe-3x | 3.7 | 55.8 | config | model \| log |
| R-50-FPN | caffe | person-bicycle-car | R-50-FPN-Caffe-3x | 3.7 | 44.1 | config | model \| log |
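
As a rough illustration of what a subset fine-tuning config might look like, the sketch below follows MMDetection 2.x-style config conventions, which is an assumption about the version in use. The base config path and the checkpoint path are placeholders, not real file names; substitute the config and model linked in the table.

```python
# Hypothetical config sketch for fine-tuning on a three-class COCO subset.
# Assumes MMDetection 2.x-style config keys; both paths are placeholders.
_base_ = './PLACEHOLDER_base_faster_rcnn_config.py'

# Categories kept from the 80 COCO classes.
classes = ('person', 'bicycle', 'car')

# The box head must predict exactly as many classes as the subset contains.
model = dict(roi_head=dict(bbox_head=dict(num_classes=len(classes))))

# Restrict every dataset split to the chosen categories.
data = dict(
    train=dict(classes=classes),
    val=dict(classes=classes),
    test=dict(classes=classes))

# Initialize from the multi-scale 3x pre-trained checkpoint (placeholder path).
load_from = 'PLACEHOLDER/path/to/r50_fpn_caffe_3x_checkpoint.pth'
```

Keeping `num_classes` derived from `classes` avoids a mismatch between the head size and the label set when the subset changes.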

Torchvision New Recipe (TNR)

Torchvision has released high-precision ResNet models; the training details can be found on the PyTorch website. We ran grid searches over learning rate and weight decay to find the best hyper-parameters for the detection task.

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-TNR | pytorch | 1x | - | | 40.2 | config | model \| log |

Citation

@article{Ren_2017,
   title={Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks},
   journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
   publisher={Institute of Electrical and Electronics Engineers (IEEE)},
   author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
   year={2017},
   month={Jun},
}