[New Models: adding PyTorch TorchVision's MaskRCNN_ResNet50_FPN_V2, FasterRCNN_ResNet50_FPN_V2 and RetinaNet_ResNet50_FPN_V2]  #9653

@medphisiker

Models description

Hello.

There is a new, interesting version of the Mask R-CNN model in TorchVision (link).
maskrcnn_resnet50_fpn_v2 is an improved Mask R-CNN model with a ResNet-50-FPN backbone from the Benchmarking Detection Transfer Learning with Vision Transformers paper.

maskrcnn_resnet50_fpn_v2 gives a noticeable improvement (link) on the MS COCO metrics compared with the classic maskrcnn_resnet50_fpn.

image

I have seen some fine-tuning examples. The code for fine-tuning maskrcnn_resnet50_fpn_v2 and maskrcnn_resnet50_fpn is identical.
The MMDetection framework supports fine-tuning of the classic TorchVision maskrcnn_resnet50_fpn. It would be great if MMDetection also supported the new TorchVision maskrcnn_resnet50_fpn_v2.

Describe the solution you'd like
It would be great if MMDetection also supported the new TorchVision maskrcnn_resnet50_fpn_v2. There are also updated versions of two other detectors: FasterRCNN_ResNet50_FPN_V2 and RetinaNet_ResNet50_FPN_V2.

image

P.S.
Currently, MMDetection already has many excellent detection networks. But it is important that Faster R-CNN and Mask R-CNN are multi-stage detectors, while most of the more accurate near-real-time detectors are single-stage.

In one competition, I used YOLOv7, which has a higher detection metric on MS COCO (53). But the winners used the classic multi-stage Faster R-CNN, which scores only 37. It turned out that on a dataset with crowded objects, Faster R-CNN works better than the single-stage YOLOv7, even though the difference in MS COCO metrics is large and in YOLOv7's favor.

Open source status

  • The model implementation is available.
  • The model weights are available.

Provide useful links for the implementation

The improved Mask R-CNN v2 model with a ResNet-50-FPN backbone is described in the Benchmarking Detection Transfer Learning with Vision Transformers paper.
It is implemented in PyTorch TorchVision (link).
The [MaskRCNN_ResNet50_FPN_V2_Weights.COCO_V1] weights are available in TorchVision too (https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.maskrcnn_resnet50_fpn_v2.html#torchvision.models.detection.MaskRCNN_ResNet50_FPN_V2_Weights).
Here is the link to the pull request (pytorch/vision#5773).
It seems that @datumbox is the author of the code.

The improved Faster R-CNN v2 model with a ResNet-50-FPN backbone is also from the Benchmarking Detection Transfer Learning with Vision Transformers paper.
It is implemented in PyTorch TorchVision (link).
The [FasterRCNN_ResNet50_FPN_V2_Weights.COCO_V1] weights are available in TorchVision too (https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn_v2.html#torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights).
Here is the link to the pull request (pytorch/vision#5763).
It seems that @datumbox is the author of the code.

There is no such information about RetinaNet_ResNet50_FPN_V2, but I think the TorchVision developers created it following the same principle.
It is implemented in PyTorch TorchVision (link).
The [RetinaNet_ResNet50_FPN_V2_Weights.COCO_V1] weights are available in TorchVision too (https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.retinanet_resnet50_fpn_v2.html#torchvision.models.detection.RetinaNet_ResNet50_FPN_V2_Weights).
Here is the link to the pull request (pytorch/vision#5756).
It seems that @datumbox is the author of the code.

As I understand it, @datumbox created FasterRCNN_ResNet50_FPN_V2, MaskRCNN_ResNet50_FPN_V2 and RetinaNet_ResNet50_FPN_V2 following the same principle.
Perhaps you could improve the other backbones available for these architectures in MMDetection in the same way =)
That would be just super )
