Enhancing Object Detection #340

Closed · oke-aditya opened this issue Nov 6, 2020 · 2 comments · Fixed by #475
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)
Milestone: v0.3

Comments

oke-aditya (Contributor) commented Nov 6, 2020

🚀 Feature

As discussed, the following can be added.

F1. Support for backbones with FPNs. torchvision's resnet_fpn_backbone can be used directly.
F2. Support for non-FPN backbones. I have a hacky way to do this which won't support the trainable_backbone_layers parameter (see the sketch below).
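
For reference, a minimal sketch of the hacky non-FPN path, following the pattern from the torchvision detection tutorial (mobilenet_v2 and its 1280 output channels are just an illustrative choice):

import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

# Take the convolutional features of a classifier as the backbone.
backbone = torchvision.models.mobilenet_v2(pretrained=True).features
# FasterRCNN needs to know the number of output channels of the backbone.
backbone.out_channels = 1280

# Without an FPN there is a single feature map, so one anchor generator
# and a single-level RoI pooler suffice.
anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                   aspect_ratios=((0.5, 1.0, 2.0),))
roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=['0'],
                                                output_size=7, sampling_ratio=2)
model = FasterRCNN(backbone, num_classes=91,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)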

Motivation

  1. Originally, Faster R-CNN did not use an FPN; it used a VGG backbone. So we could keep non-FPN support for legacy purposes.
  2. Currently the backbone is hardcoded to resnet50. With F1 it would be possible to use resnet101, resnext, etc.

Pitch

There are two ways to support this. I feel both are equally good; it's a matter of design choice.

Pitch 1 (a simpler interface for the user)

  1. Just add an extra backbone init parameter; we handle things internally:
from typing import Optional

import pytorch_lightning as pl
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor, fasterrcnn_resnet50_fpn


class lit_frcnn(pl.LightningModule):
    """
    Creates a Faster R-CNN model which can be fine-tuned.
    """

    def __init__(self, learning_rate: float = 0.0001, num_classes: int = 91,
                 pretrained: bool = False, backbone: Optional[str] = None, fpn: bool = True,
                 pretrained_backbone: bool = True, trainable_backbone_layers: int = 3,
                 replace_head: bool = True, **kwargs):
        """
        Args:
            learning_rate: the learning rate
            num_classes: number of detection classes (including background)
            pretrained: if True, returns a model pre-trained on COCO train2017
            backbone: backbone name; None uses the default resnet50 FPN model
            fpn: if True, builds the backbone with an FPN
            pretrained_backbone: if True, the backbone is pre-trained on ImageNet
            trainable_backbone_layers: number of trainable resnet layers starting from the final block
            replace_head: if True, replaces the box predictor head with one for num_classes
        """
        super().__init__()
        self.learning_rate = learning_rate
        self.num_classes = num_classes
        self.backbone = backbone
        if backbone is None:
            # Default path: torchvision's resnet50 + FPN detector.
            self.model = fasterrcnn_resnet50_fpn(pretrained=pretrained,
                                                 pretrained_backbone=pretrained_backbone,
                                                 trainable_backbone_layers=trainable_backbone_layers)
            if replace_head:
                # Swap the classification head so it predicts num_classes classes.
                in_features = self.model.roi_heads.box_predictor.cls_score.in_features
                self.model.roi_heads.box_predictor = FastRCNNPredictor(in_features, self.num_classes)
        else:
            backbone_model = create_fastercnn_backbone(self.backbone, fpn, pretrained_backbone,
                                                       trainable_backbone_layers, **kwargs)
            self.model = FasterRCNN(backbone_model, num_classes=num_classes, **kwargs)

Here we would define a create_fastercnn_backbone function. It handles both resnet_fpn_backbone and the very hacky non-FPN path. We could also support passing backbone as an nn.Module; that would involve one more if statement.
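
A rough sketch of what create_fastercnn_backbone could look like, assuming torchvision's resnet_fpn_backbone for the FPN path; the non-FPN branch below is the hacky part and only illustrates one backbone:

import torch.nn as nn
import torchvision
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone


def create_fastercnn_backbone(backbone: str, fpn: bool = True,
                              pretrained_backbone: bool = True,
                              trainable_backbone_layers: int = 3,
                              **kwargs) -> nn.Module:
    if fpn:
        # torchvision handles resnet-family backbones with FPN for us.
        backbone_model = resnet_fpn_backbone(backbone, pretrained=pretrained_backbone,
                                             trainable_layers=trainable_backbone_layers)
    else:
        # Hacky non-FPN path: truncate a classifier and record its output channels.
        # Only mobilenet_v2 is shown; trainable_backbone_layers is not supported here.
        if backbone == "mobilenet_v2":
            backbone_model = torchvision.models.mobilenet_v2(pretrained=pretrained_backbone).features
            backbone_model.out_channels = 1280
        else:
            raise ValueError(f"Unsupported non-FPN backbone: {backbone}")
    return backbone_model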

This is a kind of parameter overloading and might not be very wise, but it does the job with a simple API.

Pitch 2 (a little more Pythonic, but slightly more complex)

Use a staticmethod to create the backbone and allow passing the backbone as an nn.Module.
This involves two steps, but some people consider it more organized.
The API would be:

backbone = lit_frcnn.create_backbone("resnet50", fpn=True, pretrained_backbone=True, trainable_backbone_layers=3)
frcnn_model = lit_frcnn(learning_rate=0.0001, num_classes=91, backbone=backbone, ...)
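
A minimal sketch of that staticmethod, assuming it delegates to the same create_fastercnn_backbone helper from Pitch 1 (only where the call lives changes):

import torch.nn as nn
import pytorch_lightning as pl


class lit_frcnn(pl.LightningModule):
    # ... __init__ as above, also accepting backbone as an nn.Module ...

    @staticmethod
    def create_backbone(backbone: str, fpn: bool = True,
                        pretrained_backbone: bool = True,
                        trainable_backbone_layers: int = 3) -> nn.Module:
        # Same logic as Pitch 1's free function, but bound to the class,
        # so every new model (e.g. RetinaNet) would need to define it again.
        return create_fastercnn_backbone(backbone, fpn, pretrained_backbone,
                                         trainable_backbone_layers)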

This is a trickier API, and I'm not sure which convention Lightning follows.

1 is simpler for the user. We can reuse create_fastercnn_backbone, as it is a free function which can live outside the model file.
2 is slightly more involved and couples backbone creation with FRCNN, so for newer models we would need to define this static method again (e.g. for RetinaNet).

Alternatives

The other way is to start creating abstract classes and inheriting from them for such stuff. But again, this makes the API very complex and hard to maintain, and people would be bound to these abstractions over and over.

IMHO

  1. If we are decoupling all the models, 1 is far better than 2; we would not have to rewrite these backbones again.
  2. Alternative 2 is easy on the eyes, but the caveats show up later: it introduces OOP-style abstractions into bolts, which starts creating inconsistent interfaces.

Additional context

I am considering coupling this with #286. The non-FPN backbone code can be used in transfer learning as well (I have made it work).

cc @ananyahjha93 @Borda

oke-aditya added the enhancement and help wanted labels on Nov 6, 2020
oke-aditya mentioned this issue on Nov 6, 2020
briankosw (Contributor) commented

I think option 1 is better as well, and it seems better aligned with the design principles of PyTorch Lightning (it also makes a lot of sense w.r.t. #286, as you mentioned). I would love to get involved in this effort. What's the status on this so far?

oke-aditya (Contributor, Author) commented

Unsure if it will go into bolts, but this API is implemented in Quickvision!

Borda added this to the v0.3 milestone on Jan 18, 2021