The official implementation of Bucketed Ranking-based Losses. Our implementation is based on mmdetection.
Bucketed Ranking-based Losses for Efficient Training of Object Detectors,
Feyza Yavuz, Baris Can Cam, Adnan Harun Dogan, Kemal Oksuz, Emre Akbas, Sinan Kalkan, ECCV 2024. (arXiv pre-print)
What are Bucketed Ranking-based (BR) Losses? Bucketing enhances the efficiency of ranking-based losses in object detection by grouping negative predictions into buckets, significantly reducing the number of pairwise comparisons required during training. Bucketing preserves the alignment of ranking-based loss functions with the evaluation criteria and their robustness against class imbalance, while drastically improving their time complexity.
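To make the core idea concrete, here is a minimal PyTorch sketch (ours, not the repository's code) of the quantity ranking-based losses need for every positive: the number of negatives scored above it. Sorting the negatives once and binary-searching replaces the explicit P×N comparison matrix, which is what bucketing exploits; the actual BR Loss additionally groups consecutive negatives into buckets to compute gradients efficiently. All function names below are ours for illustration.

```python
import torch

def negatives_above_naive(pos_scores, neg_scores):
    # O(P*N): materializes every (positive, negative) comparison --
    # the bottleneck that bucketing removes.
    return (neg_scores[None, :] >= pos_scores[:, None]).sum(dim=1)

def negatives_above_bucketed(pos_scores, neg_scores):
    # O(N log N) to sort once, then one O(log N) binary search per
    # positive: all negatives scored >= s_i are counted in a single step.
    sorted_neg, _ = torch.sort(neg_scores)
    below = torch.searchsorted(sorted_neg, pos_scores)  # negatives with score < s_i
    return neg_scores.numel() - below

pos = torch.tensor([0.9, 0.4, 0.7])   # a few positive scores
neg = torch.rand(100_000)             # many negative scores
assert torch.equal(negatives_above_naive(pos, neg),
                   negatives_above_bucketed(pos, neg))
```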
BRS-DETR: Efficient and Robust Transformer-Based Object Detection with Bucketed Ranking-Based Losses. BRS-DETR integrates the Bucketed Rank-Sort (BRS) Loss into Co-DETR, delivering superior performance and training efficiency on the COCO benchmark. (i) BRS-DETR achieves a 0.8 AP improvement with a ResNet-50 backbone and consistent gains across other transformer-based backbones. (ii) BRS-DETR trains faster, cutting training time by 6× by optimizing the handling of positive examples and the loss calculation of the auxiliary heads.
Benefits of BR Loss on Efficiency and Simplification of Training. With BR Loss, we achieve significant improvements in training efficiency: (i) the bucketed approach reduces the time complexity to O(max(N log N, P²)), where N and P are the numbers of negative and positive predictions respectively, allowing faster training, (ii) BR Loss maintains the simplicity and robustness of ranking-based approaches without requiring complex sampling heuristics or additional auxiliary heads, and (iii) it enables efficient training of large-scale object detectors, including transformer-based models, with minimal tuning.
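As a back-of-the-envelope illustration of that bound (the numbers are ours, not from the paper), assume a dense detector with N = 100,000 mostly-negative anchors and P = 50 positives, and that the unbucketed loss compares each positive against each negative:

```python
import math

N, P = 100_000, 50                        # assumed anchor / positive counts

pairwise = P * N                          # unbucketed: every (pos, neg) pair
bucketed = max(N * math.log2(N), P * P)   # BR Loss bound: O(max(N log N, P^2))

print(f"pairwise comparisons: {pairwise:>12,.0f}")  # ~5,000,000
print(f"bucketed bound:       {bucketed:>12,.0f}")  # ~1,660,964 (~3x fewer)
```

The roughly 3× gap in this toy calculation is consistent in spirit with the 1.9×-5.3× iteration-time reductions reported in the tables below.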
Benefits of BR Loss on Improving Performance. Using BR Loss, we train seven diverse visual detectors and demonstrate consistent performance improvements: (i) BR Loss accelerates training by 2× on average while preserving the accuracy of the unbucketed versions, and (ii) for the first time, we successfully train transformer-based detectors such as Co-DETR with ranking-based losses, consistently outperforming their original configurations across multiple backbones.
Please cite our paper if you benefit from the paper or this repository:
@inproceedings{BRLoss,
  title     = {Bucketed Ranking-based Losses for Efficient Training of Object Detectors},
  author    = {Feyza Yavuz and Baris Can Cam and Adnan Harun Dogan and Kemal Oksuz and Emre Akbas and Sinan Kalkan},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2024}
}
- Please see get_started.md for requirements and installation of mmdetection.
- Please see introduction.md for dataset preparation and basic usage of mmdetection.
Please note that we implement our method on MMDetection V2.25.3 and MMCV V1.5.0. More specifically, we use python=3.7.11, pytorch=1.11.0 and cuda=11.3.
Here, we report validation-set results for the object detection and instance segmentation tasks. For object detection we report results on the COCO validation set; for instance segmentation we report results on both the Cityscapes and LVIS validation sets.
We refer to the RS Loss repository for models trained with RS Loss.
Backbone | Epoch | Detector | box AP | Log | Config | Model |
---|---|---|---|---|---|---|
ResNet-50 | 12 | Co-DETR | 49.3 | log | config | model |
ResNet-50 | 12 | BRS-DETR | 50.1 | log | config | model |
Swin-T | 12 | Co-DETR | 51.7 | log | config | model |
Swin-T | 12 | BRS-DETR | 52.3 | log | config | model |
Swin-L | 12 | Co-DETR | 56.9 | log | config | model |
Swin-L | 12 | BRS-DETR | 57.2 | log | config | model |
Backbone | Epoch | Loss Func. | Time (s/iter.) | box AP | Log | Config | Model |
---|---|---|---|---|---|---|---|
ResNet-50 | 12 | RS | 0.58 | 39.5 | log | config | model |
ResNet-50 | 12 | BRS | 0.19 (3.0x ↓) | 39.5 | log | config | model |
ResNet-101 | 36 | RS | 0.91 | 47.3 | log | config | model |
ResNet-101 | 36 | BRS | 0.47 (2.0x ↓) | 47.7 | log | config | model |
Backbone | Epoch | Loss Func. | Time (s/iter.) | box AP | Log | Config | Model |
---|---|---|---|---|---|---|---|
ResNet-50 | 12 | RS | 1.54 | 41.1 | log | config | model |
ResNet-50 | 12 | BRS | 0.29 (5.3x ↓) | 41.1 | log | config | model |
Backbone | Epoch | Loss Func. | Time (s/iter.) | box AP | Log | Config | Model |
---|---|---|---|---|---|---|---|
ResNet-50 | 12 | AP | 0.34 | 38.3 | log | config | model |
ResNet-50 | 12 | BAP | 0.18 (1.9x ↓) | 38.5 | log | config | model |
ResNet-50 | 12 | RS | 0.44 | 39.8 | log | config | model |
ResNet-50 | 12 | BRS | 0.19 (2.4x ↓) | 39.8 | log | config | model |
Backbone | Epoch | Loss Func. | Time (s/iter.) | box AP | Log | Config | Model |
---|---|---|---|---|---|---|---|
ResNet-50 | 12 | AP | TODO | 37.3 | log | config | model |
ResNet-50 | 12 | BAP | TODO (1.5x ↓) | 37.2 | log | config | model |
ResNet-50 | 12 | RS | TODO | 40.8 | log | config | model |
ResNet-50 | 12 | BRS | 0.36 (1.9x ↓) | 40.8 | log | config | model |
We use Mask R-CNN as the baseline model to experiment with our method in the instance segmentation task.
Backbone | Epoch | Loss Func. | Time (s/iter.) | mask AP | Log | Config | Model |
---|---|---|---|---|---|---|---|
ResNet-50 | 12 | RS | 0.68 | 36.3 | log | config | model |
ResNet-50 | 12 | BRS | 0.29 (2.3x ↓) | 36.2 | log | config | model |
ResNet-101 | 36 | RS | 0.71 | 40.2 | log | config | model |
ResNet-101 | 36 | BRS | 0.33 (2.2x ↓) | 40.3 | log | config | model |
Backbone | Epoch | Loss Func. | Time (s/iter.) | box AP | mask AP | Log | Config | Model |
---|---|---|---|---|---|---|---|---|
ResNet-50 | 12 | RS | 0.43 | 43.7 | 38.2 | log | config | model |
ResNet-50 | 12 | BRS | 0.19 (2.3x ↓) | 43.3 | 38.5 | log | config | model |
Backbone | Epoch | Loss Func. | Time (s/iter.) | mask AP | Log | Config | Model |
---|---|---|---|---|---|---|---|
ResNet-50 | 12 | RS | 0.87 | 25.6 | log | config | model |
ResNet-50 | 12 | BRS | 0.35 (2.5x ↓) | 25.8 | log | config | model |