Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen
[arXiv] [BibTeX] [Reference implementation]
Install Detectron2 following the installation instructions. To use Cityscapes, prepare the data following the tutorial.
To train a model with 8 GPUs, run:
```bash
cd /path/to/detectron2/projects/Panoptic-DeepLab
python train_net.py --config-file configs/Cityscapes-PanopticSegmentation/panoptic_deeplab_R_52_os16_mg124_poly_90k_bs32_crop_512_1024_dsconv.yaml --num-gpus 8
```
Model evaluation can be done similarly:
```bash
cd /path/to/detectron2/projects/Panoptic-DeepLab
python train_net.py --config-file configs/Cityscapes-PanopticSegmentation/panoptic_deeplab_R_52_os16_mg124_poly_90k_bs32_crop_512_1024_dsconv.yaml --eval-only MODEL.WEIGHTS /path/to/model_checkpoint
```
If you want to benchmark the network speed without post-processing, you can run the evaluation script with `MODEL.PANOPTIC_DEEPLAB.BENCHMARK_NETWORK_SPEED True`:
```bash
cd /path/to/detectron2/projects/Panoptic-DeepLab
python train_net.py --config-file configs/Cityscapes-PanopticSegmentation/panoptic_deeplab_R_52_os16_mg124_poly_90k_bs32_crop_512_1024_dsconv.yaml --eval-only MODEL.WEIGHTS /path/to/model_checkpoint MODEL.PANOPTIC_DEEPLAB.BENCHMARK_NETWORK_SPEED True
```
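If you want to run single-image inference outside of `train_net.py`, a minimal sketch along the lines below should work. It assumes the project is importable as `detectron2.projects.panoptic_deeplab` (and `detectron2.projects.deeplab` for the base DeepLab config keys); the config and checkpoint paths are the same placeholders used above:
```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects.deeplab import add_deeplab_config
from detectron2.projects.panoptic_deeplab import add_panoptic_deeplab_config

cfg = get_cfg()
add_deeplab_config(cfg)           # register the base DeepLab config keys
add_panoptic_deeplab_config(cfg)  # register the Panoptic-DeepLab config keys
cfg.merge_from_file(
    "configs/Cityscapes-PanopticSegmentation/"
    "panoptic_deeplab_R_52_os16_mg124_poly_90k_bs32_crop_512_1024_dsconv.yaml"
)
cfg.MODEL.WEIGHTS = "/path/to/model_checkpoint"  # placeholder, as above

predictor = DefaultPredictor(cfg)
image = cv2.imread("input.png")  # BGR image, as DefaultPredictor expects
outputs = predictor(image)
# Panoptic models in detectron2 return a (panoptic_seg, segments_info) pair.
panoptic_seg, segments_info = outputs["panoptic_seg"]
```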
Cityscapes models are trained with ImageNet pretraining.
Method | Backbone | Output resolution | PQ | SQ | RQ | mIoU | AP | Memory (M) | model id | download
---|---|---|---|---|---|---|---|---|---|---
Panoptic-DeepLab | R50-DC5 | 1024×2048 | 58.6 | 80.9 | 71.2 | 75.9 | 29.8 | 8668 | - | model \| metrics
Panoptic-DeepLab | R52-DC5 | 1024×2048 | 60.3 | 81.5 | 72.9 | 78.2 | 33.2 | 9682 | 30841561 | model \| metrics
Panoptic-DeepLab (DSConv) | R52-DC5 | 1024×2048 | 60.3 | 81.0 | 73.2 | 78.7 | 32.1 | 10466 | 33148034 | model \| metrics
Note:
- R52: a ResNet-50 whose first 7x7 convolution is replaced by three 3x3 convolutions. This modification has been used in most semantic segmentation papers. We pre-train this backbone on ImageNet using the default recipe of PyTorch examples.
- DC5 means using dilated convolution in `res5`.
- We use a smaller training crop size (512x1024) than the original paper (1025x2049); we find that using a larger crop size (1024x2048) further improves PQ by 1.5% but also degrades AP by 3%.
- The implementation with regular Conv2d in ASPP and head is much heavier than the original paper.
- This implementation does not include the optimized post-processing code needed for deployment. Post-processing the network outputs currently takes a similar amount of time to the network itself. Please refer to the speed reported in the original paper for comparison.
- DSConv refers to using DepthwiseSeparableConv2d in ASPP and decoder (see the sketch after this list). The implementation with DSConv is identical to the original paper.
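For reference, a depthwise separable convolution factors a dense 3x3 convolution into a per-channel (depthwise) 3x3 convolution followed by a 1x1 pointwise convolution, which is what makes the DSConv variant lighter. Below is a minimal plain-PyTorch sketch of the idea; the class name is illustrative, and detectron2's actual layer is `DepthwiseSeparableConv2d` in `detectron2.layers`:
```python
import torch
import torch.nn as nn

class DepthwiseSeparableConvSketch(nn.Module):
    """A 3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_channels, out_channels, dilation=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size=3,
            padding=dilation, dilation=dilation,
            groups=in_channels, bias=False)
        # Pointwise: a 1x1 convolution that mixes channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 256, 64, 128)
y = DepthwiseSeparableConvSketch(256, 256)(x)  # same spatial size, far fewer FLOPs
```
Compared with a dense 3x3 convolution over the same channels, this cuts parameters and FLOPs roughly by the kernel area, which is why the DSConv variant is lighter than the plain-Conv2d heads above.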
COCO models are trained with ImageNet pretraining on 16 V100s.
Method | Backbone | Output resolution | PQ | SQ | RQ | Box AP | Mask AP | model id | download
---|---|---|---|---|---|---|---|---|---
Panoptic-DeepLab (DSConv) | R52-DC5 | 640×640 | 35.5 | 77.3 | 44.7 | 18.6 | 19.7 | 246448865 | model \| metrics
Note:
- R52: a ResNet-50 whose first 7x7 convolution is replaced by three 3x3 convolutions (see the stem sketch after this list). This modification has been used in most semantic segmentation papers. We pre-train this backbone on ImageNet using the default recipe of PyTorch examples.
- DC5 means using dilated convolution in `res5`.
- This reproduced number matches the original paper (35.5 vs. 35.1 PQ).
- This implementation does not include the optimized post-processing code needed for deployment. Post-processing the network outputs currently takes more time than the network itself. Please refer to the speed reported in the original paper for comparison.
- DSConv refers to using DepthwiseSeparableConv2d in ASPP and decoder.
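For reference, the R52 stem replaces ResNet-50's single 7x7 stride-2 stem convolution with a stack of three 3x3 convolutions (the "deep stem" common in semantic segmentation backbones). Below is a minimal PyTorch sketch of the idea; the channel widths (64, 64, 128) follow the usual deep-stem pattern and are illustrative rather than copied from this project's code:
```python
import torch.nn as nn

def conv_bn_relu(cin, cout, stride=1):
    # 3x3 conv -> BatchNorm -> ReLU, the building block of the deep stem.
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

# Replaces the standard ResNet stem: nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3).
deep_stem = nn.Sequential(
    conv_bn_relu(3, 64, stride=2),
    conv_bn_relu(64, 64),
    conv_bn_relu(64, 128),
)
```
The three stacked 3x3 convolutions cover the same receptive field as the single 7x7 while adding nonlinearity between layers at a similar cost.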
If you use Panoptic-DeepLab, please use the following BibTeX entries.
- CVPR 2020 paper:
```BibTeX
@inproceedings{cheng2020panoptic,
  title={Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation},
  author={Cheng, Bowen and Collins, Maxwell D and Zhu, Yukun and Liu, Ting and Huang, Thomas S and Adam, Hartwig and Chen, Liang-Chieh},
  booktitle={CVPR},
  year={2020}
}
```
- ICCV 2019 COCO-Mapillary workshop challenge report:
```BibTeX
@inproceedings{cheng2019panoptic,
  title={Panoptic-DeepLab},
  author={Cheng, Bowen and Collins, Maxwell D and Zhu, Yukun and Liu, Ting and Huang, Thomas S and Adam, Hartwig and Chen, Liang-Chieh},
  booktitle={ICCV COCO + Mapillary Joint Recognition Challenge Workshop},
  year={2019}
}
```