This is the implementation of Learning Context-aware Classifier for Semantic Segmentation (AAAI 2023, Oral). This repo provides the implementation of the context-aware classifier (CAC) for 2D semantic segmentation.
CAC is also effective for 3D semantic segmentation, achieving performance competitive with recent SOTA methods (e.g., boosting SpUNet to 76% mIoU on ScanNet val). The 3D implementation is available in PointCept, a powerful and flexible codebase for point cloud perception research.
This repo is built upon MMSegmentation. Many thanks to the contributors.
@misc{mmseg2020,
title={{MMSegmentation}: OpenMMLab Semantic Segmentation Toolbox and Benchmark},
author={MMSegmentation Contributors},
howpublished = {\url{https://github.com/open-mmlab/mmsegmentation}},
year={2020}
}
- mmcv-full 1.5.0
- mmsegmentation 0.30.0
- timm 0.5.4
- numpy 1.21.0
- torch 1.9.1
- CUDA 11.1
For more details, please refer to the dependency requirements of MMSegmentation.
Data preparation, training, and testing strictly follow MMSegmentation; please refer to its documentation for more details.
For example, to train with config.py on 4 GPUs, run:
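A quick way to compare an existing environment against the versions listed above is sketched below. It is a minimal check and not part of this repo: `check_environment` and `base_version` are hypothetical helper names, and the sketch assumes the packages are registered under the distribution names shown in the list. CUDA is skipped because it is not a Python package.

```python
from importlib import metadata

# Pinned versions copied from the list above (CUDA omitted: not a Python package).
PINNED = {
    "mmcv-full": "1.5.0",
    "mmsegmentation": "0.30.0",
    "timm": "0.5.4",
    "numpy": "1.21.0",
    "torch": "1.9.1",
}


def base_version(version):
    """Drop a local build suffix such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def check_environment(pinned=PINNED, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = base_version(get_version(name))
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `check_environment()` with no arguments inspects the current interpreter; an empty dictionary means every pinned package matches. Nearby versions may also work, so a mismatch is a hint rather than a hard failure.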
./tool/dist_train.sh config.py 4
To test with config.py on 4 GPUs, loading the weights from model.pth, run:
./tool/dist_test.sh config.py model.pth 4 --eval mIoU
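For readers unfamiliar with the metric requested by `--eval mIoU`, the sketch below computes mean intersection-over-union for one pair of flattened label maps. It is a plain-Python illustration, not MMSegmentation's implementation (which, as far as we know, accumulates intersections and unions over the whole dataset before dividing). The function name `mean_iou` and the `ignore_index=255` default are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class labels of equal length.
    """
    intersection = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from all counts
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Two classes, one ignored pixel: each class has IoU 1/2.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 255], num_classes=2)
```

The table entries in this README are this quantity expressed as a percentage.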
We reproduce the results on ADE20K with this repo as examples of using the context-aware classifier. To reproduce the results on other benchmarks, please refer to the MMSegmentation configurations and change the decoder heads accordingly. One example of FCN is:
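As a rough illustration of what changing the decoder head means in an MMSegmentation-style config, the sketch below copies a trimmed FCN model dictionary and replaces the head type. The field values are illustrative and not copied from this repo's configs, `swap_decode_head` is a hypothetical helper, and `"HypotheticalCACFCNHead"` is a placeholder: substitute the head class name actually registered in this repo.

```python
import copy

# A trimmed FCN-style model config in MMSegmentation's dict format.
# Values are illustrative, not taken from this repo's config files.
baseline_model = dict(
    type="EncoderDecoder",
    backbone=dict(type="ResNetV1c", depth=50),
    decode_head=dict(
        type="FCNHead",
        in_channels=2048,
        channels=512,
        num_classes=150,  # ADE20K has 150 classes
    ),
)


def swap_decode_head(model_cfg, new_head_type, **overrides):
    """Return a copy of model_cfg whose decode head uses new_head_type."""
    cfg = copy.deepcopy(model_cfg)  # leave the baseline config untouched
    cfg["decode_head"]["type"] = new_head_type
    cfg["decode_head"].update(overrides)  # optional extra head arguments
    return cfg


# Placeholder name: replace with the CAC head class defined in this repo.
cac_model = swap_decode_head(baseline_model, "HypotheticalCACFCNHead")
```

In practice the same change is usually made by editing the `decode_head` entry of the config file directly; the helper only makes the substitution explicit.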
Except for UperNet with Swin-B and Swin-L, which are trained with 8 GPUs, the models are trained with 4 GPUs due to limited computational resources. For a fair comparison, the batch size used for training UperNet with Swin-T is therefore 4 instead of the default value of 2. This does not affect the baseline performance.
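The batch-size adjustment can be checked with a line of arithmetic. The sketch assumes that the quoted batch sizes are per-GPU values and that the default schedule uses 8 GPUs; both are assumptions about the MMSegmentation defaults rather than statements from this README, and `total_batch_size` is a hypothetical helper.

```python
def total_batch_size(num_gpus, samples_per_gpu):
    """Number of samples processed per optimizer step across all GPUs."""
    return num_gpus * samples_per_gpu


# Assumed default schedule: 8 GPUs with 2 samples each.
default_setup = total_batch_size(num_gpus=8, samples_per_gpu=2)

# Setup described above: 4 GPUs with 4 samples each.
four_gpu_setup = total_batch_size(num_gpus=4, samples_per_gpu=4)

# Under these assumptions both setups see the same total batch size.
same_total = default_setup == four_gpu_setup
```

Under these assumptions, doubling the per-GPU batch size compensates for halving the GPU count, which is why the comparison with the baseline stays fair.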
| Method | Backbone | mIoU (%) | Config | Download |
|---|---|---|---|---|
| FCN + CAC | MobileNet-V2 | 37.42 | config | model |
| DeepLabV3Plus + CAC | R-50-D8 | 46.29 | config | model |
| OCRNet + CAC | HRNetV2p-W18 | 44.53 | config | model |
| UperNet + CAC | Swin-T | 46.91 | config | model |
| UperNet + CAC | Swin-B (IN-22K) | 52.46 | config | model |
| UperNet + CAC | Swin-L (IN-22K) | 52.75 | config | model |
If you find this project useful, please consider citing:
@InProceedings{tian2023cac,
title={Learning Context-aware Classifier for Semantic Segmentation},
author={Zhuotao Tian and Jiequan Cui and Li Jiang and Xiaojuan Qi and Xin Lai and Yixin Chen and Shu Liu and Jiaya Jia},
booktitle={Proceedings of the Thirty-Seventh {AAAI} Conference on Artificial Intelligence},
year={2023}
}