Commit ab12009

[Feature] Support BiSeNetV1 (#851)
* First Commit
* fix typos
* fix typos
* Fix assertion bug
* Adding Assert
* Adding Unittest
* Fixing typo
* Uploading models & logs
* Fixing unittest error
* changing README.md
* changing README.md
1 parent 2aa632e commit ab12009

14 files changed, with 767 additions and 1 deletion.

README.md

+1
@@ -75,6 +75,7 @@ Supported methods:
 - [x] [PSPNet (CVPR'2017)](configs/pspnet)
 - [x] [DeepLabV3 (ArXiv'2017)](configs/deeplabv3)
 - [x] [Mixed Precision (FP16) Training (ArXiv'2017)](configs/fp16)
+- [x] [BiSeNetV1 (ECCV'2018)](configs/bisenetv1)
 - [x] [PSANet (ECCV'2018)](configs/psanet)
 - [x] [DeepLabV3+ (CVPR'2018)](configs/deeplabv3plus)
 - [x] [UPerNet (ECCV'2018)](configs/upernet)

README_zh-CN.md

+1
@@ -74,6 +74,7 @@ MMSegmentation 是一个基于 PyTorch 的语义分割开源工具箱。它是 O
 - [x] [PSPNet (CVPR'2017)](configs/pspnet)
 - [x] [DeepLabV3 (ArXiv'2017)](configs/deeplabv3)
 - [x] [Mixed Precision (FP16) Training (ArXiv'2017)](configs/fp16)
+- [x] [BiSeNetV1 (ECCV'2018)](configs/bisenetv1)
 - [x] [PSANet (ECCV'2018)](configs/psanet)
 - [x] [DeepLabV3+ (CVPR'2018)](configs/deeplabv3plus)
 - [x] [UPerNet (ECCV'2018)](configs/upernet)
+68
@@ -0,0 +1,68 @@
+# model settings
+norm_cfg = dict(type='SyncBN', requires_grad=True)
+model = dict(
+    type='EncoderDecoder',
+    backbone=dict(
+        type='BiSeNetV1',
+        in_channels=3,
+        context_channels=(128, 256, 512),
+        spatial_channels=(64, 64, 64, 128),
+        out_indices=(0, 1, 2),
+        out_channels=256,
+        backbone_cfg=dict(
+            type='ResNet',
+            in_channels=3,
+            depth=18,
+            num_stages=4,
+            out_indices=(0, 1, 2, 3),
+            dilations=(1, 1, 1, 1),
+            strides=(1, 2, 2, 2),
+            norm_cfg=norm_cfg,
+            norm_eval=False,
+            style='pytorch',
+            contract_dilation=True),
+        norm_cfg=norm_cfg,
+        align_corners=False,
+        init_cfg=None),
+    decode_head=dict(
+        type='FCNHead',
+        in_channels=256,
+        in_index=0,
+        channels=256,
+        num_convs=1,
+        concat_input=False,
+        dropout_ratio=0.1,
+        num_classes=19,
+        norm_cfg=norm_cfg,
+        align_corners=False,
+        loss_decode=dict(
+            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
+    auxiliary_head=[
+        dict(
+            type='FCNHead',
+            in_channels=128,
+            channels=64,
+            num_convs=1,
+            num_classes=19,
+            in_index=1,
+            norm_cfg=norm_cfg,
+            concat_input=False,
+            align_corners=False,
+            loss_decode=dict(
+                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
+        dict(
+            type='FCNHead',
+            in_channels=128,
+            channels=64,
+            num_convs=1,
+            num_classes=19,
+            in_index=2,
+            norm_cfg=norm_cfg,
+            concat_input=False,
+            align_corners=False,
+            loss_decode=dict(
+                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
+    ],
+    # model training and testing settings
+    train_cfg=dict(),
+    test_cfg=dict(mode='whole'))
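
Reviewer note: the head `in_channels` values in this base config have to line up with what the backbone emits at each `in_index`. The sketch below checks that with plain dictionaries. It rests on an assumption that this diff does not state: output 0 carries `out_channels` channels, and outputs 1 and 2 carry `context_channels[0]` channels each. That assumption matches both the R-18 and R-50 settings in this commit, but the helper names (`expected_channels`, `check_heads`) are hypothetical and not part of MMSegmentation.

```python
# Values copied from the base config above; only the fields the check needs.
backbone = dict(context_channels=(128, 256, 512), out_channels=256)
decode_head = dict(in_channels=256, in_index=0)
auxiliary_head = [
    dict(in_channels=128, in_index=1),
    dict(in_channels=128, in_index=2),
]


def expected_channels(backbone_cfg):
    """Channel count per backbone output index.

    Assumption (not confirmed by this diff): output 0 has `out_channels`
    channels; outputs 1 and 2 have `context_channels[0]` channels each.
    """
    ctx = backbone_cfg['context_channels'][0]
    return (backbone_cfg['out_channels'], ctx, ctx)


def check_heads(backbone_cfg, heads):
    """Return (in_index, configured, expected) for every mismatching head."""
    channels = expected_channels(backbone_cfg)
    mismatches = []
    for head in heads:
        want = channels[head['in_index']]
        if head['in_channels'] != want:
            mismatches.append((head['in_index'], head['in_channels'], want))
    return mismatches


# An empty list means every head matches the assumed backbone outputs.
mismatches = check_heads(backbone, [decode_head] + auxiliary_head)
```

A check like this is mostly useful when overriding the backbone widths, as the R-50 config in this commit does, because every head then needs a matching `in_channels` override.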

configs/bisenetv1/README.md

+42
@@ -0,0 +1,42 @@
+# BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
+
+## Introduction
+
+<!-- [ALGORITHM] -->
+
+<a href="https://github.com/ycszen/TorchSeg/tree/master/model/bisenet">Official Repo</a>
+
+<a href="https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv1.py#L266">Code Snippet</a>
+
+<details>
+<summary align="right"><a href="https://arxiv.org/abs/1808.00897">BiSeNetV1 (ECCV'2018)</a></summary>
+
+```latex
+@inproceedings{yu2018bisenet,
+  title={Bisenet: Bilateral segmentation network for real-time semantic segmentation},
+  author={Yu, Changqian and Wang, Jingbo and Peng, Chao and Gao, Changxin and Yu, Gang and Sang, Nong},
+  booktitle={Proceedings of the European conference on computer vision (ECCV)},
+  pages={325--341},
+  year={2018}
+}
+```
+
+</details>
+
+## Results and models
+
+### Cityscapes
+
+| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
+| --------- | --------- | --------- | ------: | -------- | -------------- | ----: | ------------- | --------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| BiSeNetV1 (No Pretrain) | R-18-D32 | 1024x1024 | 160000 | 5.69 | 31.77 | 74.44 | 77.05 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes_20210922_172239-c55e78e2.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes_20210922_172239.log.json) |
+| BiSeNetV1| R-18-D32 | 1024x1024 | 160000 | 5.69 | 31.77 | 74.37 | 76.91 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes_20210905_220251-8ba80eff.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes_20210905_220251.log.json) |
+| BiSeNetV1 (4x8) | R-18-D32 | 1024x1024 | 160000 | 11.17 | 31.77 | 75.16 | 77.24 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes_20210905_220322-bb8db75f.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes_20210905_220322.log.json) |
+| BiSeNetV1 (No Pretrain) | R-50-D32 | 1024x1024 | 160000 | 3.3 | 7.71 | 76.92 | 78.87 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv1/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes_20210923_222639-7b28a2a6.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes_20210923_222639.log.json) |
+| BiSeNetV1 | R-50-D32 | 1024x1024 | 160000 | 15.39 | 7.71 | 77.68 | 79.57 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv1/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes_20210917_234628-8b304447.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes_20210917_234628.log.json) |
+
+Note:
+
+- `4x8`: Using 4 GPUs with 8 samples per GPU in training.
+- Default setting is 4 GPUs with 4 samples per GPU in training.
+- `No Pretrain` means the model is trained from scratch.
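
Reviewer note: the `4x4` and `4x8` tokens in the config file names encode the GPU count and the per-GPU batch size described in the notes above. A minimal sketch of reading them back out of a config name follows. The function name `parse_batch` is hypothetical, and the parsing rule (first underscore-delimited `<int>x<int>` token) is inferred from the file names in this commit rather than from any documented naming API.

```python
import re


def parse_batch(config_name):
    """Return (gpus, samples_per_gpu, total_batch) from a config name."""
    # The first underscore-delimited '<int>x<int>' token is the GPU setting;
    # the crop size (e.g. '1024x1024') appears later in the name.
    match = re.search(r'_(\d+)x(\d+)_', config_name)
    if match is None:
        raise ValueError(f'no <gpus>x<samples> token in {config_name!r}')
    gpus, samples = int(match.group(1)), int(match.group(2))
    return gpus, samples, gpus * samples


default_setting = parse_batch(
    'bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes')
larger_setting = parse_batch(
    'bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes')
```

Under this reading the default setting trains with a total batch of 16 images and the `4x8` variant with 32.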

configs/bisenetv1/bisenetv1.yml

+125
@@ -0,0 +1,125 @@
+Collections:
+- Name: bisenetv1
+  Metadata:
+    Training Data:
+    - Cityscapes
+  Paper:
+    URL: https://arxiv.org/abs/1808.00897
+    Title: 'BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation'
+  README: configs/bisenetv1/README.md
+  Code:
+    URL: https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv1.py#L266
+    Version: v0.18.0
+  Converted From:
+    Code: https://github.com/ycszen/TorchSeg/tree/master/model/bisenet
+Models:
+- Name: bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes
+  In Collection: bisenetv1
+  Metadata:
+    backbone: R-18-D32
+    crop size: (1024,1024)
+    lr schd: 160000
+    inference time (ms/im):
+    - value: 31.48
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (1024,1024)
+    memory (GB): 5.69
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: Cityscapes
+    Metrics:
+      mIoU: 74.44
+      mIoU(ms+flip): 77.05
+  Config: configs/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes_20210922_172239-c55e78e2.pth
+- Name: bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes
+  In Collection: bisenetv1
+  Metadata:
+    backbone: R-18-D32
+    crop size: (1024,1024)
+    lr schd: 160000
+    inference time (ms/im):
+    - value: 31.48
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (1024,1024)
+    memory (GB): 5.69
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: Cityscapes
+    Metrics:
+      mIoU: 74.37
+      mIoU(ms+flip): 76.91
+  Config: configs/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes/bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes_20210905_220251-8ba80eff.pth
+- Name: bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes
+  In Collection: bisenetv1
+  Metadata:
+    backbone: R-18-D32
+    crop size: (1024,1024)
+    lr schd: 160000
+    inference time (ms/im):
+    - value: 31.48
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (1024,1024)
+    memory (GB): 11.17
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: Cityscapes
+    Metrics:
+      mIoU: 75.16
+      mIoU(ms+flip): 77.24
+  Config: configs/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes/bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes_20210905_220322-bb8db75f.pth
+- Name: bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes
+  In Collection: bisenetv1
+  Metadata:
+    backbone: R-50-D32
+    crop size: (1024,1024)
+    lr schd: 160000
+    inference time (ms/im):
+    - value: 129.7
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (1024,1024)
+    memory (GB): 3.3
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: Cityscapes
+    Metrics:
+      mIoU: 76.92
+      mIoU(ms+flip): 78.87
+  Config: configs/bisenetv1/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes/bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes_20210923_222639-7b28a2a6.pth
+- Name: bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes
+  In Collection: bisenetv1
+  Metadata:
+    backbone: R-50-D32
+    crop size: (1024,1024)
+    lr schd: 160000
+    inference time (ms/im):
+    - value: 129.7
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (1024,1024)
+    memory (GB): 15.39
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: Cityscapes
+    Metrics:
+      mIoU: 77.68
+      mIoU(ms+flip): 79.57
+  Config: configs/bisenetv1/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv1/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes/bisenetv1_r50-d32_in1k-pre_4x4_1024x1024_160k_cityscapes_20210917_234628-8b304447.pth
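
Reviewer note: the README table reports inference speed in frames per second, while this metafile reports `inference time (ms/im)`. The two appear to be reciprocals of each other, which can be spot-checked as below. The helper name `fps_to_ms_per_image` is hypothetical, and the rounding precision is chosen only to match the digits shown in the metafile.

```python
def fps_to_ms_per_image(fps):
    """Convert frames per second to milliseconds per image."""
    return 1000.0 / fps


# fps values taken from the README table in this commit.
r18_ms = round(fps_to_ms_per_image(31.77), 2)  # compare with 31.48 in the metafile
r50_ms = round(fps_to_ms_per_image(7.71), 1)   # compare with 129.7 in the metafile
```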
@@ -0,0 +1,11 @@
+_base_ = [
+    '../_base_/models/bisenetv1_r18-d32.py',
+    '../_base_/datasets/cityscapes_1024x1024.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
+]
+lr_config = dict(warmup='linear', warmup_iters=1000)
+optimizer = dict(lr=0.025)
+data = dict(
+    samples_per_gpu=4,
+    workers_per_gpu=4,
+)
@@ -0,0 +1,16 @@
+_base_ = [
+    '../_base_/models/bisenetv1_r18-d32.py',
+    '../_base_/datasets/cityscapes_1024x1024.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
+]
+model = dict(
+    backbone=dict(
+        backbone_cfg=dict(
+            init_cfg=dict(
+                type='Pretrained', checkpoint='open-mmlab://resnet18_v1c'))))
+lr_config = dict(warmup='linear', warmup_iters=1000)
+optimizer = dict(lr=0.025)
+data = dict(
+    samples_per_gpu=4,
+    workers_per_gpu=4,
+)
@@ -0,0 +1,5 @@
+_base_ = './bisenetv1_r18-d32_in1k-pre_4x4_1024x1024_160k_cityscapes.py'
+data = dict(
+    samples_per_gpu=8,
+    workers_per_gpu=8,
+)
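
Reviewer note: the `4x8` config above only restates the `data` keys it changes and relies on `_base_` inheritance for everything else. A simplified sketch of that behavior as a recursive dictionary merge is shown below. This is an illustration, not the config loader's actual code: the function name `merge` is hypothetical, and the real loader handles further cases (for example deleting inherited keys) that are left out here.

```python
import copy


def merge(base, override):
    """Recursively overlay `override` onto a copy of `base`.

    Simplified illustration of `_base_` inheritance: nested dicts are
    merged key by key, any other value replaces the inherited one.
    """
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Settings from the in1k-pre 4x4 parent config in this commit (subset).
parent = dict(
    lr_config=dict(warmup='linear', warmup_iters=1000),
    optimizer=dict(lr=0.025),
    data=dict(samples_per_gpu=4, workers_per_gpu=4),
)
# Overrides from the 4x8 child config.
child = dict(data=dict(samples_per_gpu=8, workers_per_gpu=8))

merged = merge(parent, child)
```

Under this reading the `4x8` run keeps the parent's `lr=0.025` and warmup settings and changes only the per-GPU batch size and worker count.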
@@ -0,0 +1,46 @@
+_base_ = [
+    '../_base_/models/bisenetv1_r18-d32.py',
+    '../_base_/datasets/cityscapes_1024x1024.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
+]
+norm_cfg = dict(type='SyncBN', requires_grad=True)
+model = dict(
+    type='EncoderDecoder',
+    backbone=dict(
+        type='BiSeNetV1',
+        context_channels=(512, 1024, 2048),
+        spatial_channels=(256, 256, 256, 512),
+        out_channels=1024,
+        backbone_cfg=dict(
+            init_cfg=dict(
+                type='Pretrained', checkpoint='open-mmlab://resnet50_v1c'),
+            type='ResNet',
+            depth=50)),
+    decode_head=dict(
+        type='FCNHead', in_channels=1024, in_index=0, channels=1024),
+    auxiliary_head=[
+        dict(
+            type='FCNHead',
+            in_channels=512,
+            channels=256,
+            num_convs=1,
+            num_classes=19,
+            in_index=1,
+            norm_cfg=norm_cfg,
+            concat_input=False),
+        dict(
+            type='FCNHead',
+            in_channels=512,
+            channels=256,
+            num_convs=1,
+            num_classes=19,
+            in_index=2,
+            norm_cfg=norm_cfg,
+            concat_input=False),
+    ])
+lr_config = dict(warmup='linear', warmup_iters=1000)
+optimizer = dict(lr=0.05)
+data = dict(
+    samples_per_gpu=4,
+    workers_per_gpu=4,
+)
@@ -0,0 +1,7 @@
+_base_ = './bisenetv1_r50-d32_4x4_1024x1024_160k_cityscapes.py'
+model = dict(
+    type='EncoderDecoder',
+    backbone=dict(
+        backbone_cfg=dict(
+            init_cfg=dict(
+                type='Pretrained', checkpoint='open-mmlab://resnet50_v1c'))))

mmseg/models/backbones/__init__.py

+3-1
@@ -1,4 +1,5 @@
 # Copyright (c) OpenMMLab. All rights reserved.
+from .bisenetv1 import BiSeNetV1
 from .bisenetv2 import BiSeNetV2
 from .cgnet import CGNet
 from .fast_scnn import FastSCNN
@@ -16,5 +17,6 @@
 __all__ = [
     'ResNet', 'ResNetV1c', 'ResNetV1d', 'ResNeXt', 'HRNet', 'FastSCNN',
     'ResNeSt', 'MobileNetV2', 'UNet', 'CGNet', 'MobileNetV3',
-    'VisionTransformer', 'SwinTransformer', 'MixVisionTransformer', 'BiSeNetV2'
+    'VisionTransformer', 'SwinTransformer', 'MixVisionTransformer',
+    'BiSeNetV1', 'BiSeNetV2'
 ]
