[Tools] Add vit/swin/mit convert weight scripts #783

Merged · 16 commits · Aug 18, 2021
18 changes: 9 additions & 9 deletions configs/segformer/readme.md → configs/segformer/README.md
@@ -29,15 +29,15 @@

Evaluation with AlignedResize:

| Method    | Backbone | Crop Size | Lr schd |  mIoU | mIoU(ms+flip) |
| --------- | -------- | --------- | ------: | ----: | ------------- |
| Segformer | MIT-B0   | 512x512   |  160000 | 38.1  | 38.57         |
| Segformer | MIT-B1   | 512x512   |  160000 | 41.64 | 42.76         |
| Segformer | MIT-B2   | 512x512   |  160000 | 46.53 | 47.49         |
| Segformer | MIT-B3   | 512x512   |  160000 | 48.46 | 49.14         |
| Segformer | MIT-B4   | 512x512   |  160000 | 49.34 | 50.29         |
| Segformer | MIT-B5   | 512x512   |  160000 | 50.08 | 50.72         |
| Segformer | MIT-B5   | 640x640   |  160000 | 50.58 | 50.8          |

We replace `AlignedResize` in the original implementation with `Resize + ResizeToMultiple`. If you want to test with
`AlignedResize`, you can change the dataset pipeline like this:
160 changes: 160 additions & 0 deletions configs/segformer/segformer.yml
@@ -0,0 +1,160 @@
Collections:
- Name: segformer
Metadata:
Training Data:
- ADE20k
Models:
- Name: segformer_mit-b0_512x512_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B0
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 19.49
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
memory (GB): 2.1
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 37.41
mIoU(ms+flip): 38.34
Config: configs/segformer/segformer_mit-b0_512x512_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b0_512x512_160k_ade20k/segformer_mit-b0_512x512_160k_ade20k_20210726_101530-8ffa8fda.pth
- Name: segformer_mit-b1_512x512_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B1
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 20.98
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
memory (GB): 2.6
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 40.97
mIoU(ms+flip): 42.54
Config: configs/segformer/segformer_mit-b1_512x512_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b1_512x512_160k_ade20k/segformer_mit-b1_512x512_160k_ade20k_20210726_112106-d70e859d.pth
- Name: segformer_mit-b2_512x512_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B2
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 32.38
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
memory (GB): 3.6
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 45.58
mIoU(ms+flip): 47.03
Config: configs/segformer/segformer_mit-b2_512x512_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b2_512x512_160k_ade20k/segformer_mit-b2_512x512_160k_ade20k_20210726_112103-cbd414ac.pth
- Name: segformer_mit-b3_512x512_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B3
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 45.23
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
memory (GB): 4.8
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 47.82
mIoU(ms+flip): 48.81
Config: configs/segformer/segformer_mit-b3_512x512_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b3_512x512_160k_ade20k/segformer_mit-b3_512x512_160k_ade20k_20210726_081410-962b98d2.pth
- Name: segformer_mit-b4_512x512_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B4
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 64.72
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
memory (GB): 6.1
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 48.46
mIoU(ms+flip): 49.76
Config: configs/segformer/segformer_mit-b4_512x512_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b4_512x512_160k_ade20k/segformer_mit-b4_512x512_160k_ade20k_20210728_183055-7f509d7d.pth
- Name: segformer_mit-b5_512x512_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B5
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 84.1
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
memory (GB): 7.2
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 49.13
mIoU(ms+flip): 50.22
Config: configs/segformer/segformer_mit-b5_512x512_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_512x512_160k_ade20k/segformer_mit-b5_512x512_160k_ade20k_20210726_145235-94cedf59.pth
- Name: segformer_mit-b5_640x640_160k_ade20k
In Collection: segformer
Metadata:
backbone: MIT-B5
crop size: (640,640)
lr schd: 160000
inference time (ms/im):
- value: 88.5
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (640,640)
memory (GB): 11.5
Results:
Task: Semantic Segmentation
Dataset: ADE20k
Metrics:
mIoU: 49.62
mIoU(ms+flip): 50.36
Config: configs/segformer/segformer_mit-b5_640x640_160k_ade20k.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_640x640_160k_ade20k/segformer_mit-b5_640x640_160k_ade20k_20210801_121243-41d2845b.pth
30 changes: 30 additions & 0 deletions docs/useful_tools.md
@@ -255,6 +255,36 @@ Examples:
python tools/analyze_logs.py log.json --keys loss --legend loss
```

### Model conversion

`tools/model_converters/` provides several scripts to convert pretrained models released by other repos to MMSegmentation style.

#### ViT, Swin, and MiT Transformer Models

- ViT

`tools/model_converters/vit2mmseg.py` converts keys in timm pretrained ViT models to MMSegmentation style.

```shell
python tools/model_converters/vit2mmseg.py ${SRC} ${DST}
```

- Swin

`tools/model_converters/swin2mmseg.py` converts keys in official pretrained Swin models to MMSegmentation style.

```shell
python tools/model_converters/swin2mmseg.py ${SRC} ${DST}
```

- SegFormer

`tools/model_converters/mit2mmseg.py` converts keys in official pretrained MiT models to MMSegmentation style.

```shell
python tools/model_converters/mit2mmseg.py ${SRC} ${DST}
```

## Model Serving

In order to serve an `MMSegmentation` model with [`TorchServe`](https://pytorch.org/serve/), you can follow the steps:
Expand Down
24 changes: 24 additions & 0 deletions docs_zh-CN/useful_tools.md
@@ -259,6 +259,30 @@ python tools/analyze_logs.py xxx.log.json [--keys ${KEYS}] [--legend ${LEGEND}]
python tools/analyze_logs.py log.json --keys loss --legend loss
```

### Converting weights from other repositories

`tools/model_converters/` provides several pretrained-weight conversion scripts that convert the keys of pretrained weights from other repositories into keys matching MMSegmentation.

#### ViT, Swin, and MiT Transformer models

- ViT

```shell
python tools/model_converters/vit2mmseg.py ${SRC} ${DST}
```

- Swin

```shell
python tools/model_converters/swin2mmseg.py ${SRC} ${DST}
```

- MiT

```shell
python tools/model_converters/mit2mmseg.py ${SRC} ${DST}
```

## Model serving

To serve an `MMSegmentation` model with [`TorchServe`](https://pytorch.org/serve/), you can follow these steps:
1 change: 1 addition & 0 deletions model-index.yml
@@ -23,6 +23,7 @@ Import:
- configs/psanet/psanet.yml
- configs/pspnet/pspnet.yml
- configs/resnest/resnest.yml
- configs/segformer/segformer.yml
- configs/sem_fpn/sem_fpn.yml
- configs/setr/setr.yml
- configs/swin/swin.yml
4 changes: 2 additions & 2 deletions tools/model_converters/mit_convert.py
@@ -4,7 +4,7 @@
import torch


def mit_convert(ckpt):
def convert_mit(ckpt):
    new_ckpt = OrderedDict()
    # Process the concatenation of the q linear weights and kv linear weights
    for k, v in ckpt.items():
@@ -72,5 +72,5 @@ def parse_args():

ckpt = torch.load(src_path, map_location='cpu')

ckpt = mit_convert(ckpt)
ckpt = convert_mit(ckpt)
torch.save(ckpt, dst_path)
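The rename above (`mit_convert` → `convert_mit`) keeps the converter scripts' shared pattern: walk the source checkpoint and rewrite each key by substring replacement. A minimal standalone sketch of that pattern (the `rename_keys` helper and the toy checkpoint below are illustrative, not part of the repo; the two substitutions shown are taken from swin2mmseg.py):

```python
from collections import OrderedDict


def rename_keys(ckpt, mapping):
    # Illustrative helper: rewrite every checkpoint key by applying the
    # substring replacements in `mapping`, preserving insertion order.
    new_ckpt = OrderedDict()
    for k, v in ckpt.items():
        new_k = k
        for old, new in mapping.items():
            new_k = new_k.replace(old, new)
        new_ckpt[new_k] = v
    return new_ckpt


# Toy checkpoint; real scripts operate on torch state dicts.
ckpt = OrderedDict([('patch_embed.proj.weight', 'w'),
                    ('layers.0.blocks.0.mlp.fc1.bias', 'b')])
converted = rename_keys(ckpt, {
    'proj': 'projection',
    'mlp.fc1.': 'ffn.layers.0.0.',
})
print(list(converted))
# ['patch_embed.projection.weight', 'layers.0.blocks.0.ffn.layers.0.0.bias']
```

The real converters add per-prefix branching (see convert_swin below) because some keys need value transforms, not just renames.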
83 changes: 83 additions & 0 deletions tools/model_converters/swin2mmseg.py
@@ -0,0 +1,83 @@
import argparse
from collections import OrderedDict

import torch


def convert_swin(ckpt):
    new_ckpt = OrderedDict()

    def correct_unfold_reduction_order(x):
        # The official checkpoint stores the patch-merging reduction weight
        # with its 4 unfolded quadrants concatenated in (0, 1, 2, 3) order;
        # MMSegmentation expects (0, 2, 1, 3), so reorder and re-interleave.
        out_channel, in_channel = x.shape
        x = x.reshape(out_channel, 4, in_channel // 4)
        x = x[:, [0, 2, 1, 3], :].transpose(1,
                                            2).reshape(out_channel, in_channel)
        return x

    def correct_unfold_norm_order(x):
        # Same reordering for the 1-D norm weight/bias before the reduction.
        in_channel = x.shape[0]
        x = x.reshape(4, in_channel // 4)
        x = x[[0, 2, 1, 3], :].transpose(0, 1).reshape(in_channel)
        return x

    for k, v in ckpt.items():
        if k.startswith('head'):
            # The classification head is not used for segmentation.
            continue
        elif k.startswith('layers'):
            new_v = v
            if 'attn.' in k:
                new_k = k.replace('attn.', 'attn.w_msa.')
            elif 'mlp.' in k:
                if 'mlp.fc1.' in k:
                    new_k = k.replace('mlp.fc1.', 'ffn.layers.0.0.')
                elif 'mlp.fc2.' in k:
                    new_k = k.replace('mlp.fc2.', 'ffn.layers.1.')
                else:
                    new_k = k.replace('mlp.', 'ffn.')
            elif 'downsample' in k:
                new_k = k
                if 'reduction.' in k:
                    new_v = correct_unfold_reduction_order(v)
                elif 'norm.' in k:
                    new_v = correct_unfold_norm_order(v)
            else:
                new_k = k
            new_k = new_k.replace('layers', 'stages', 1)
        elif k.startswith('patch_embed'):
            new_v = v
            if 'proj' in k:
                new_k = k.replace('proj', 'projection')
            else:
                new_k = k
        else:
            new_v = v
            new_k = k

        new_ckpt[new_k] = new_v

    return new_ckpt


def main():
    parser = argparse.ArgumentParser(
        description='Convert keys in official pretrained swin models to '
        'MMSegmentation style.')
    parser.add_argument('src', help='src segmentation model path')
    # The dst path must be a full path of the new checkpoint.
    parser.add_argument('dst', help='save path')
    args = parser.parse_args()

    checkpoint = torch.load(args.src, map_location='cpu')
    if 'state_dict' in checkpoint:
        state_dict = checkpoint['state_dict']
    elif 'model' in checkpoint:
        state_dict = checkpoint['model']
    else:
        state_dict = checkpoint
    weight = convert_swin(state_dict)
    with open(args.dst, 'wb') as f:
        torch.save(weight, f)


if __name__ == '__main__':
    main()
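The two `correct_unfold_*` helpers exist because the official Swin checkpoint concatenates the four patch-merging sub-grids in a different column order than MMSegmentation's implementation. A NumPy sketch of the column reordering (a port of the torch helper above, on a toy 1x8 weight; purely illustrative — `numpy.transpose` takes a full axis permutation where `torch.Tensor.transpose` takes two axes):

```python
import numpy as np


def correct_unfold_reduction_order(x):
    # NumPy port of the torch helper: regroup the flattened 4-quadrant
    # columns from (0, 1, 2, 3) order to (0, 2, 1, 3) order.
    out_channel, in_channel = x.shape
    x = x.reshape(out_channel, 4, in_channel // 4)
    x = x[:, [0, 2, 1, 3], :].transpose(0, 2, 1).reshape(
        out_channel, in_channel)
    return x


# A 1x8 toy weight makes the interleaving visible: input columns 0-7,
# grouped as quadrants (0,1) (2,3) (4,5) (6,7), come out interleaved.
w = np.arange(8).reshape(1, 8)
print(correct_unfold_reduction_order(w))  # [[0 4 2 6 1 5 3 7]]
```

Because this only permutes columns of the linear weight (and the matching entries of the preceding norm), the converted layer computes the same function once the input features are concatenated in MMSegmentation's quadrant order.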