
add configs for vit backbone plus decode_heads #520

Merged
merged 39 commits into open-mmlab:master on Jul 1, 2021

Conversation

xiexinch
Collaborator

No description provided.

@codecov

codecov bot commented Apr 26, 2021

Codecov Report

Merging #520 (8db5cdd) into master (98067be) will increase coverage by 0.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #520      +/-   ##
==========================================
+ Coverage   85.77%   85.83%   +0.05%     
==========================================
  Files         103      103              
  Lines        5307     5308       +1     
  Branches      857      858       +1     
==========================================
+ Hits         4552     4556       +4     
+ Misses        583      581       -2     
+ Partials      172      171       -1     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 85.81% <100.00%> (+0.04%) ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ |
| --- | --- |
| mmseg/models/necks/multilevel_neck.py | 100.00% <100.00%> (ø) |
| mmseg/datasets/builder.py | 89.33% <0.00%> (+3.43%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 98067be...8db5cdd. Read the comment docs.

norm_cfg=dict(type='LN'),
act_cfg=dict(type='GELU'),
norm_eval=False),
neck=dict(type='MultiLevelNeck', in_channels=[768], out_channels=768),
Collaborator

We may use [0.5, 1, 2, 4] scale.
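
For reference, a minimal sketch of that suggestion, assuming MultiLevelNeck accepts a `scales` argument and the backbone emits four copies of its stride-16 feature map:

```python
# Sketch only: resample the single stride-16 ViT feature map to four scales
# before UPerNet, instead of passing a single level through the neck.
model = dict(
    neck=dict(
        type='MultiLevelNeck',
        in_channels=[768, 768, 768, 768],
        out_channels=768,
        scales=[0.5, 1, 2, 4]))
```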

Comment on lines 5 to 6
pretrained='https://github.com/rwightman/pytorch-image-models/releases/\
download/v0.1-vitjx/jx_vit_base_p16_224-80ecf9dd.pth',
Collaborator

We may also add deit-s and deit-b
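
As a sketch, a DeiT-S config could override the checkpoint and backbone width on top of the ViT-B config; the checkpoint URL matches the DeiT-S weights used later in this PR, the `_base_` file name and channel overrides are illustrative, and a DeiT-B config would keep the ViT-B dimensions and only swap the checkpoint:

```python
# Sketch: a DeiT-S variant on top of the ViT-B UPerNet config
# (the _base_ file name is illustrative).
_base_ = './upernet_vit-b16_mln_512x512_160k_ade20k.py'

model = dict(
    pretrained='https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth',  # noqa
    backbone=dict(num_heads=6, embed_dims=384),
    # DeiT-S halves the embedding dim, so downstream channels shrink to match.
    neck=dict(in_channels=[384, 384, 384, 384], out_channels=384),
    decode_head=dict(in_channels=[384, 384, 384, 384]),
    auxiliary_head=dict(in_channels=384))
```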

@@ -0,0 +1,54 @@
# model settings
Collaborator

The config needs to be renamed.

qk_scale=None,
drop_rate=0.0,
attn_drop_rate=0.0,
norm_cfg=dict(type='LN'),
Collaborator

In #524, eps=1e-6 is used.
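
A sketch of the suggested change, with the surrounding backbone keys copied from the quoted lines:

```python
# Sketch: only norm_cfg changes relative to the quoted lines.
model = dict(
    backbone=dict(
        drop_rate=0.0,
        attn_drop_rate=0.0,
        norm_cfg=dict(type='LN', eps=1e-6)))
```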

Comment on lines 23 to 24
'absolute_pos_embed': dict(decay_mult=0.),
'relative_position_bias_table': dict(decay_mult=0.),
Collaborator

These keys are not in the ViT. They should be pos_embed and cls_token.
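
A sketch of the corrected paramwise_cfg with those keys; the SGD hyper-parameters are the usual UPerNet defaults and are included only to make the snippet self-contained:

```python
# Sketch: exempt ViT's positional embedding and class token from weight decay.
optimizer = dict(
    type='SGD',
    lr=0.01,
    momentum=0.9,
    weight_decay=0.0005,
    paramwise_cfg=dict(
        custom_keys={
            'pos_embed': dict(decay_mult=0.),
            'cls_token': dict(decay_mult=0.)
        }))
```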

# By default, models are trained on 8 GPUs with 2 images per GPU
data = dict(samples_per_gpu=2)

find_unused_parameters = True
Collaborator

We may remove this, since find_unused_parameters slows down training and makes bugs hard to find.


| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------- | -------- | --------- | ------: | -------- | -------------- | ----: | ------------: | -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| UPerNet | Vit | 512x1024 | 40000 | | | | | |
Collaborator

ViT-B.

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------- | -------- | --------- | ------: | -------- | -------------- | ----: | ------------: | -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| UPerNet | Vit | 512x1024 | 40000 | | | | | |
| UPerNet | Deit-S | 512x1024 | 40000 | | | | | |
Collaborator

DeiT

Comment on lines 39 to 41
| UPerNet | Vit | 512x512 | 80000 | | | | | |
| UPerNet | Deit-S | 512x512 | 80000 | | | | | |
| UPerNet | Deit-B | 512x512 | 80000 | | | | | |
Collaborator

We may also add 160k.
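
A 160k variant would typically just swap the schedule file in `_base_`; a sketch with an illustrative model base file name:

```python
# Sketch: a 160k-iteration ADE20K config. The model base file name is
# illustrative; the dataset/runtime/schedule files are the standard _base_ ones.
_base_ = [
    '../_base_/models/upernet_vit-b16.py',
    '../_base_/datasets/ade20k.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py'
]
```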

Comment on lines 5 to 6
pretrained='https://github.com/rwightman/pytorch-image-models/releases/\
download/v0.1-vitjx/jx_vit_base_p16_224-80ecf9dd.pth',
Collaborator

We may use # noqa to bypass the line breaking.
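
For example, the URL can stay on one line with the lint warning suppressed (sketch; the checkpoint is the ViT-B/16 weights quoted above):

```python
# Sketch: keep the checkpoint URL on one line and silence the flake8
# line-length warning instead of breaking the string with a backslash.
model = dict(
    pretrained='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_224-80ecf9dd.pth')  # noqa
```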

@@ -0,0 +1,55 @@
# model settings
Collaborator

Rename to ***_vit-b16.py

@@ -0,0 +1,36 @@
_base_ = [
Collaborator

Rename to ***_deit-b16.py

]

model = dict(
pretrained='https://dl.fbaipublicfiles.com/deit/\
Collaborator

# noqa

@clownrat6 clownrat6 mentioned this pull request May 18, 2021
@xvjiarui
Collaborator

Are results updated?

@xiexinch
Collaborator Author

Are results updated?

Not yet. These configs are not correct for the latest ViT; I need some time to modify them and test the checkpoints.
I will update soon.

@Junjun2016 Junjun2016 self-requested a review June 28, 2021 09:42
@@ -0,0 +1,58 @@
# model settings
Collaborator

Rename the config.
There is MultiLevelNeck (MLN) in this config.

for m in self.modules():
if isinstance(m, nn.Conv2d):
xavier_init(m, distribution='uniform')

Collaborator

Why use xavier_init for Conv2d?
MMCV uses kaiming_init for ConvModule.

Collaborator

Followed FPN
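
For comparison, a minimal sketch of the kaiming_init variant the reviewer refers to, written as a drop-in replacement for the neck's init_weights method; the merged code keeps xavier_init to mirror FPN:

```python
# Sketch: Kaiming init for plain Conv2d layers, as MMCV's ConvModule does by default.
import torch.nn as nn
from mmcv.cnn import kaiming_init


def init_weights(self):
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            kaiming_init(m)
```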


model = dict(
decode_head=dict(num_classes=150), auxiliary_head=dict(num_classes=150))

Collaborator

drop_path_rate?

Collaborator

ViT doesn't use drop path.


model = dict(
decode_head=dict(num_classes=150), auxiliary_head=dict(num_classes=150))

Collaborator

drop_path_rate?

Collaborator

ViT doesn't use drop path.
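
For context, a sketch of the distinction, consistent with the DeiT-S override quoted just below in this PR:

```python
# Sketch: DeiT configs enable stochastic depth; plain ViT leaves it off.
model = dict(
    backbone=dict(drop_path_rate=0.1),  # DeiT only; omit (or set 0.0) for ViT-B
    decode_head=dict(num_classes=150),
    auxiliary_head=dict(num_classes=150))
```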


model = dict(
pretrained='https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth', # noqa
backbone=dict(num_heads=6, embed_dims=384, drop_path_rate=0.1, final_norm=True), # noqa
Collaborator

Is the # noqa necessary?


model = dict(
pretrained='https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth', # noqa
backbone=dict(num_heads=6, embed_dims=384, drop_path_rate=0.1, final_norm=True), # noqa
Collaborator

Suggested change
backbone=dict(num_heads=6, embed_dims=384, drop_path_rate=0.1, final_norm=True), # noqa
backbone=dict(num_heads=6, embed_dims=384, drop_path_rate=0.1, final_norm=True),

| ------- | -------- | --------- | ------: | -------- | -------------- | ----: | ------------: | ---------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| UPerNet | ViT-B + neck | 512x512 | 80000 | 9.20 | 6.94 | 47.71 | 49.51 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_512x512_80k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k-0403cee1.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/20210624_130547.log.json) |
| UPerNet | ViT-B + neck | 512x512 | 160000 | 9.20 | 7.58 | 46.75 | 48.46 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_160k_ade20k-852fa768.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/20210623_192432.log.json) |
| UPerNet | ViT-B + norm | 512x512 | 160000 | 9.21 | 6.82 | 47.73 | 49.95 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_ln-backbone_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_ln-backbone_512x512_160k_ade20k-f444c077.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/20210621_172828.log.json) |
Collaborator

Suggested change
| UPerNet | ViT-B + norm | 512x512 | 160000 | 9.21 | 6.82 | 47.73 | 49.95 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_ln-backbone_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_ln-backbone_512x512_160k_ade20k-f444c077.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/20210621_172828.log.json) |
| UPerNet | ViT-B + LN +MLN | 512x512 | 160000 | 9.21 | 6.82 | 47.73 | 49.95 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_ln_mln_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_ln-backbone_512x512_160k_ade20k-f444c077.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/20210621_172828.log.json) |


| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------- | -------- | --------- | ------: | -------- | -------------- | ----: | ------------: | ---------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| UPerNet | ViT-B + neck | 512x512 | 80000 | 9.20 | 6.94 | 47.71 | 49.51 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_512x512_80k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k/upernet_vit-b16_neck_512x512_80k_ade20k-0403cee1.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k/20210624_130547.log.json) |
Collaborator

Suggested change
| UPerNet | ViT-B + neck | 512x512 | 80000 | 9.20 | 6.94 | 47.71 | 49.51 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_512x512_80k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k/upernet_vit-b16_neck_512x512_80k_ade20k-0403cee1.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k/20210624_130547.log.json) |
| UPerNet | ViT-B + MLN | 512x512 | 80000 | 9.20 | 6.94 | 47.71 | 49.51 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_neck_512x512_80k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k/upernet_vit-b16_neck_512x512_80k_ade20k-0403cee1.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_neck_512x512_80k_ade20k/20210624_130547.log.json) |

| UPerNet | ViT-B + LN +MLN | 512x512 | 160000 | 9.21 | 6.82 | 47.73 | 49.95 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_vit-b16_ln_mln_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_ln_mln_512x512_160k_ade20k/upernet_vit-b16_ln_mln_512x512_160k_ade20k-f444c077.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_ln_mln_512x512_160k_ade20k/20210621_172828.log.json) |
| UPerNet | DeiT-S | 512x512 | 80000 | 4.68 | 29.85 | 42.96 | 43.79 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_deit-s16_512x512_80k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_80k_ade20k/upernet_deit-s16_512x512_80k_ade20k-afc93ec2.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_80k_ade20k/20210624_095228.log.json) |
| UPerNet | DeiT-S | 512x512 | 160000 | 4.68 | 29.19 | 42.87 | 43.79 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_deit-s16_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_160k_ade20k/upernet_deit-s16_512x512_160k_ade20k-5110d916.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_160k_ade20k/20210621_160903.log.json) |
| UPerNet | DeiT-S + neck | 512x512 | 160000 | 5.69 | 11.18 | 43.82 | 45.07 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_deit-s16_neck_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_neck_512x512_160k_ade20k/upernet_deit-s16_neck_512x512_160k_ade20k-fb9a5dfb.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_neck_512x512_160k_ade20k/20210621_161021.log.json) |
Collaborator

Suggested change
| UPerNet | DeiT-S + neck | 512x512 | 160000 | 5.69 | 11.18 | 43.82 | 45.07 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_deit-s16_neck_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_neck_512x512_160k_ade20k/upernet_deit-s16_neck_512x512_160k_ade20k-fb9a5dfb.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_neck_512x512_160k_ade20k/20210621_161021.log.json) |
| UPerNet | DeiT-S + MLN | 512x512 | 160000 | 5.69 | 11.18 | 43.82 | 45.07 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit/upernet_deit-s16_neck_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_neck_512x512_160k_ade20k/upernet_deit-s16_neck_512x512_160k_ade20k-fb9a5dfb.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_neck_512x512_160k_ade20k/20210621_161021.log.json) |

@@ -0,0 +1,6 @@
_base_ = './upernet_vit-b16_neck_512x512_160k_ade20k.py'
Collaborator

Suggested change
_base_ = './upernet_vit-b16_neck_512x512_160k_ade20k.py'
_base_ = './upernet_vit-b16_mln_512x512_160k_ade20k.py'

@Junjun2016 Junjun2016 merged commit 737544f into open-mmlab:master Jul 1, 2021
bowenroom pushed a commit to bowenroom/mmsegmentation that referenced this pull request Feb 25, 2022
* add config

* add cityscapes config

* add default value to docstring

* fix lint

* add deit-s and deit-b

* add readme

* add eps at norm_cfg

* add drop_path_rate experiment

* add deit case at init_weight

* add upernet result

* update result and add upernet 160k config

* update upernet result and fix settings

* Update iters number

* update result and delete some configs

* fix import error

* fix drop_path_rate

* update result and restore config

* update benchmark result

* remove cityscapes exp

* remove neck

* neck exp

* add more configs

* fix init error

* fix ffn setting

* update result

* update results

* update result

* update results and fill table

* delete or rename configs

* fix link delimiter

* rename configs and fix link

* rename neck to mln
sibozhang pushed a commit to sibozhang/mmsegmentation that referenced this pull request Mar 22, 2024