
[Feature] Support SegNeXt (NeurIPS'2022) in MMSeg 0.x. #2247

Closed
wants to merge 28 commits into from

Conversation

FabianSchuetze

@FabianSchuetze FabianSchuetze commented Oct 31, 2022

Update 31/01/2023

From now on, we will pick up and support SegNeXt in the master branch.
Here is our TO-DO list:

  • Update README and SegNeXt configs
  • Refactor MSCAN and Ham Head.
  • Upload ckpts and report their memory usage and FPS.
  • Add unit tests

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Hello,

thanks for the fantastic repo, it's a pleasure to work with mmsegmentation.

I would like to help contribute SegNext to mmsegmentation. I discussed this with @MenghaoGuo and he appreciates such a contribution. Does mmsegmentation also appreciate the contribution?

I trained the tiny model on ADE20K, which results in a val (test) mIoU of 39.27 (39.17), compared to the test mIoU of 41.1 reported in the paper. A few noteworthy aspects are:

  • I used their ImageNet pretrained weights from the Tsinghua cloud. I cannot do pretraining from scratch. Is it OK to use the official weights?
  • I used one GPU (not 8 as in the paper) but kept the same settings. Should the learning rate be scaled then?
  • In contrast to the configs of the original repo, I did not use a RepeatDataset of size 50 but the conventional ADE dataset config, because of a copy-and-paste glitch.

You can see the config logs at the end of the PR.

What would be the next steps for a contribution? I guess I should retrain the tiny network after we agree on the correct settings for a one-GPU environment, and then run the experiment again. Do you have anything to add, @MenghaoGuo?

Best wishes,
Fabian

Modification

Contribute SegNext

BC-breaking (Optional)

n/a

Use cases (Optional)

n/a

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

The pre-commit hooks passed; I guess the other items are not applicable.

Config

norm_cfg = dict(type='BN', requires_grad=True)
ham_norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='MSCAN',
        embed_dims=[32, 64, 160, 256],
        mlp_ratios=[8, 8, 4, 4],
        drop_rate=0.0,
        drop_path_rate=0.1,
        depths=[3, 3, 5, 2],
        norm_cfg=dict(type='BN', requires_grad=True),
        init_cfg=dict(type='Pretrained', checkpoint='/notebooks/mscan_t.pth')),
    decode_head=dict(
        type='LightHamHead',
        in_channels=[64, 160, 256],
        in_index=[1, 2, 3],
        channels=256,
        ham_channels=256,
        dropout_ratio=0.1,
        num_classes=150,
        norm_cfg=dict(type='GN', num_groups=32, requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        ham_kwargs=dict(MD_R=16)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
dataset_type = 'ADE20KDataset'
data_root = '/notebooks/ADEChallengeData2016'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='ResizeToMultiple', size_divisor=32),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=4,
    train=dict(
        type='ADE20KDataset',
        data_root='/notebooks/ADEChallengeData2016',
        img_dir='images/training',
        ann_dir='annotations/training',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', reduce_zero_label=True),
            dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='ADE20KDataset',
        data_root='/notebooks/ADEChallengeData2016',
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='ResizeToMultiple', size_divisor=32),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='ADE20KDataset',
        data_root='/notebooks/ADEChallengeData2016',
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='ResizeToMultiple', size_divisor=32),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = '/notebooks/mmsegmentation/work_dirs/segnext.tiny.512x512.ade.160k/latest.pth'
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(
    type='AdamW',
    lr=6e-05,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys=dict(
            pos_block=dict(decay_mult=0.0),
            norm=dict(decay_mult=0.0),
            head=dict(lr_mult=10.0))))
optimizer_config = dict()
lr_config = dict(
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-06,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=160000)
checkpoint_config = dict(by_epoch=False, interval=8000)
evaluation = dict(interval=8000, metric='mIoU')
find_unused_parameters = True
work_dir = './work_dirs/segnext.tiny.512x512.ade.160k'
gpu_ids = [0]
auto_resume = False

Logs:

root@nso2lj6bzv:/notebooks/mmsegmentation# cat work_dirs/result/eval_single_scale_20221031_102503.json 
{
    "config": "/notebooks/mmsegmentation/configs/segnext/segnext.tiny.512x512.ade.160k.py",
    "metric": {
        "aAcc": 0.7948999999999999,
        "mIoU": 0.39189999999999997,
        "mAcc": 0.5054,
        "IoU.wall": 0.73,
        "IoU.building": 0.8009999847412109,
        "IoU.sky": 0.9380999755859375,
        "IoU.floor": 0.7687000274658203,
        "IoU.tree": 0.7116999816894531,
        "IoU.ceiling": 0.7991999816894532,
        "IoU.road": 0.7959999847412109,
        "IoU.bed ": 0.8283999633789062,
        "IoU.windowpane": 0.5604000091552734,
        "IoU.grass": 0.6725,
        "IoU.cabinet": 0.5241999816894531,
        "IoU.sidewalk": 0.5954000091552735,
        "IoU.person": 0.7636000061035156,
        "IoU.earth": 0.3465999984741211,
        "IoU.door": 0.33880001068115234,
        "IoU.table": 0.4881000137329102,
        "IoU.mountain": 0.5815999984741211,
        "IoU.plant": 0.4777000045776367,
        "IoU.curtain": 0.6844999694824219,
        "IoU.chair": 0.4679000091552734,
        "IoU.car": 0.8025,
        "IoU.water": 0.48400001525878905,
        "IoU.painting": 0.6594999694824218,
        "IoU.sofa": 0.5527999877929688,
        "IoU.shelf": 0.3970000076293945,
        "IoU.house": 0.3931999969482422,
        "IoU.sea": 0.49139999389648437,
        "IoU.mirror": 0.49270000457763674,
        "IoU.rug": 0.47810001373291017,
        "IoU.field": 0.29239999771118164,
        "IoU.armchair": 0.3206999969482422,
        "IoU.seat": 0.5522999954223633,
        "IoU.fence": 0.3502000045776367,
        "IoU.desk": 0.39990001678466797,
        "IoU.rock": 0.3997000122070313,
        "IoU.wardrobe": 0.4111000061035156,
        "IoU.lamp": 0.5163000106811524,
        "IoU.bathtub": 0.6516999816894531,
        "IoU.railing": 0.2928000068664551,
        "IoU.cushion": 0.4331000137329102,
        "IoU.base": 0.21870000839233397,
        "IoU.box": 0.12720000267028808,
        "IoU.column": 0.4234000015258789,
        "IoU.signboard": 0.2957999992370606,
        "IoU.chest of drawers": 0.40240001678466797,
        "IoU.counter": 0.2545000076293945,
        "IoU.sand": 0.25739999771118166,
        "IoU.sink": 0.6093999862670898,
        "IoU.skyscraper": 0.5408000183105469,
        "IoU.fireplace": 0.6066999816894532,
        "IoU.refrigerator": 0.5993000030517578,
        "IoU.grandstand": 0.41639999389648436,
        "IoU.path": 0.23340000152587892,
        "IoU.stairs": 0.26899999618530274,
        "IoU.runway": 0.6816999816894531,
        "IoU.case": 0.47069999694824216,
        "IoU.pool table": 0.8627999877929687,
        "IoU.pillow": 0.44599998474121094,
        "IoU.screen door": 0.5118000030517578,
        "IoU.stairway": 0.29100000381469726,
        "IoU.river": 0.0965999984741211,
        "IoU.bridge": 0.37810001373291013,
        "IoU.bookcase": 0.33549999237060546,
        "IoU.blind": 0.3365000152587891,
        "IoU.coffee table": 0.45799999237060546,
        "IoU.toilet": 0.7277999877929687,
        "IoU.flower": 0.30700000762939456,
        "IoU.book": 0.40040000915527346,
        "IoU.hill": 0.061100001335144045,
        "IoU.bench": 0.3518000030517578,
        "IoU.countertop": 0.4545000076293945,
        "IoU.stove": 0.5797999954223633,
        "IoU.palm": 0.44490001678466795,
        "IoU.kitchen island": 0.21959999084472656,
        "IoU.computer": 0.5008000183105469,
        "IoU.swivel chair": 0.3611000061035156,
        "IoU.boat": 0.6530999755859375,
        "IoU.bar": 0.23090000152587892,
        "IoU.arcade machine": 0.455099983215332,
        "IoU.hovel": 0.5133000183105468,
        "IoU.bus": 0.7019999694824218,
        "IoU.towel": 0.4581999969482422,
        "IoU.light": 0.33860000610351565,
        "IoU.truck": 0.22510000228881835,
        "IoU.tower": 0.4729000091552734,
        "IoU.chandelier": 0.5947000122070313,
        "IoU.awning": 0.1815999984741211,
        "IoU.streetlight": 0.15640000343322755,
        "IoU.booth": 0.24860000610351562,
        "IoU.television receiver": 0.5845000076293946,
        "IoU.airplane": 0.5004000091552734,
        "IoU.dirt track": 0.018700000047683716,
        "IoU.apparel": 0.28170000076293944,
        "IoU.pole": 0.09680000305175782,
        "IoU.land": 0.07260000228881835,
        "IoU.bannister": 0.07170000076293945,
        "IoU.escalator": 0.3429000091552734,
        "IoU.ottoman": 0.2836000061035156,
        "IoU.bottle": 0.13829999923706054,
        "IoU.buffet": 0.29739999771118164,
        "IoU.poster": 0.1443000030517578,
        "IoU.stage": 0.04199999809265137,
        "IoU.van": 0.31510000228881835,
        "IoU.ship": 0.20579999923706055,
        "IoU.fountain": 0.008999999761581421,
        "IoU.conveyer belt": 0.6286999893188476,
        "IoU.canopy": 0.118100004196167,
        "IoU.washer": 0.6141999816894531,
        "IoU.plaything": 0.1647999954223633,
        "IoU.swimming pool": 0.3170000076293945,
        "IoU.stool": 0.17389999389648436,
        "IoU.barrel": 0.07260000228881835,
        "IoU.basket": 0.20059999465942382,
        "IoU.waterfall": 0.590900001525879,
        "IoU.tent": 0.9519000244140625,
        "IoU.bag": 0.03279999971389771,
        "IoU.minibike": 0.48520000457763673,
        "IoU.cradle": 0.6916000366210937,
        "IoU.oven": 0.19579999923706054,
        "IoU.ball": 0.44919998168945313,
        "IoU.food": 0.5677999877929687,
        "IoU.step": 0.009700000286102295,
        "IoU.tank": 0.19309999465942382,
        "IoU.trade name": 0.173700008392334,
        "IoU.microwave": 0.33180000305175783,
        "IoU.pot": 0.2625,
        "IoU.animal": 0.58,
        "IoU.bicycle": 0.41529998779296873,
        "IoU.lake": 0.0625,
        "IoU.dishwasher": 0.42189998626708985,
        "IoU.screen": 0.555,
        "IoU.blanket": 0.016399999856948854,
        "IoU.sculpture": 0.313700008392334,
        "IoU.hood": 0.3725,
        "IoU.sconce": 0.146899995803833,
        "IoU.vase": 0.24030000686645508,
        "IoU.traffic light": 0.12489999771118164,
        "IoU.tray": 0.0056999999284744265,
        "IoU.ashcan": 0.2915999984741211,
        "IoU.fan": 0.4765999984741211,
        "IoU.pier": 0.4829999923706055,
        "IoU.crt screen": 0.03130000114440918,
        "IoU.plate": 0.3902000045776367,
        "IoU.monitor": 0.037100000381469725,
        "IoU.bulletin board": 0.3320000076293945,
        "IoU.shower": 0.0,
        "IoU.radiator": 0.42619998931884767,
        "IoU.glass": 0.05260000228881836,
        "IoU.clock": 0.1518000030517578,
        "IoU.flag": 0.2221999931335449,
        "Acc.wall": 0.8626000213623047,
        "Acc.building": 0.9127999877929688,
        "Acc.sky": 0.9704000091552735,
        "Acc.floor": 0.8887999725341796,
        "Acc.tree": 0.8784999847412109,
        "Acc.ceiling": 0.8973999786376953,
        "Acc.road": 0.8702999877929688,
        "Acc.bed ": 0.9338999938964844,
        "Acc.windowpane": 0.7416000366210938,
        "Acc.grass": 0.8504000091552735,
        "Acc.cabinet": 0.6558000183105469,
        "Acc.sidewalk": 0.7759999847412109,
        "Acc.person": 0.9019000244140625,
        "Acc.earth": 0.4659000015258789,
        "Acc.door": 0.4518999862670898,
        "Acc.table": 0.6612000274658203,
        "Acc.mountain": 0.7533000183105468,
        "Acc.plant": 0.5856999969482422,
        "Acc.curtain": 0.831500015258789,
        "Acc.chair": 0.6184000015258789,
        "Acc.car": 0.893499984741211,
        "Acc.water": 0.6462999725341797,
        "Acc.painting": 0.8241999816894531,
        "Acc.sofa": 0.7515000152587891,
        "Acc.shelf": 0.5681000137329102,
        "Acc.house": 0.5572999954223633,
        "Acc.sea": 0.7655000305175781,
        "Acc.mirror": 0.586500015258789,
        "Acc.rug": 0.595,
        "Acc.field": 0.4231999969482422,
        "Acc.armchair": 0.45990001678466796,
        "Acc.seat": 0.7380999755859375,
        "Acc.fence": 0.48150001525878905,
        "Acc.desk": 0.5991999816894531,
        "Acc.rock": 0.6040000152587891,
        "Acc.wardrobe": 0.645,
        "Acc.lamp": 0.6594000244140625,
        "Acc.bathtub": 0.7575,
        "Acc.railing": 0.40490001678466797,
        "Acc.cushion": 0.5472999954223633,
        "Acc.base": 0.35639999389648436,
        "Acc.box": 0.18600000381469728,
        "Acc.column": 0.5604000091552734,
        "Acc.signboard": 0.40630001068115235,
        "Acc.chest of drawers": 0.5597000122070312,
        "Acc.counter": 0.33049999237060546,
        "Acc.sand": 0.4638999938964844,
        "Acc.sink": 0.7030999755859375,
        "Acc.skyscraper": 0.6627999877929688,
        "Acc.fireplace": 0.7643000030517578,
        "Acc.refrigerator": 0.7184999847412109,
        "Acc.grandstand": 0.6483999633789063,
        "Acc.path": 0.36400001525878906,
        "Acc.stairs": 0.34810001373291016,
        "Acc.runway": 0.9026999664306641,
        "Acc.case": 0.652300033569336,
        "Acc.pool table": 0.9526000213623047,
        "Acc.pillow": 0.5706999969482421,
        "Acc.screen door": 0.763499984741211,
        "Acc.stairway": 0.39169998168945314,
        "Acc.river": 0.15069999694824218,
        "Acc.bridge": 0.4402000045776367,
        "Acc.bookcase": 0.49,
        "Acc.blind": 0.37970001220703126,
        "Acc.coffee table": 0.7284999847412109,
        "Acc.toilet": 0.8744999694824219,
        "Acc.flower": 0.41220001220703123,
        "Acc.book": 0.6084999847412109,
        "Acc.hill": 0.08260000228881836,
        "Acc.bench": 0.4829999923706055,
        "Acc.countertop": 0.6377000045776368,
        "Acc.stove": 0.7225,
        "Acc.palm": 0.5758000183105468,
        "Acc.kitchen island": 0.4759999847412109,
        "Acc.computer": 0.634900016784668,
        "Acc.swivel chair": 0.49990001678466794,
        "Acc.boat": 0.7905000305175781,
        "Acc.bar": 0.2902000045776367,
        "Acc.arcade machine": 0.46830001831054685,
        "Acc.hovel": 0.6544000244140625,
        "Acc.bus": 0.8451999664306641,
        "Acc.towel": 0.5629999923706055,
        "Acc.light": 0.37869998931884763,
        "Acc.truck": 0.30020000457763674,
        "Acc.tower": 0.6433000183105468,
        "Acc.chandelier": 0.7401000213623047,
        "Acc.awning": 0.22799999237060548,
        "Acc.streetlight": 0.19899999618530273,
        "Acc.booth": 0.38799999237060545,
        "Acc.television receiver": 0.7548000335693359,
        "Acc.airplane": 0.6031999969482422,
        "Acc.dirt track": 0.045900001525878906,
        "Acc.apparel": 0.40330001831054685,
        "Acc.pole": 0.12,
        "Acc.land": 0.1315999984741211,
        "Acc.bannister": 0.10680000305175781,
        "Acc.escalator": 0.5161999893188477,
        "Acc.ottoman": 0.41900001525878905,
        "Acc.bottle": 0.15989999771118163,
        "Acc.buffet": 0.3497999954223633,
        "Acc.poster": 0.17440000534057618,
        "Acc.stage": 0.08460000038146973,
        "Acc.van": 0.39439998626708983,
        "Acc.ship": 0.21520000457763672,
        "Acc.fountain": 0.008999999761581421,
        "Acc.conveyer belt": 0.8201000213623046,
        "Acc.canopy": 0.15289999961853026,
        "Acc.washer": 0.6916000366210937,
        "Acc.plaything": 0.28319999694824216,
        "Acc.swimming pool": 0.3927000045776367,
        "Acc.stool": 0.21760000228881837,
        "Acc.barrel": 0.6620999908447266,
        "Acc.basket": 0.3035000038146973,
        "Acc.waterfall": 0.7,
        "Acc.tent": 0.9720999908447265,
        "Acc.bag": 0.03650000095367432,
        "Acc.minibike": 0.5818000030517578,
        "Acc.cradle": 0.897300033569336,
        "Acc.oven": 0.3895000076293945,
        "Acc.ball": 0.6133000183105469,
        "Acc.food": 0.7027999877929687,
        "Acc.step": 0.01059999942779541,
        "Acc.tank": 0.19739999771118164,
        "Acc.trade name": 0.19610000610351563,
        "Acc.microwave": 0.3802000045776367,
        "Acc.pot": 0.31190000534057616,
        "Acc.animal": 0.6163000106811524,
        "Acc.bicycle": 0.6181999969482422,
        "Acc.lake": 0.07440000057220458,
        "Acc.dishwasher": 0.5559999847412109,
        "Acc.screen": 0.7576000213623046,
        "Acc.blanket": 0.017200000286102295,
        "Acc.sculpture": 0.4109000015258789,
        "Acc.hood": 0.4022999954223633,
        "Acc.sconce": 0.16760000228881836,
        "Acc.vase": 0.3618000030517578,
        "Acc.traffic light": 0.22139999389648438,
        "Acc.tray": 0.00699999988079071,
        "Acc.ashcan": 0.37650001525878907,
        "Acc.fan": 0.5984999847412109,
        "Acc.pier": 0.605999984741211,
        "Acc.crt screen": 0.09609999656677246,
        "Acc.plate": 0.4872999954223633,
        "Acc.monitor": 0.04130000114440918,
        "Acc.bulletin board": 0.417400016784668,
        "Acc.shower": 0.0,
        "Acc.radiator": 0.4936999893188477,
        "Acc.glass": 0.05570000171661377,
        "Acc.clock": 0.17290000915527343,
        "Acc.flag": 0.23879999160766602
    }
}
@CLAassistant

CLAassistant commented Oct 31, 2022

CLA assistant check
All committers have signed the CLA.

@MengzhangLI
Contributor

MengzhangLI commented Oct 31, 2022

Guten Tag, @FabianSchuetze. Thanks for your nice PR.

Please sign the CLA first; we will review it ASAP.

Best,

P.S. We absolutely welcome your excellent contribution; in fact, in our development plan, SegNeXt is a priority model for the next few months.

@codecov

codecov bot commented Oct 31, 2022

Codecov Report

Base: 88.97% // Head: 86.87% // Decreases project coverage by -2.10% ⚠️

Coverage data is based on head (b8c6aaa) compared to base (b42c487).
Patch coverage: 22.53% of modified lines in pull request are covered.

❗ Current head b8c6aaa differs from pull request most recent head 319dbda. Consider uploading reports for the commit 319dbda to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2247      +/-   ##
==========================================
- Coverage   88.97%   86.87%   -2.10%     
==========================================
  Files         145      148       +3     
  Lines        8735     9030     +295     
  Branches     1473     1506      +33     
==========================================
+ Hits         7772     7845      +73     
- Misses        720      942     +222     
  Partials      243      243              
| Flag | Coverage Δ |
| --- | --- |
| unittests | 86.87% <22.53%> (-2.10%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ |
| --- | --- |
| mmseg/models/backbones/mscan.py | 21.95% <21.95%> (ø) |
| mmseg/models/decode_heads/ham_head.py | 22.03% <22.03%> (ø) |
| mmseg/models/backbones/__init__.py | 100.00% <100.00%> (ø) |
| mmseg/models/decode_heads/__init__.py | 100.00% <100.00%> (ø) |
| mmseg/datasets/__init__.py | 100.00% <0.00%> (ø) |
| mmseg/datasets/face.py | 80.00% <0.00%> (ø) |


☔ View full report at Codecov.

@FabianSchuetze
Author

Hi @MengzhangLI - thank you so much for your kind reply! Happy to hear mmsegmentation wants to include SegNext.

I signed the CLA and also added appropriate attribution to SegNext in the two model files - hope that's OK.

Looking forward to the review and the discussion about the empirical results and the model.

@MengzhangLI
Contributor

Guten Abend, Fabian:

Q1: "I used their ImageNet pretrained from the Tsinghua cloud. I cannot do a pretraining from scratch. Is it OK to use the official weights?"

A1: Do you mean you want to train from scratch, or did you meet some problems when using the pretrained models from the Tsinghua cloud? If you want to train from scratch, you just need to set init_cfg=None.

Q2: "I used one GPU, (not 8 as in the paper), but used the same settings. Should the learning rate be scaled then?"

A2: Theoretically, there is no difference if the total batch sizes are the same. If only using one GPU, the samples per GPU should be 8 times larger than in the original settings.
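To make A2 concrete, a minimal sketch of the arithmetic (the function name is hypothetical, not an MMSeg API):

```python
# Sketch (not MMSeg internals): keep the total batch size constant
# when changing the number of GPUs, per the advice above.
def scale_samples_per_gpu(total_batch_size, num_gpus):
    """Return samples_per_gpu so that num_gpus * samples_per_gpu
    equals the original total batch size."""
    assert total_batch_size % num_gpus == 0
    return total_batch_size // num_gpus

# e.g. an 8-GPU recipe with 2 samples each (total 16) moved to 1 GPU:
print(scale_samples_per_gpu(16, 8))  # 2
print(scale_samples_per_gpu(16, 1))  # 16
```

With the total batch size held fixed this way, the learning rate does not need to be rescaled.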

Q3: "In contrast to the configs of the original repo, I did not use a RepeatDataset of size 50, but the conventional ADE dataset config because of a copy-and-paste glitch."

A3: "RepeatDataset" is used in some models that use MMSegmentation as their framework, such as SegFormer and PoolFormer. The difference with and without it should be very small. Please check this issue.
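For reference, a `RepeatDataset` wrapper in an MMSeg 0.x config looks roughly like this (a sketch; `times=50` mirrors the original repo's setting, and the paths are placeholders):

```python
# Sketch of wrapping ADE20K in a RepeatDataset, as SegFormer-style
# configs do. `times` and the paths here are illustrative.
train = dict(
    type='RepeatDataset',
    times=50,  # iterate the underlying dataset 50 times per pass
    dataset=dict(
        type='ADE20KDataset',
        data_root='data/ADEChallengeData2016',
        img_dir='images/training',
        ann_dir='annotations/training'))
print(train['type'], train['times'])
```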

@MengzhangLI MengzhangLI changed the title Contribute Segnext [Feature] Support SegNeXt (NeurIPS'2022) in MMSeg 0.x. Nov 2, 2022
@MengzhangLI
Contributor

Hi, Fabian, could you grant me access to your fork of MMSegmentation, following the guide here? Then I could push my modifications to your branch.

@FabianSchuetze
Author

Hi Li! Sure, how wonderful to see your additions. You should have received an invite.

Thanks also for the answers to the questions. I have also asked MenghaoGuo for clarification because I cannot reconcile the configs in their repo with the description in the paper. I will try training the model again very soon.

P.S.: I hope I selected your first name correctly :-). Sometimes I get confused by the ordering of Chinese names.

Comment on lines 84 to 95
self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
self.conv0_1 = nn.Conv2d(dim, dim, (1, 7), padding=(0, 3), groups=dim)
self.conv0_2 = nn.Conv2d(dim, dim, (7, 1), padding=(3, 0), groups=dim)

self.conv1_1 = nn.Conv2d(dim, dim, (1, 11), padding=(0, 5), groups=dim)
self.conv1_2 = nn.Conv2d(dim, dim, (11, 1), padding=(5, 0), groups=dim)

self.conv2_1 = nn.Conv2d(
    dim, dim, (1, 21), padding=(0, 10), groups=dim)
self.conv2_2 = nn.Conv2d(
    dim, dim, (21, 1), padding=(10, 0), groups=dim)
self.conv3 = nn.Conv2d(dim, dim, 1)
Contributor

@MengzhangLI MengzhangLI Nov 2, 2022


We had better extract these hard-coded convolution kernel sizes (i.e., 1x7, 1x11, 1x21) of MSCA into the config file.
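A framework-free sketch of lifting those kernel sizes into configuration (the helper name and defaults are hypothetical): for an odd strip kernel of size k, the 'same' padding is simply k // 2.

```python
# Hypothetical helper: derive the (1, k)/(k, 1) kernel and padding
# pairs used by the hard-coded nn.Conv2d calls above from a
# configurable tuple of kernel sizes.
def strip_conv_specs(kernel_sizes=(7, 11, 21)):
    specs = []
    for k in kernel_sizes:
        pad = k // 2  # 'same' padding for an odd kernel size
        specs.append(((1, k), (0, pad)))  # horizontal strip conv
        specs.append(((k, 1), (pad, 0)))  # vertical strip conv
    return specs

# Reproduces the hard-coded values, e.g. (1, 7) with padding (0, 3):
print(strip_conv_specs()[0])  # ((1, 7), (0, 3))
```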

@MengzhangLI
Contributor

Also, MMSeg-style docstrings should be added.

Comment on lines 32 to 40
print('spatial', self.spatial)
print('S', self.S)
print('D', self.D)
print('R', self.R)
print('train_steps', self.train_steps)
print('eval_steps', self.eval_steps)
print('inv_t', self.inv_t)
print('eta', self.eta)
print('rand_init', self.rand_init)
Contributor


We may delete these lines and add these args as parameters in config files.
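Following that suggestion, the printed attributes could be exposed through `ham_kwargs` in the config. A sketch: only MD_R=16 appears in the config earlier in this PR; the other keys come from the print statements above, and their values here are placeholders.

```python
# Sketch: surface the debug-printed NMF attributes as config
# parameters instead of prints. Values other than MD_R are placeholders.
decode_head = dict(
    type='LightHamHead',
    ham_kwargs=dict(
        MD_R=16,        # rank R of the matrix decomposition
        train_steps=6,  # inner optimization steps at train time
        eval_steps=7,   # inner optimization steps at eval time
        inv_t=1,
        rand_init=True))
print(sorted(decode_head['ham_kwargs']))
```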

Comment on lines 32 to 54
# optimizer
optimizer = dict(
    _delete_=True,
    type='AdamW',
    lr=0.00006,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys={
            'pos_block': dict(decay_mult=0.),
            'norm': dict(decay_mult=0.),
            'head': dict(lr_mult=10.)
        }))

lr_config = dict(
    _delete_=True,
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-6,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
Contributor


These lines are redundant; they repeat the settings below.

@FabianSchuetze
Author

Thanks for all the comments so far, @MengzhangLI! I will incorporate them tomorrow.

@FabianSchuetze
Author

Thanks for the three comments, @MengzhangLI. My last commit contains the following changes:

  • Renamed the layers in AttentionModule to be more consistent with the naming in the paper.
  • Removed the duplicated config files and the _delete_ values in the config files.
  • Renamed the model config from mscan to segnext to reflect that the config includes both backbone and head
  • Lifted the hardcoded values from the implementation to the config files. Not sure if the naming is OK.

Can you explain to me what you mean by saying "MMSeg-style docstrings would be added"?

@MengzhangLI
Contributor

Sorry for my misleading words. The so-called 'mmseg-style' docstring is our default docstring in files: for example, a forward function has args and returns explanations, and a class has an explanation for each arg. However, adding docstrings is not our priority at the moment. If you don't mind, I can add docstrings next week.

Best,

@FabianSchuetze
Author

The first results came in: At commit c56d243, the results for the tiny model on ADE20k were 41.62 mIoU (SS), slightly above the ref mIoU of 41.1.

I used one GPU and a batch size of 16 (as the authors did), instead of the 4x2 default. I do not have access to 4 GPUs. Should I switch to training with a batch size of 8 instead, to conform more closely to the default?

@MengzhangLI
Contributor

Thanks for your feedback. I think your setting is correct, because the batch size for ADE20K is 16 rather than 8. The total batch size should match the original paper.

return x, H, W


class AttentionModule(BaseModule):
Contributor


Why replace the name convx with scalex in class AttentionModule(BaseModule)?

Author


Thanks for the comments. Maybe you are right and I should revert to the original name. I changed the name to be more in line with the description in the paper (equation (1)), but I guess it's preferable to revert it to convx?

Author


I came to the conclusion that renaming wasn't such a great idea, as it precludes loading the pretrained weights from the original source. I will revert the changes.

@MengzhangLI
Contributor

MengzhangLI commented Nov 7, 2022

Could you attach your training log like 2022110xxxxx.log to let me have a look? I am trying to re-implement results. Thanks in advance.

2022-11-09 Update

Current results are below; some gaps still exist.

| Model | Original Repo Results | Original Repo Reimplementing | MMSeg PR |
| --- | --- | --- | --- |
| MSCAN-T | 41.1 | 39.44 | TBD |
| MSCAN-S | 44.3 | 42.34 | 43.13 |
| MSCAN-B | 48.5 | 46.74 | 47.63 |
| MSCAN-L | 51.0 | 49.14 | 49.04 |

Also, I noticed that if we change BN to SyncBN here, the model performance becomes very bad.


I will figure it out in the next few days.

Best,

@FabianSchuetze
Author

Thanks for the very insightful comment, @MengzhangLI!

Here is the final part of the log for the training run: 20221104_183356.log. I needed to train in a few steps because my access to the server is cut off after six hours. The first part is here: 20221104_183356.log. The single-scale eval results come to 0.41619999999999996.

Gee, the results about SyncBatchNorm are very interesting. However, the authors said they used 2 or 4 GPUs to train the models, so they must have used SyncBN?

@MeowZheng MeowZheng added the High Priority from Community This issue/pr needs more attention and higher priority than default developing plan label Jan 3, 2023
xiexinch and others added 19 commits January 11, 2023 17:39
…2480)

## Motivation

Based on the ImageNet dataset, we propose the ImageNet-S dataset, which has 1.2 million training images and 50k high-quality semantic segmentation annotations, to support unsupervised/semi-supervised semantic segmentation on the ImageNet dataset.

paper:
Large-scale Unsupervised Semantic Segmentation (TPAMI 2022)
[Paper link](https://arxiv.org/abs/2106.03149)

## Modification

1. Support the ImageNet-S dataset and its configuration
2. Add the dataset preparation in the documentation
…pen-mmlab#2500)

## Motivation

I want to fix a bug through this PR. The bug occurs when two options --
`reduce_zero_label=True`, and custom classes are used.
`reduce_zero_label` remaps the GT seg labels by remapping the zero-class
to 255 which is ignored. Conceptually, this should occur *before* the
`label_map` is applied, which maps *already reduced labels*. However,
currently, the `label_map` is applied before the zero label is reduced.

## Modification

The modification is simple:
- I've just interchanged the order of the two operations by moving 4
lines from bottom to top.
- I've added a test that passes when the fix is introduced, and fails on
the original `master` branch.
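The effect of the ordering can be shown with a toy example (the class names and `label_map` below are hypothetical; the reduce logic mirrors what `LoadAnnotations` does with `reduce_zero_label=True`):

```python
import numpy as np

# Toy GT with raw classes {0: background, 1: cat, 2: dog}. With
# reduce_zero_label=True, class 0 becomes the ignore index (255) and the
# remaining labels shift down by one. The (hypothetical) label_map below
# is defined on those *reduced* labels.
gt = np.array([0, 1, 2], dtype=np.uint8)
label_map = {0: 255, 1: 0}  # keep only 'dog' (reduced label 1 -> 0)

def reduce_zero_label(seg):
    seg = seg.astype(np.int64)
    seg[seg == 0] = 255   # zero class -> ignore index
    seg = seg - 1         # shift the remaining labels down
    seg[seg == 254] = 255
    return seg

def apply_label_map(seg, mapping):
    out = seg.copy()
    for old, new in mapping.items():
        out[seg == old] = new
    return out

# Correct order: reduce first, then remap.
correct = apply_label_map(reduce_zero_label(gt), label_map)  # [255, 255, 0]

# Buggy order: remap first, then reduce -- 'dog' ends up as 1, not 0.
buggy = reduce_zero_label(apply_label_map(gt.astype(np.int64), label_map))
```

With the buggy order the kept class lands on the wrong index, which is exactly the mismatch this PR fixes.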

## BC-breaking (Optional)

I do not anticipate this change breaking any backward compatibility.

## Checklist

- [x] Pre-commit or other linting tools are used to fix the potential
lint issues.
  - _I've fixed all linting/pre-commit errors._
- [x] The modification is covered by complete unit tests. If not, please
add more unit test to ensure the correctness.
  - _I've added a unit test._ 
- [x] If the modification has potential influence on downstream
projects, this PR should be tested with downstream projects, like MMDet
or MMDet3D.
  - _I don't think this change affects MMDet or MMDet3D._
- [x] The documentation has been modified accordingly, like docstring or
example tutorials.
- _This change fixes an existing bug and doesn't require modifying any
documentation/docstring._
## Motivation

This fixes open-mmlab#2493. When the `label_map` is created, the index for ignored
classes was being set to -1, whereas the index that is actually ignored
is 255. This worked indirectly since -1 was underflowed to 255 when
converting to uint8.

The same fix was made in the 1.x by open-mmlab#2332 but this fix was never made to
`master`.

## Modification

The only small modification is setting the index of ignored classes to
255 instead of -1.
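The underflow that made -1 appear to work can be seen directly (a minimal numpy sketch, not the mmseg source):

```python
import numpy as np

# Storing -1 for ignored classes only "worked" because casting to uint8
# wraps modulo 256, silently turning -1 into 255 -- the actual ignore index.
ignored = np.array([-1], dtype=np.int64)
wrapped = ignored.astype(np.uint8)
print(int(wrapped[0]))  # 255
```

Setting the value to 255 explicitly removes the dependence on this implicit cast.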

## Checklist

- [x] Pre-commit or other linting tools are used to fix the potential
lint issues.
  - _I've fixed all linting/pre-commit errors._
- [x] The modification is covered by complete unit tests. If not, please
add more unit test to ensure the correctness.
  - _No unit tests need to be added. Unit tests that are affected were modified._
- [x] If the modification has potential influence on downstream
projects, this PR should be tested with downstream projects, like MMDet
or MMDet3D.
  - _I don't think this change affects MMDet or MMDet3D._
- [x] The documentation has been modified accordingly, like docstring or
example tutorials.
- _This change fixes an existing bug and doesn't require modifying any
documentation/docstring._
…pen-mmlab#2520)

## Motivation

open-mmlab/mmeval#85

---------

Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
## Motivation

Through this PR, I (1) fix a bug, and (2) perform some associated cleanup, and (3) add a unit test. The bug occurs during evaluation when two options -- `reduce_zero_label=True`, and custom classes are used. The bug was that the `reduce_zero_label` is not properly propagated (see details below).

## Modification

1. **Bugfix**

The bug occurs [in the initialization of `CustomDataset`](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/datasets/custom.py#L108-L110) where the `reduce_zero_label` flag is not propagated to its member `self.gt_seg_map_loader_cfg`:

```python
self.gt_seg_map_loader = LoadAnnotations(
) if gt_seg_map_loader_cfg is None else LoadAnnotations(
    **gt_seg_map_loader_cfg)
```

Because the `reduce_zero_label` flag was not being propagated, the zero label reduction was being [unnecessarily and explicitly duplicated during the evaluation](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/core/evaluation/metrics.py#L66-L69).

As pointed in a previous PR (open-mmlab#2500), `reduce_zero_label` must occur before applying the `label_map`. Due to this bug, the order gets reversed when both features are used simultaneously.

This has been fixed to:

```python
self.gt_seg_map_loader = LoadAnnotations(
    reduce_zero_label=reduce_zero_label, **gt_seg_map_loader_cfg)
```

2. **Cleanup**

Due to the bug fix, since both `reduce_zero_label` and `label_map` are applied in `get_gt_seg_map_by_idx()` (i.e. `LoadAnnotations.__call__()`), the evaluation does not need to perform them anymore. However, for backwards compatibility, the evaluation keeps its previous input arguments.

A previous issue (open-mmlab#1415) pointed out that the `label_map` should not be applied in the evaluation. This was handled by [passing an empty dict](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/datasets/custom.py#L306-L311):

```python
# as the labels has been converted when dataset initialized
# in `get_palette_for_custom_classes ` this `label_map`
# should be `dict()`, see
# open-mmlab#1415
# for more ditails
label_map=dict(),
reduce_zero_label=self.reduce_zero_label))
```

Similar to this, I now also set `reduce_zero_label=False` since it is now also handled by `get_gt_seg_map_by_idx()` (i.e. `LoadAnnotations.__call__()`).

3. **Unit test**

I've added a unit test that tests the `CustomDataset.pre_eval()` function when `reduce_zero_label=True` and custom classes are used. The test fails on the original `master` branch but passes with this fix.

## BC-breaking (Optional)

I do not anticipate this change breaking any backward compatibility.

## Checklist

- [x] Pre-commit or other linting tools are used to fix the potential lint issues.
  - _I've fixed all linting/pre-commit errors._
- [x] The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  - _I've added a test that passes when the fix is introduced, and fails on the original master branch._
- [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  - _I don't think this change affects MMDet or MMDet3D._
- [x] The documentation has been modified accordingly, like docstring or example tutorials.
  - _This change fixes an existing bug and doesn't require modifying any documentation/docstring._
@FabianSchuetze
Author

It's so nice to see that you made some progress, @MengzhangLI! Do you want me to test anything?

@MengzhangLI
Contributor

Hi @FabianSchuetze, sorry for the late reply.
I created a new PR (#2600) because this PR has too many commits and changed files, but we would prioritize merging this PR if we could solve this question.

Let me summarize what we discussed several months ago:

The problem is caused by PyTorch 1.9 + SyncBN in mscan.py; with other PyTorch versions SyncBN is OK. So we use BN on a single GPU as the default config setting. For everyone else, it is OK to use SyncBN with 2 samples per GPU on 8 GPUs when the PyTorch version is not 1.9.
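For reference, the switch amounts to changing the `norm_cfg` dict in the config (a sketch following common mmseg config conventions; the exact keys of the merged SegNeXt config may differ):

```python
# Single-GPU default: plain BN (works on all PyTorch versions).
norm_cfg = dict(type='BN', requires_grad=True)

# Multi-GPU (8 GPUs x 2 samples each), PyTorch != 1.9: SyncBN is fine.
# norm_cfg = dict(type='SyncBN', requires_grad=True)

model = dict(
    backbone=dict(type='MSCAN', norm_cfg=norm_cfg),
    decode_head=dict(type='LightHamHead', norm_cfg=norm_cfg),
)
```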

Best,

MeowZheng added a commit that referenced this pull request Feb 24, 2023
## Motivation

Support SegNeXt.

Due to the many commits and changed files accumulated while the PR was WIP for so long (perhaps this could be resolved by `git merge` or `git rebase`), this PR is created only as a backup of the old PR
#2247

Co-authored-by: MeowZheng <meowzheng@outlook.com>
Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
@MengzhangLI MengzhangLI removed the WIP Work in process label Feb 24, 2023
@FabianSchuetze
Author

Wonderful!

@OpenMMLab-Assistant001

Hi @FabianSchuetze! We are grateful for your efforts in helping improve this open-source project during your personal time.

Welcome to join the OpenMMLab Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel and we will assist you. We look forward to seeing you there! Join us: https://discord.gg/UjgXkPWNqA
If you have a WeChat account, welcome to join our community on WeChat. You can add our assistant: openmmlabwx. Please add "mmsig + GitHub ID" as a remark when adding friends :)
Thank you again for your contribution ❤

Labels
Algorithm Improvement or addition of new algorithm model High Priority from Community This issue/pr needs more attention and higher priority than default developing plan