
[Feature] Support SegNeXt (NeurIPS'2022) in MMSeg 0.x. #2247

Closed
wants to merge 28 commits into from

Conversation

FabianSchuetze

@FabianSchuetze FabianSchuetze commented Oct 31, 2022

Update 31/01/2023

From now on, we will pick up and support SegNeXt in the master branch.
Here is our TO-DO list:

  • Update README and SegNeXt configs
  • Refactor MSCAN and Ham Head.
  • Upload ckpts and report their memory usage and FPS.
  • Add unit tests

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Hello,

thanks for the fantastic repo, it's a pleasure to work with mmsegmentation.

I would like to help contribute SegNext to mmsegmentation. I discussed this with @MenghaoGuo and he appreciates such a contribution. Does mmsegmentation also appreciate the contribution?

I trained the tiny model on ADE20K, which results in a val (test) mIoU of 39.27 (39.17), compared to the test mIoU of 41.1 reported in the paper. A few noteworthy aspects are:

  • I used their ImageNet pretrained weights from the Tsinghua cloud. I cannot do pretraining from scratch. Is it OK to use the official weights?
  • I used one GPU (not 8 as in the paper) but kept the same settings. Should the learning rate be scaled then?
  • In contrast to the configs of the original repo, I did not use a RepeatDataset of size 50 but the conventional ADE dataset config, because of a copy-and-paste glitch.

You can see the config logs at the end of the PR.

What would be the next steps for a contribution? I guess I should retrain the tiny network after we agree on the correct settings for a one-GPU environment, and then run the experiment again. Do you have anything to add, @MenghaoGuo?

Best wishes,
Fabian

Modification

Contribute SegNext

BC-breaking (Optional)

n/a

Use cases (Optional)

n/a

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

The pre-commit hooks passed; I guess the other items are not applicable.

Config

norm_cfg = dict(type='BN', requires_grad=True)
ham_norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='MSCAN',
        embed_dims=[32, 64, 160, 256],
        mlp_ratios=[8, 8, 4, 4],
        drop_rate=0.0,
        drop_path_rate=0.1,
        depths=[3, 3, 5, 2],
        norm_cfg=dict(type='BN', requires_grad=True),
        init_cfg=dict(type='Pretrained', checkpoint='/notebooks/mscan_t.pth')),
    decode_head=dict(
        type='LightHamHead',
        in_channels=[64, 160, 256],
        in_index=[1, 2, 3],
        channels=256,
        ham_channels=256,
        dropout_ratio=0.1,
        num_classes=150,
        norm_cfg=dict(type='GN', num_groups=32, requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        ham_kwargs=dict(MD_R=16)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
dataset_type = 'ADE20KDataset'
data_root = '/notebooks/ADEChallengeData2016'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='ResizeToMultiple', size_divisor=32),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=4,
    train=dict(
        type='ADE20KDataset',
        data_root='/notebooks/ADEChallengeData2016',
        img_dir='images/training',
        ann_dir='annotations/training',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', reduce_zero_label=True),
            dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='ADE20KDataset',
        data_root='/notebooks/ADEChallengeData2016',
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='ResizeToMultiple', size_divisor=32),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='ADE20KDataset',
        data_root='/notebooks/ADEChallengeData2016',
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='ResizeToMultiple', size_divisor=32),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = '/notebooks/mmsegmentation/work_dirs/segnext.tiny.512x512.ade.160k/latest.pth'
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(
    type='AdamW',
    lr=6e-05,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys=dict(
            pos_block=dict(decay_mult=0.0),
            norm=dict(decay_mult=0.0),
            head=dict(lr_mult=10.0))))
optimizer_config = dict()
lr_config = dict(
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-06,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=160000)
checkpoint_config = dict(by_epoch=False, interval=8000)
evaluation = dict(interval=8000, metric='mIoU')
find_unused_parameters = True
work_dir = './work_dirs/segnext.tiny.512x512.ade.160k'
gpu_ids = [0]
auto_resume = False

Logs:

root@nso2lj6bzv:/notebooks/mmsegmentation# cat work_dirs/result/eval_single_scale_20221031_102503.json 
{
    "config": "/notebooks/mmsegmentation/configs/segnext/segnext.tiny.512x512.ade.160k.py",
    "metric": {
        "aAcc": 0.7948999999999999,
        "mIoU": 0.39189999999999997,
        "mAcc": 0.5054,
        "IoU.wall": 0.73,
        "IoU.building": 0.8009999847412109,
        "IoU.sky": 0.9380999755859375,
        "IoU.floor": 0.7687000274658203,
        "IoU.tree": 0.7116999816894531,
        "IoU.ceiling": 0.7991999816894532,
        "IoU.road": 0.7959999847412109,
        "IoU.bed ": 0.8283999633789062,
        "IoU.windowpane": 0.5604000091552734,
        "IoU.grass": 0.6725,
        "IoU.cabinet": 0.5241999816894531,
        "IoU.sidewalk": 0.5954000091552735,
        "IoU.person": 0.7636000061035156,
        "IoU.earth": 0.3465999984741211,
        "IoU.door": 0.33880001068115234,
        "IoU.table": 0.4881000137329102,
        "IoU.mountain": 0.5815999984741211,
        "IoU.plant": 0.4777000045776367,
        "IoU.curtain": 0.6844999694824219,
        "IoU.chair": 0.4679000091552734,
        "IoU.car": 0.8025,
        "IoU.water": 0.48400001525878905,
        "IoU.painting": 0.6594999694824218,
        "IoU.sofa": 0.5527999877929688,
        "IoU.shelf": 0.3970000076293945,
        "IoU.house": 0.3931999969482422,
        "IoU.sea": 0.49139999389648437,
        "IoU.mirror": 0.49270000457763674,
        "IoU.rug": 0.47810001373291017,
        "IoU.field": 0.29239999771118164,
        "IoU.armchair": 0.3206999969482422,
        "IoU.seat": 0.5522999954223633,
        "IoU.fence": 0.3502000045776367,
        "IoU.desk": 0.39990001678466797,
        "IoU.rock": 0.3997000122070313,
        "IoU.wardrobe": 0.4111000061035156,
        "IoU.lamp": 0.5163000106811524,
        "IoU.bathtub": 0.6516999816894531,
        "IoU.railing": 0.2928000068664551,
        "IoU.cushion": 0.4331000137329102,
        "IoU.base": 0.21870000839233397,
        "IoU.box": 0.12720000267028808,
        "IoU.column": 0.4234000015258789,
        "IoU.signboard": 0.2957999992370606,
        "IoU.chest of drawers": 0.40240001678466797,
        "IoU.counter": 0.2545000076293945,
        "IoU.sand": 0.25739999771118166,
        "IoU.sink": 0.6093999862670898,
        "IoU.skyscraper": 0.5408000183105469,
        "IoU.fireplace": 0.6066999816894532,
        "IoU.refrigerator": 0.5993000030517578,
        "IoU.grandstand": 0.41639999389648436,
        "IoU.path": 0.23340000152587892,
        "IoU.stairs": 0.26899999618530274,
        "IoU.runway": 0.6816999816894531,
        "IoU.case": 0.47069999694824216,
        "IoU.pool table": 0.8627999877929687,
        "IoU.pillow": 0.44599998474121094,
        "IoU.screen door": 0.5118000030517578,
        "IoU.stairway": 0.29100000381469726,
        "IoU.river": 0.0965999984741211,
        "IoU.bridge": 0.37810001373291013,
        "IoU.bookcase": 0.33549999237060546,
        "IoU.blind": 0.3365000152587891,
        "IoU.coffee table": 0.45799999237060546,
        "IoU.toilet": 0.7277999877929687,
        "IoU.flower": 0.30700000762939456,
        "IoU.book": 0.40040000915527346,
        "IoU.hill": 0.061100001335144045,
        "IoU.bench": 0.3518000030517578,
        "IoU.countertop": 0.4545000076293945,
        "IoU.stove": 0.5797999954223633,
        "IoU.palm": 0.44490001678466795,
        "IoU.kitchen island": 0.21959999084472656,
        "IoU.computer": 0.5008000183105469,
        "IoU.swivel chair": 0.3611000061035156,
        "IoU.boat": 0.6530999755859375,
        "IoU.bar": 0.23090000152587892,
        "IoU.arcade machine": 0.455099983215332,
        "IoU.hovel": 0.5133000183105468,
        "IoU.bus": 0.7019999694824218,
        "IoU.towel": 0.4581999969482422,
        "IoU.light": 0.33860000610351565,
        "IoU.truck": 0.22510000228881835,
        "IoU.tower": 0.4729000091552734,
        "IoU.chandelier": 0.5947000122070313,
        "IoU.awning": 0.1815999984741211,
        "IoU.streetlight": 0.15640000343322755,
        "IoU.booth": 0.24860000610351562,
        "IoU.television receiver": 0.5845000076293946,
        "IoU.airplane": 0.5004000091552734,
        "IoU.dirt track": 0.018700000047683716,
        "IoU.apparel": 0.28170000076293944,
        "IoU.pole": 0.09680000305175782,
        "IoU.land": 0.07260000228881835,
        "IoU.bannister": 0.07170000076293945,
        "IoU.escalator": 0.3429000091552734,
        "IoU.ottoman": 0.2836000061035156,
        "IoU.bottle": 0.13829999923706054,
        "IoU.buffet": 0.29739999771118164,
        "IoU.poster": 0.1443000030517578,
        "IoU.stage": 0.04199999809265137,
        "IoU.van": 0.31510000228881835,
        "IoU.ship": 0.20579999923706055,
        "IoU.fountain": 0.008999999761581421,
        "IoU.conveyer belt": 0.6286999893188476,
        "IoU.canopy": 0.118100004196167,
        "IoU.washer": 0.6141999816894531,
        "IoU.plaything": 0.1647999954223633,
        "IoU.swimming pool": 0.3170000076293945,
        "IoU.stool": 0.17389999389648436,
        "IoU.barrel": 0.07260000228881835,
        "IoU.basket": 0.20059999465942382,
        "IoU.waterfall": 0.590900001525879,
        "IoU.tent": 0.9519000244140625,
        "IoU.bag": 0.03279999971389771,
        "IoU.minibike": 0.48520000457763673,
        "IoU.cradle": 0.6916000366210937,
        "IoU.oven": 0.19579999923706054,
        "IoU.ball": 0.44919998168945313,
        "IoU.food": 0.5677999877929687,
        "IoU.step": 0.009700000286102295,
        "IoU.tank": 0.19309999465942382,
        "IoU.trade name": 0.173700008392334,
        "IoU.microwave": 0.33180000305175783,
        "IoU.pot": 0.2625,
        "IoU.animal": 0.58,
        "IoU.bicycle": 0.41529998779296873,
        "IoU.lake": 0.0625,
        "IoU.dishwasher": 0.42189998626708985,
        "IoU.screen": 0.555,
        "IoU.blanket": 0.016399999856948854,
        "IoU.sculpture": 0.313700008392334,
        "IoU.hood": 0.3725,
        "IoU.sconce": 0.146899995803833,
        "IoU.vase": 0.24030000686645508,
        "IoU.traffic light": 0.12489999771118164,
        "IoU.tray": 0.0056999999284744265,
        "IoU.ashcan": 0.2915999984741211,
        "IoU.fan": 0.4765999984741211,
        "IoU.pier": 0.4829999923706055,
        "IoU.crt screen": 0.03130000114440918,
        "IoU.plate": 0.3902000045776367,
        "IoU.monitor": 0.037100000381469725,
        "IoU.bulletin board": 0.3320000076293945,
        "IoU.shower": 0.0,
        "IoU.radiator": 0.42619998931884767,
        "IoU.glass": 0.05260000228881836,
        "IoU.clock": 0.1518000030517578,
        "IoU.flag": 0.2221999931335449,
        "Acc.wall": 0.8626000213623047,
        "Acc.building": 0.9127999877929688,
        "Acc.sky": 0.9704000091552735,
        "Acc.floor": 0.8887999725341796,
        "Acc.tree": 0.8784999847412109,
        "Acc.ceiling": 0.8973999786376953,
        "Acc.road": 0.8702999877929688,
        "Acc.bed ": 0.9338999938964844,
        "Acc.windowpane": 0.7416000366210938,
        "Acc.grass": 0.8504000091552735,
        "Acc.cabinet": 0.6558000183105469,
        "Acc.sidewalk": 0.7759999847412109,
        "Acc.person": 0.9019000244140625,
        "Acc.earth": 0.4659000015258789,
        "Acc.door": 0.4518999862670898,
        "Acc.table": 0.6612000274658203,
        "Acc.mountain": 0.7533000183105468,
        "Acc.plant": 0.5856999969482422,
        "Acc.curtain": 0.831500015258789,
        "Acc.chair": 0.6184000015258789,
        "Acc.car": 0.893499984741211,
        "Acc.water": 0.6462999725341797,
        "Acc.painting": 0.8241999816894531,
        "Acc.sofa": 0.7515000152587891,
        "Acc.shelf": 0.5681000137329102,
        "Acc.house": 0.5572999954223633,
        "Acc.sea": 0.7655000305175781,
        "Acc.mirror": 0.586500015258789,
        "Acc.rug": 0.595,
        "Acc.field": 0.4231999969482422,
        "Acc.armchair": 0.45990001678466796,
        "Acc.seat": 0.7380999755859375,
        "Acc.fence": 0.48150001525878905,
        "Acc.desk": 0.5991999816894531,
        "Acc.rock": 0.6040000152587891,
        "Acc.wardrobe": 0.645,
        "Acc.lamp": 0.6594000244140625,
        "Acc.bathtub": 0.7575,
        "Acc.railing": 0.40490001678466797,
        "Acc.cushion": 0.5472999954223633,
        "Acc.base": 0.35639999389648436,
        "Acc.box": 0.18600000381469728,
        "Acc.column": 0.5604000091552734,
        "Acc.signboard": 0.40630001068115235,
        "Acc.chest of drawers": 0.5597000122070312,
        "Acc.counter": 0.33049999237060546,
        "Acc.sand": 0.4638999938964844,
        "Acc.sink": 0.7030999755859375,
        "Acc.skyscraper": 0.6627999877929688,
        "Acc.fireplace": 0.7643000030517578,
        "Acc.refrigerator": 0.7184999847412109,
        "Acc.grandstand": 0.6483999633789063,
        "Acc.path": 0.36400001525878906,
        "Acc.stairs": 0.34810001373291016,
        "Acc.runway": 0.9026999664306641,
        "Acc.case": 0.652300033569336,
        "Acc.pool table": 0.9526000213623047,
        "Acc.pillow": 0.5706999969482421,
        "Acc.screen door": 0.763499984741211,
        "Acc.stairway": 0.39169998168945314,
        "Acc.river": 0.15069999694824218,
        "Acc.bridge": 0.4402000045776367,
        "Acc.bookcase": 0.49,
        "Acc.blind": 0.37970001220703126,
        "Acc.coffee table": 0.7284999847412109,
        "Acc.toilet": 0.8744999694824219,
        "Acc.flower": 0.41220001220703123,
        "Acc.book": 0.6084999847412109,
        "Acc.hill": 0.08260000228881836,
        "Acc.bench": 0.4829999923706055,
        "Acc.countertop": 0.6377000045776368,
        "Acc.stove": 0.7225,
        "Acc.palm": 0.5758000183105468,
        "Acc.kitchen island": 0.4759999847412109,
        "Acc.computer": 0.634900016784668,
        "Acc.swivel chair": 0.49990001678466794,
        "Acc.boat": 0.7905000305175781,
        "Acc.bar": 0.2902000045776367,
        "Acc.arcade machine": 0.46830001831054685,
        "Acc.hovel": 0.6544000244140625,
        "Acc.bus": 0.8451999664306641,
        "Acc.towel": 0.5629999923706055,
        "Acc.light": 0.37869998931884763,
        "Acc.truck": 0.30020000457763674,
        "Acc.tower": 0.6433000183105468,
        "Acc.chandelier": 0.7401000213623047,
        "Acc.awning": 0.22799999237060548,
        "Acc.streetlight": 0.19899999618530273,
        "Acc.booth": 0.38799999237060545,
        "Acc.television receiver": 0.7548000335693359,
        "Acc.airplane": 0.6031999969482422,
        "Acc.dirt track": 0.045900001525878906,
        "Acc.apparel": 0.40330001831054685,
        "Acc.pole": 0.12,
        "Acc.land": 0.1315999984741211,
        "Acc.bannister": 0.10680000305175781,
        "Acc.escalator": 0.5161999893188477,
        "Acc.ottoman": 0.41900001525878905,
        "Acc.bottle": 0.15989999771118163,
        "Acc.buffet": 0.3497999954223633,
        "Acc.poster": 0.17440000534057618,
        "Acc.stage": 0.08460000038146973,
        "Acc.van": 0.39439998626708983,
        "Acc.ship": 0.21520000457763672,
        "Acc.fountain": 0.008999999761581421,
        "Acc.conveyer belt": 0.8201000213623046,
        "Acc.canopy": 0.15289999961853026,
        "Acc.washer": 0.6916000366210937,
        "Acc.plaything": 0.28319999694824216,
        "Acc.swimming pool": 0.3927000045776367,
        "Acc.stool": 0.21760000228881837,
        "Acc.barrel": 0.6620999908447266,
        "Acc.basket": 0.3035000038146973,
        "Acc.waterfall": 0.7,
        "Acc.tent": 0.9720999908447265,
        "Acc.bag": 0.03650000095367432,
        "Acc.minibike": 0.5818000030517578,
        "Acc.cradle": 0.897300033569336,
        "Acc.oven": 0.3895000076293945,
        "Acc.ball": 0.6133000183105469,
        "Acc.food": 0.7027999877929687,
        "Acc.step": 0.01059999942779541,
        "Acc.tank": 0.19739999771118164,
        "Acc.trade name": 0.19610000610351563,
        "Acc.microwave": 0.3802000045776367,
        "Acc.pot": 0.31190000534057616,
        "Acc.animal": 0.6163000106811524,
        "Acc.bicycle": 0.6181999969482422,
        "Acc.lake": 0.07440000057220458,
        "Acc.dishwasher": 0.5559999847412109,
        "Acc.screen": 0.7576000213623046,
        "Acc.blanket": 0.017200000286102295,
        "Acc.sculpture": 0.4109000015258789,
        "Acc.hood": 0.4022999954223633,
        "Acc.sconce": 0.16760000228881836,
        "Acc.vase": 0.3618000030517578,
        "Acc.traffic light": 0.22139999389648438,
        "Acc.tray": 0.00699999988079071,
        "Acc.ashcan": 0.37650001525878907,
        "Acc.fan": 0.5984999847412109,
        "Acc.pier": 0.605999984741211,
        "Acc.crt screen": 0.09609999656677246,
        "Acc.plate": 0.4872999954223633,
        "Acc.monitor": 0.04130000114440918,
        "Acc.bulletin board": 0.417400016784668,
        "Acc.shower": 0.0,
        "Acc.radiator": 0.4936999893188477,
        "Acc.glass": 0.05570000171661377,
        "Acc.clock": 0.17290000915527343,
        "Acc.flag": 0.23879999160766602
    }
}
@CLAassistant

CLAassistant commented Oct 31, 2022

CLA assistant check
All committers have signed the CLA.

@MengzhangLI
Contributor

MengzhangLI commented Oct 31, 2022

Guten Tag, @FabianSchuetze. Thanks for your nice PR.

Please sign the CLA first; we will review it ASAP.

Best,

P.S. We absolutely welcome your excellent contribution; in fact, in our development plan, SegNeXt is a priority model for the next few months.

@codecov

codecov bot commented Oct 31, 2022

Codecov Report

Base: 88.97% // Head: 86.87% // Decreases project coverage by -2.10% ⚠️

Coverage data is based on head (b8c6aaa) compared to base (b42c487).
Patch coverage: 22.53% of modified lines in pull request are covered.

❗ Current head b8c6aaa differs from pull request most recent head 319dbda. Consider uploading reports for the commit 319dbda to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2247      +/-   ##
==========================================
- Coverage   88.97%   86.87%   -2.10%     
==========================================
  Files         145      148       +3     
  Lines        8735     9030     +295     
  Branches     1473     1506      +33     
==========================================
+ Hits         7772     7845      +73     
- Misses        720      942     +222     
  Partials      243      243              
| Flag | Coverage Δ |
| --- | --- |
| unittests | 86.87% <22.53%> (-2.10%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ |
| --- | --- |
| mmseg/models/backbones/mscan.py | 21.95% <21.95%> (ø) |
| mmseg/models/decode_heads/ham_head.py | 22.03% <22.03%> (ø) |
| mmseg/models/backbones/__init__.py | 100.00% <100.00%> (ø) |
| mmseg/models/decode_heads/__init__.py | 100.00% <100.00%> (ø) |
| mmseg/datasets/__init__.py | 100.00% <0.00%> (ø) |
| mmseg/datasets/face.py | 80.00% <0.00%> (ø) |


☔ View full report at Codecov.

@FabianSchuetze
Author

Hi @MengzhangLI - thank you so much for your kind reply! Happy to hear mmsegmentation wants to include SegNext.

I signed the CLA and also added appropriate attribution to SegNext in the two model files - hope that's OK.

Looking forward to the review and the discussion about the empirical results and the model.

@MengzhangLI
Contributor

Guten Abend, Fabian:

Q1: "I used their ImageNet pretrained from the Tsinghua cloud. I cannot do a pretraining from scratch. Is it OK to use the official weights?"

A1: Do you mean you want to train from scratch, or did you meet some problems when using the pretrained models from the Tsinghua cloud? If you want to train from scratch, you just need to set init_cfg=None.

Q2: "I used one GPU, (not 8 as in the paper), but used the same settings. Should the learning rate be scaled then?"

A2: Theoretically, there is no difference if the total batch sizes are the same. If only using one GPU, the samples per GPU should be 8 times larger than in the original settings.
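To make A2 concrete, a minimal sketch of the arithmetic (the function name is hypothetical, not an MMSeg API):

```python
# Sketch (not MMSeg internals): keep the total batch size constant
# when changing the number of GPUs, per the advice above.
def scale_samples_per_gpu(total_batch_size, num_gpus):
    """Return samples_per_gpu so that num_gpus * samples_per_gpu
    equals the original total batch size."""
    assert total_batch_size % num_gpus == 0
    return total_batch_size // num_gpus

# e.g. an 8-GPU recipe with 2 samples each (total 16) moved to 1 GPU:
print(scale_samples_per_gpu(16, 8))  # 2
print(scale_samples_per_gpu(16, 1))  # 16
```

With the total batch size held fixed this way, the learning rate does not need to be rescaled.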

Q3: "In contrast to the configs of the original repo, I did not use a RepeatDataset of size 50, but the conventional ADE dataset config because of a copy-and-paste glitch."

A3: "RepeatDataset" is used in some models that use MMSegmentation as their framework, such as SegFormer and PoolFormer. The difference with and without it should be very small. Please check this issue.
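For reference, a `RepeatDataset` wrapper in an MMSeg 0.x config looks roughly like this (a sketch; `times=50` mirrors the original repo's setting, and the paths are placeholders):

```python
# Sketch of wrapping ADE20K in a RepeatDataset, as SegFormer-style
# configs do. `times` and the paths here are illustrative.
train = dict(
    type='RepeatDataset',
    times=50,  # iterate the underlying dataset 50 times per pass
    dataset=dict(
        type='ADE20KDataset',
        data_root='data/ADEChallengeData2016',
        img_dir='images/training',
        ann_dir='annotations/training'))
print(train['type'], train['times'])
```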

@MengzhangLI MengzhangLI changed the title Contribute Segnext [Feature] Support SegNeXt (NeurIPS'2022) in MMSeg 0.x. Nov 2, 2022
@MengzhangLI
Contributor

Hi, Fabian, could you grant me access to your fork of MMSegmentation, following the guide here? Then I could push my modifications to your branch.

@FabianSchuetze
Author

Hi Li! Sure, how wonderful to see your additions. You should have received an invite.

Thanks also for the answers to the questions. I have also asked MenghaoGuo for clarification because I cannot reconcile the configs in their repo with the description in the paper. I will try training the model again very soon.

P.S.: I hope I selected your first name correctly :-). Sometimes I get confused by the ordering of Chinese names.

Comment on lines 84 to 95
self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
self.conv0_1 = nn.Conv2d(dim, dim, (1, 7), padding=(0, 3), groups=dim)
self.conv0_2 = nn.Conv2d(dim, dim, (7, 1), padding=(3, 0), groups=dim)

self.conv1_1 = nn.Conv2d(dim, dim, (1, 11), padding=(0, 5), groups=dim)
self.conv1_2 = nn.Conv2d(dim, dim, (11, 1), padding=(5, 0), groups=dim)

self.conv2_1 = nn.Conv2d(
    dim, dim, (1, 21), padding=(0, 10), groups=dim)
self.conv2_2 = nn.Conv2d(
    dim, dim, (21, 1), padding=(10, 0), groups=dim)
self.conv3 = nn.Conv2d(dim, dim, 1)
Contributor

@MengzhangLI MengzhangLI Nov 2, 2022


We had better extract these hard-coded convolution kernel sizes (i.e., 1x7, 1x11, 1x21) of MSCA into the config file.
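A framework-free sketch of lifting those kernel sizes into configuration (the helper name and defaults are hypothetical): for an odd strip kernel of size k, the 'same' padding is simply k // 2.

```python
# Hypothetical helper: derive the (1, k)/(k, 1) kernel and padding
# pairs used by the hard-coded nn.Conv2d calls above from a
# configurable tuple of kernel sizes.
def strip_conv_specs(kernel_sizes=(7, 11, 21)):
    specs = []
    for k in kernel_sizes:
        pad = k // 2  # 'same' padding for an odd kernel size
        specs.append(((1, k), (0, pad)))  # horizontal strip conv
        specs.append(((k, 1), (pad, 0)))  # vertical strip conv
    return specs

# Reproduces the hard-coded values, e.g. (1, 7) with padding (0, 3):
print(strip_conv_specs()[0])  # ((1, 7), (0, 3))
```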

@MengzhangLI
Contributor

Also, MMSeg-style docstrings should be added.

Comment on lines 32 to 40
print('spatial', self.spatial)
print('S', self.S)
print('D', self.D)
print('R', self.R)
print('train_steps', self.train_steps)
print('eval_steps', self.eval_steps)
print('inv_t', self.inv_t)
print('eta', self.eta)
print('rand_init', self.rand_init)
Contributor


We may delete these lines and add these args as parameters in config files.
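Following that suggestion, the printed attributes could be exposed through `ham_kwargs` in the config. A sketch: only MD_R=16 appears in the config earlier in this PR; the other keys come from the print statements above, and their values here are placeholders.

```python
# Sketch: surface the debug-printed NMF attributes as config
# parameters instead of prints. Values other than MD_R are placeholders.
decode_head = dict(
    type='LightHamHead',
    ham_kwargs=dict(
        MD_R=16,        # rank R of the matrix decomposition
        train_steps=6,  # inner optimization steps at train time
        eval_steps=7,   # inner optimization steps at eval time
        inv_t=1,
        rand_init=True))
print(sorted(decode_head['ham_kwargs']))
```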

Comment on lines 32 to 54
# optimizer
optimizer = dict(
    _delete_=True,
    type='AdamW',
    lr=0.00006,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys={
            'pos_block': dict(decay_mult=0.),
            'norm': dict(decay_mult=0.),
            'head': dict(lr_mult=10.)
        }))

lr_config = dict(
    _delete_=True,
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-6,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
Contributor


These lines are redundant; they repeat the settings below.

@FabianSchuetze
Author

Thanks for all the comments so far, @MengzhangLI! I will incorporate them tomorrow.

@FabianSchuetze
Author

Thanks for the three comments, @MengzhangLI. My last commit contains the following changes:

  • Renamed the layers in AttentionModule to be more consistent with the naming in the paper.
  • Removed the duplicated config files and the _delete_ values in the config files.
  • Renamed the model config from mscan to segnext to reflect that the config includes both backbone and head
  • Lifted the hardcoded values from the implementation to the config files. Not sure if the naming is OK.

Can you explain to me what you mean by saying "MMSeg-style docstrings would be added"?

@MengzhangLI
Contributor

Sorry for my misleading words. The so-called 'mmseg-style' docstring is our default docstring in files: for example, a forward function has args and returns explanations, and a class has an explanation for each arg. However, adding docstrings is not our priority at the moment. If you don't mind, I can add docstrings next week.

Best,

@FabianSchuetze
Author

The first results came in: At commit c56d243, the results for the tiny model on ADE20k were 41.62 mIoU (SS), slightly above the ref mIoU of 41.1.

I used one GPU and a batch size of 16 (as the authors did), instead of the 4x2 default. I do not have access to 4 GPUs. Should I switch to training with a batch size of 8 instead, to conform more closely to the default?

@MengzhangLI
Contributor

Thanks for your feedback. I think your setting is correct, because the batch size for ADE20K is 16 rather than 8. The total batch size should match the original paper.

return x, H, W


class AttentionModule(BaseModule):
Contributor


Why replace the name convx with scalex in class AttentionModule(BaseModule)?

Author


Thanks for the comments. Maybe you are right and I should revert to the original name. I changed the name to be more in line with the description in the paper (equation (1)), but I guess it's preferable to revert it to convx?

Author


I came to the conclusion that renaming wasn't such a great idea, as it precludes loading the pretrained weights from the original source. I will revert the changes.

@MengzhangLI
Contributor

MengzhangLI commented Nov 7, 2022

Could you attach your training log like 2022110xxxxx.log to let me have a look? I am trying to re-implement results. Thanks in advance.

2022-11-09 Update

Current results are below; some gaps still exist.

| Model | Original Repo Results | Original Repo Reimplementing | MMSeg PR |
| --- | --- | --- | --- |
| MSCAN-T | 41.1 | 39.44 | TBD |
| MSCAN-S | 44.3 | 42.34 | 43.13 |
| MSCAN-B | 48.5 | 46.74 | 47.63 |
| MSCAN-L | 51.0 | 49.14 | 49.04 |

Also, I noticed that if we change BN to SyncBN here, the model performance becomes very bad.


I will figure it out in the next few days.

Best,

@FabianSchuetze
Author

Thanks for the very insightful comment, @MengzhangLI!

Here is the final part of the log for the training run: 20221104_183356.log. I needed to train in a few steps because my access to the server is cut off after six hours. The first part is here: 20221104_183356.log. The single-scale eval results come to 0.41619999999999996.

Gee, the results about SyncBatchNorm are very interesting. However, the authors said they used 2 or 4 GPUs to train the models, so they must have used SyncBN?

@MeowZheng MeowZheng added the High Priority from Community This issue/pr needs more attention and higher priority than default developing plan label Jan 3, 2023
xiexinch and others added 19 commits January 11, 2023 17:39
…2480)

## Motivation

Based on the ImageNet dataset, we propose the ImageNet-S dataset, which has 1.2 million training images and 50k high-quality semantic segmentation annotations, to support unsupervised/semi-supervised semantic segmentation on the ImageNet dataset.

paper:
Large-scale Unsupervised Semantic Segmentation (TPAMI 2022)
[Paper link](https://arxiv.org/abs/2106.03149)

## Modification

1. Support the ImageNet-S dataset and its configuration
2. Add the dataset preparation in the documentation
…pen-mmlab#2500)

## Motivation

I want to fix a bug through this PR. The bug occurs when two options --
`reduce_zero_label=True`, and custom classes are used.
`reduce_zero_label` remaps the GT seg labels by remapping the zero-class
to 255 which is ignored. Conceptually, this should occur *before* the
`label_map` is applied, which maps *already reduced labels*. However,
currently, the `label_map` is applied before the zero label is reduced.

## Modification

The modification is simple:
- I've just interchanged the order of the two operations by moving 4
lines from bottom to top.
- I've added a test that passes when the fix is introduced, and fails on
the original `master` branch.
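The effect of the ordering can be shown with a toy example (the class names and `label_map` below are hypothetical; the reduce logic mirrors what `LoadAnnotations` does with `reduce_zero_label=True`):

```python
import numpy as np

# Toy GT with raw classes {0: background, 1: cat, 2: dog}. With
# reduce_zero_label=True, class 0 becomes the ignore index (255) and the
# remaining labels shift down by one. The (hypothetical) label_map below
# is defined on those *reduced* labels.
gt = np.array([0, 1, 2], dtype=np.uint8)
label_map = {0: 255, 1: 0}  # keep only 'dog' (reduced label 1 -> 0)

def reduce_zero_label(seg):
    seg = seg.astype(np.int64)
    seg[seg == 0] = 255   # zero class -> ignore index
    seg = seg - 1         # shift the remaining labels down
    seg[seg == 254] = 255
    return seg

def apply_label_map(seg, mapping):
    out = seg.copy()
    for old, new in mapping.items():
        out[seg == old] = new
    return out

# Correct order: reduce first, then remap.
correct = apply_label_map(reduce_zero_label(gt), label_map)  # [255, 255, 0]

# Buggy order: remap first, then reduce -- 'dog' ends up as 1, not 0.
buggy = reduce_zero_label(apply_label_map(gt.astype(np.int64), label_map))
```

With the buggy order the kept class lands on the wrong index, which is exactly the mismatch this PR fixes.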

## BC-breaking (Optional)

I do not anticipate this change breaking any backward compatibility.

## Checklist

- [x] Pre-commit or other linting tools are used to fix the potential
lint issues.
  - _I've fixed all linting/pre-commit errors._
- [x] The modification is covered by complete unit tests. If not, please
add more unit test to ensure the correctness.
  - _I've added a unit test._ 
- [x] If the modification has potential influence on downstream
projects, this PR should be tested with downstream projects, like MMDet
or MMDet3D.
  - _I don't think this change affects MMDet or MMDet3D._
- [x] The documentation has been modified accordingly, like docstring or
example tutorials.
- _This change fixes an existing bug and doesn't require modifying any
documentation/docstring._
## Motivation

This fixes open-mmlab#2493. When the `label_map` is created, the index for ignored
classes was being set to -1, whereas the index that is actually ignored
is 255. This worked indirectly since -1 was underflowed to 255 when
converting to uint8.

The same fix was made in the 1.x by open-mmlab#2332 but this fix was never made to
`master`.

## Modification

The only small modification is setting the index of ignored classes to
255 instead of -1.
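The underflow that made -1 appear to work can be seen directly (a minimal numpy sketch, not the mmseg source):

```python
import numpy as np

# Storing -1 for ignored classes only "worked" because casting to uint8
# wraps modulo 256, silently turning -1 into 255 -- the actual ignore index.
ignored = np.array([-1], dtype=np.int64)
wrapped = ignored.astype(np.uint8)
print(int(wrapped[0]))  # 255
```

Setting the value to 255 explicitly removes the dependence on this implicit cast.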

## Checklist

- [x] Pre-commit or other linting tools are used to fix the potential
lint issues.
  - _I've fixed all linting/pre-commit errors._
- [x] The modification is covered by complete unit tests. If not, please
add more unit test to ensure the correctness.
  - _No unit tests need to be added. Unit tests that are affected were modified._
- [x] If the modification has potential influence on downstream
projects, this PR should be tested with downstream projects, like MMDet
or MMDet3D.
  - _I don't think this change affects MMDet or MMDet3D._
- [x] The documentation has been modified accordingly, like docstring or
example tutorials.
- _This change fixes an existing bug and doesn't require modifying any
documentation/docstring._
…pen-mmlab#2520)

## Motivation

open-mmlab/mmeval#85

---------

Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
## Motivation

Through this PR, I (1) fix a bug, and (2) perform some associated cleanup, and (3) add a unit test. The bug occurs during evaluation when two options -- `reduce_zero_label=True`, and custom classes are used. The bug was that the `reduce_zero_label` is not properly propagated (see details below).

## Modification

1. **Bugfix**

The bug occurs [in the initialization of `CustomDataset`](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/datasets/custom.py#L108-L110) where the `reduce_zero_label` flag is not propagated to its member `self.gt_seg_map_loader_cfg`:

```python
self.gt_seg_map_loader = LoadAnnotations(
) if gt_seg_map_loader_cfg is None else LoadAnnotations(
    **gt_seg_map_loader_cfg)
```

Because the `reduce_zero_label` flag was not being propagated, the zero label reduction was being [unnecessarily and explicitly duplicated during the evaluation](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/core/evaluation/metrics.py#L66-L69).

As pointed in a previous PR (open-mmlab#2500), `reduce_zero_label` must occur before applying the `label_map`. Due to this bug, the order gets reversed when both features are used simultaneously.

This has been fixed to:

```python
self.gt_seg_map_loader = LoadAnnotations(
    reduce_zero_label=reduce_zero_label, **gt_seg_map_loader_cfg)
```

2. **Cleanup**

Due to the bug fix, since both `reduce_zero_label` and `label_map` are applied in `get_gt_seg_map_by_idx()` (i.e. `LoadAnnotations.__call__()`), the evaluation does not need to perform them anymore. However, for backwards compatibility, the evaluation keeps its previous input arguments.

A previous issue (open-mmlab#1415) pointed out that the `label_map` should not be applied in the evaluation. This was handled by [passing an empty dict](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/datasets/custom.py#L306-L311):

```python
# as the labels has been converted when dataset initialized
# in `get_palette_for_custom_classes ` this `label_map`
# should be `dict()`, see
# open-mmlab#1415
# for more ditails
label_map=dict(),
reduce_zero_label=self.reduce_zero_label))
```

Similar to this, I now also set `reduce_zero_label=False` since it is now also handled by `get_gt_seg_map_by_idx()` (i.e. `LoadAnnotations.__call__()`).

3. **Unit test**

I've added a unit test that tests the `CustomDataset.pre_eval()` function when `reduce_zero_label=True` and custom classes are used. The test fails on the original `master` branch but passes with this fix.

## BC-breaking (Optional)

I do not anticipate this change breaking any backward compatibility.

## Checklist

- [x] Pre-commit or other linting tools are used to fix the potential lint issues.
  - _I've fixed all linting/pre-commit errors._
- [x] The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  - _I've added a test that passes when the fix is introduced, and fails on the original master branch._
- [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  - _I don't think this change affects MMDet or MMDet3D._
- [x] The documentation has been modified accordingly, like docstring or example tutorials.
  - _This change fixes an existing bug and doesn't require modifying any documentation/docstring._
@FabianSchuetze
Author

It's so nice to see that you made some progress, @MengzhangLI! Do you want me to test anything?

@MengzhangLI
Contributor

Hi @FabianSchuetze, sorry for the late reply.
I created a new PR (#2600) because this PR has too many commits and changed files, but we would prioritize merging this PR if we could solve this question.

Let me summarize what we discussed several months ago:

The problem is caused by PyTorch 1.9 + SyncBN in mscan.py; with other PyTorch versions SyncBN is OK. So we use BN on a single GPU as the default config setting. For everyone else, it is OK to use SyncBN with 2 samples per GPU on 8 GPUs when the PyTorch version is not 1.9.
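For reference, the switch amounts to changing the `norm_cfg` dict in the config (a sketch following common mmseg config conventions; the exact keys of the merged SegNeXt config may differ):

```python
# Single-GPU default: plain BN (works on all PyTorch versions).
norm_cfg = dict(type='BN', requires_grad=True)

# Multi-GPU (8 GPUs x 2 samples each), PyTorch != 1.9: SyncBN is fine.
# norm_cfg = dict(type='SyncBN', requires_grad=True)

model = dict(
    backbone=dict(type='MSCAN', norm_cfg=norm_cfg),
    decode_head=dict(type='LightHamHead', norm_cfg=norm_cfg),
)
```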

Best,

MeowZheng added a commit that referenced this pull request Feb 24, 2023
## Motivation

Support SegNeXt.

Due to the many commits and changed files accumulated while the PR was WIP for so long (perhaps this could be resolved by `git merge` or `git rebase`), this PR is created only as a backup of the old PR
#2247

Co-authored-by: MeowZheng <meowzheng@outlook.com>
Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
@MengzhangLI MengzhangLI removed the WIP Work in process label Feb 24, 2023
@FabianSchuetze
Author

Wonderful!

@OpenMMLab-Assistant001

Hi @FabianSchuetze! We are grateful for your efforts in helping improve this open-source project during your personal time.

Welcome to join the OpenMMLab Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel and we will assist you. We look forward to seeing you there! Join us: https://discord.gg/UjgXkPWNqA
If you have a WeChat account, welcome to join our community on WeChat. You can add our assistant: openmmlabwx. Please add "mmsig + GitHub ID" as a remark when adding friends :)
Thank you again for your contribution ❤

Labels
Algorithm Improvement or addition of new algorithm model High Priority from Community This issue/pr needs more attention and higher priority than default developing plan