[Feature] Support SegNeXt (NeurIPS'2022) in MMSeg 0.x. #2247
Conversation
Guten Tag @FabianSchuetze. Thanks for your nice PR. Please sign the CLA first; we will review it ASAP. Best. P.S. We absolutely welcome your excellent contribution — SegNeXt is in fact part of our development plan and a priority model for the next few months.
Codecov Report — Base: 88.97% // Head: 86.87% // Decreases project coverage by 2.10%.

Additional details and impacted files:

@@ Coverage Diff @@
##           master    #2247      +/-   ##
==========================================
- Coverage   88.97%   86.87%    -2.10%
==========================================
  Files         145      148        +3
  Lines        8735     9030      +295
  Branches     1473     1506       +33
==========================================
+ Hits         7772     7845       +73
- Misses        720      942      +222
  Partials      243      243
Hi @MengzhangLI - thank you so much for your kind reply! Happy to hear mmsegmentation wants to include SegNeXt. I signed the CLA and also added appropriate attribution to SegNeXt in the two model files - hope that's OK. Looking forward to the review and to discussing the empirical results and the model.
Guten Abend, Fabian:

Q1: "I used their ImageNet pretrained weights from the Tsinghua cloud. I cannot do a pretraining from scratch. Is it OK to use the official weights?"
A1: Do you mean you want to train from scratch, or that you ran into problems when using the pretrained models from the Tsinghua cloud? If you want to train from scratch, you just need to set

Q2: "I used one GPU (not 8 as in the paper), but used the same settings. Should the learning rate be scaled then?"
A2: Theoretically, there is no difference if the total batch sizes are the same. If only using one GPU, the samples per GPU should be 8 times larger than in the original settings.

Q3: "In contrast to the configs of the original repo, I did not use a RepeatDataset of size 50, but the conventional ADE dataset config, because of a copy-and-paste glitch."
A3: "RepeatDataset" is used in some models which use MMSegmentation as their framework, such as SegFormer and PoolFormer. The difference with and without it should be very small. Please check this issue.
Hi, Fabian, could you grant me authorization on your forked MMSegmentation, following the guide here? Then I could push my modifications to your branch.
Hi Li! Sure - how wonderful to see your additions. You should have received an invite. Thanks also for the answers to the questions. I have also asked @MenghaoGuo for clarification because I cannot reconcile the configs in their repo with the description in the paper. I will try training the model again very soon. P.S.: I hope I picked your first name correctly :-). Sometimes I get confused by the ordering of Chinese names.
mmseg/models/backbones/mscan.py
Outdated
self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
self.conv0_1 = nn.Conv2d(dim, dim, (1, 7), padding=(0, 3), groups=dim)
self.conv0_2 = nn.Conv2d(dim, dim, (7, 1), padding=(3, 0), groups=dim)

self.conv1_1 = nn.Conv2d(dim, dim, (1, 11), padding=(0, 5), groups=dim)
self.conv1_2 = nn.Conv2d(dim, dim, (11, 1), padding=(5, 0), groups=dim)

self.conv2_1 = nn.Conv2d(
    dim, dim, (1, 21), padding=(0, 10), groups=dim)
self.conv2_2 = nn.Conv2d(
    dim, dim, (21, 1), padding=(10, 0), groups=dim)
self.conv3 = nn.Conv2d(dim, dim, 1)
We had better extract these hard-coded convolution kernel sizes (i.e., 1x7, 1x11, 1x21) of MSCA into the config file.
Also, an MMSeg-style docstring should be added.
print('spatial', self.spatial)
print('S', self.S)
print('D', self.D)
print('R', self.R)
print('train_steps', self.train_steps)
print('eval_steps', self.eval_steps)
print('inv_t', self.inv_t)
print('eta', self.eta)
print('rand_init', self.rand_init)
We may delete these lines and add these args as parameters in config files.
# optimizer
optimizer = dict(
    _delete_=True,
    type='AdamW',
    lr=0.00006,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys={
            'pos_block': dict(decay_mult=0.),
            'norm': dict(decay_mult=0.),
            'head': dict(lr_mult=10.)
        }))

lr_config = dict(
    _delete_=True,
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-6,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
These lines are redundant; they repeat the settings below.
Thanks for all the comments so far, @MengzhangLI! I will incorporate them tomorrow.
Thanks for the three comments, @MengzhangLI. My last commit contains the following changes:

Can you explain to me what you mean by "MMSeg-style docstring would be added"?
Sorry for my misleading words - the so-called 'mmseg-style' docstring is our default docstring format in files, such as the forward function, which has

Best,
The first results came in: at commit c56d243, the tiny model reached 41.62 mIoU (SS) on ADE20k, slightly above the reference mIoU of 41.1. I used one GPU and a batch size of 16 (as the authors did), instead of the 4x2 default. I do not have access to 4 GPUs. Should I switch and train with a batch size of 8 instead, to conform more closely to the default?
Thanks for your feedback. I think your setting is correct, because the batch size for ADE20K is 16 rather than 8. The total batch size should match the original paper.
    return x, H, W


class AttentionModule(BaseModule):
Why replace the name convx with scalex in class AttentionModule(BaseModule)?
Thanks for the comments. Maybe you are right and I should revert to the original name. I changed the name to be more in line with the description in the paper (equation (1)), but I guess it's preferable to revert it to convx.
I have come to the conclusion that renaming wasn't such a great idea, as it precludes loading the pre-trained weights from the original source. I will revert the changes again.
Could you attach your training log?

2022-11-09 update: Current results are below; some gaps still exist.

Also, I noticed that if we modified
Thanks for the very insightful comment, @MengzhangLI! Here is the final part of the log for the training run: 20221104_183356.log. I needed to train in a few steps because my access to the server is cut off after six hours. The first part is here: 20221104_183356.log. The single-scale eval results come to 0.41619999999999996. Gee, the results about SyncBatchNorm are very interesting. However, the authors said they used 2 or 4 GPUs to train the models, so they must have used SyncBN?
…2480) ## Motivation Based on the ImageNet dataset, we propose the ImageNet-S dataset, which has 1.2 million training images and 50k high-quality semantic segmentation annotations, to support unsupervised/semi-supervised semantic segmentation on the ImageNet dataset. paper: Large-scale Unsupervised Semantic Segmentation (TPAMI 2022) [Paper link](https://arxiv.org/abs/2106.03149) ## Modification 1. Support the ImageNet-S dataset and its configuration 2. Add the dataset preparation in the documentation
…pen-mmlab#2500) ## Motivation I want to fix a bug through this PR. The bug occurs when two options -- `reduce_zero_label=True`, and custom classes are used. `reduce_zero_label` remaps the GT seg labels by remapping the zero-class to 255, which is ignored. Conceptually, this should occur *before* the `label_map` is applied, which maps *already reduced labels*. However, currently, the `label_map` is applied before the zero label is reduced. ## Modification The modification is simple: - I've just interchanged the order of the two operations by moving 4 lines from bottom to top. - I've added a test that passes when the fix is introduced, and fails on the original `master` branch. ## BC-breaking (Optional) I do not anticipate this change breaking any backward compatibility. ## Checklist - [x] Pre-commit or other linting tools are used to fix the potential lint issues. - _I've fixed all linting/pre-commit errors._ - [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. - _I've added a unit test._ - [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D. - _I don't think this change affects MMDet or MMDet3D._ - [x] The documentation has been modified accordingly, like docstring or example tutorials. - _This change fixes an existing bug and doesn't require modifying any documentation/docstring._
## Motivation This fixes open-mmlab#2493. When the `label_map` is created, the index for ignored classes was being set to -1, whereas the index that is actually ignored is 255. This worked indirectly since -1 underflowed to 255 when converting to uint8. The same fix was made in 1.x by open-mmlab#2332 but this fix was never made to `master`. ## Modification The only small modification is setting the index of ignored classes to 255 instead of -1. ## Checklist - [x] Pre-commit or other linting tools are used to fix the potential lint issues. - _I've fixed all linting/pre-commit errors._ - [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. - _No unit tests need to be added. Unit tests that are affected were modified._ - [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D. - _I don't think this change affects MMDet or MMDet3D._ - [x] The documentation has been modified accordingly, like docstring or example tutorials. - _This change fixes an existing bug and doesn't require modifying any documentation/docstring._
…pen-mmlab#2520) ## Motivation open-mmlab/mmeval#85 --------- Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com> Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
## Motivation Through this PR, I (1) fix a bug, (2) perform some associated cleanup, and (3) add a unit test. The bug occurs during evaluation when two options -- `reduce_zero_label=True`, and custom classes -- are used. The bug was that `reduce_zero_label` was not properly propagated (see details below). ## Modification 1. **Bugfix** The bug occurs [in the initialization of `CustomDataset`](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/datasets/custom.py#L108-L110), where the `reduce_zero_label` flag is not propagated to its member `self.gt_seg_map_loader_cfg`: ```python self.gt_seg_map_loader = LoadAnnotations( ) if gt_seg_map_loader_cfg is None else LoadAnnotations( **gt_seg_map_loader_cfg) ``` Because the `reduce_zero_label` flag was not being propagated, the zero-label reduction was being [unnecessarily and explicitly duplicated during the evaluation](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/core/evaluation/metrics.py#L66-L69). As pointed out in a previous PR (open-mmlab#2500), `reduce_zero_label` must occur before applying the `label_map`. Due to this bug, the order gets reversed when both features are used simultaneously. This has been fixed to: ```python self.gt_seg_map_loader = LoadAnnotations( reduce_zero_label=reduce_zero_label, **gt_seg_map_loader_cfg) ``` 2. **Cleanup** Due to the bug fix, since both `reduce_zero_label` and `label_map` are now applied in `get_gt_seg_map_by_idx()` (i.e. `LoadAnnotations.__call__()`), the evaluation does not need to perform them anymore. However, for backwards compatibility, the evaluation keeps its previous input arguments. A previous issue (open-mmlab#1415) pointed out that the `label_map` should not be applied in the evaluation.
This was handled by [passing an empty dict](https://github.com/open-mmlab/mmsegmentation/blob/5d49918b3c48df5544213562aa322bfa89d67ef1/mmseg/datasets/custom.py#L306-L311): ```python # as the labels has been converted when dataset initialized # in `get_palette_for_custom_classes ` this `label_map` # should be `dict()`, see # open-mmlab#1415 # for more ditails label_map=dict(), reduce_zero_label=self.reduce_zero_label)) ``` Similarly, I now also set `reduce_label=False`, since it is now also handled by `get_gt_seg_map_by_idx()` (i.e. `LoadAnnotations.__call__()`). 3. **Unit test** I've added a unit test that tests the `CustomDataset.pre_eval()` function when `reduce_zero_label=True` and custom classes are used. The test fails on the original `master` branch but passes with this fix. ## BC-breaking (Optional) I do not anticipate this change breaking any backward compatibility. ## Checklist - [x] Pre-commit or other linting tools are used to fix the potential lint issues. - _I've fixed all linting/pre-commit errors._ - [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. - _I've added a test that passes when the fix is introduced, and fails on the original master branch._ - [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D. - _I don't think this change affects MMDet or MMDet3D._ - [x] The documentation has been modified accordingly, like docstring or example tutorials. - _This change fixes an existing bug and doesn't require modifying any documentation/docstring._
It's so nice to see that you made some progress, @MengzhangLI! Do you want me to test anything?
Hi, @FabianSchuetze - sorry for the late reply. Let me summarize what we discussed several months ago: the problem is caused by the PyTorch 1.9 version + SyncBN on

Best,
## Motivation Support SegNeXt. Due to the many commits & changed files accumulated while the PR was WIP for too long (perhaps it could be resolved by `git merge` or `git rebase`), this PR is created only as a backup of the old PR #2247 Co-authored-by: MeowZheng <meowzheng@outlook.com> Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
Wonderful!
Hi @FabianSchuetze! We are grateful for your efforts in helping improve this open-source project during your personal time. Welcome to join the OpenMMLab Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel, and we will assist you. Looking forward to seeing you there! Join us: https://discord.gg/UjgXkPWNqA
Update 31/01/2023
From now on we will pick up and support SegNeXt in the master branch.
Here is our to-do list:
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry - just make the pull request and seek help from the maintainers.
Motivation
Hello,
thanks for the fantastic repo, it's a pleasure to work with mmsegmentation.
I would like to help contribute SegNext to mmsegmentation. I discussed this with @MenghaoGuo and he appreciates such a contribution. Does mmsegmentation also appreciate the contribution?
I trained the tiny model on ADE20k, which results in a val (test) mIoU of 39.27 (39.17), compared to the test mIoU of 41.1 reported in the paper. A few noteworthy aspects are:
You can see the config logs at the end of the PR.
What would be the next steps for a contribution? I guess I should retrain the tiny network after we agree on the correct settings for a one-GPU environment. Do you have anything to add, @MenghaoGuo?
Best wishes,
Fabian
Modification
Contribute SegNext
BC-breaking (Optional)
n/a
Use cases (Optional)
n/a
Checklist
The pre-commit hooks passed; I guess the other items are not applicable.
Config
Logs: