Add SwinV2 #6246
Conversation
I also see an issue on the Microsoft Swin-Transformer repo, microsoft/Swin-Transformer#194, which argues that it's not necessary to divide the mask into 9 parts because 4 parts are already enough.
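For context, here is a minimal sketch of the two mask labelings that issue compares. It assumes H and W are divisible by window_size; the variable names are illustrative rather than taken from either repo:

```python
import torch

H, W, window_size, shift_size = 8, 8, 4, 2

# original Swin V1: 3 slices per axis -> 3 x 3 = 9 labeled regions
img_mask = torch.zeros((1, H, W, 1))
slices = (slice(0, -window_size), slice(-window_size, -shift_size), slice(-shift_size, None))
cnt = 0
for h in slices:
    for w in slices:
        img_mask[:, h, w, :] = cnt
        cnt += 1

# the proposal in microsoft/Swin-Transformer#194: after the cyclic shift the
# only content discontinuity sits at -shift_size on each axis, so 2 slices
# per axis (4 regions) already distinguish wrapped from non-wrapped pixels
# within every window
simpler_mask = torch.zeros((1, H, W, 1))
halves = (slice(0, -shift_size), slice(-shift_size, None))
cnt = 0
for h in halves:
    for w in halves:
        simpler_mask[:, h, w, :] = cnt
        cnt += 1
```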
It seems these mod operations are considered nodes of the fx graph, so their output is a proxy rather than a concrete value. The previous SwinTransformer V1 doesn't fail these tests only because those mod operations happen not to be exercised with that random seed.
@ain-soph Thanks a lot for your contribution! Concerning the linter, you can check the details in the CI tab. Concerning your point on PatchMerging, I believe you are right. FX doesn't let you use the input tensor to control the flow of the program. This needs to move outside of the main function and be declared a non fx-traceable operator; see the sketch below. I would patch this ASAP outside of this PR. cc @YosuaMichael
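A hedged sketch of that workaround: move the size-dependent padding into a free function and register it with `torch.fx.wrap`, so symbolic tracing records a single opaque call instead of tracing the mod operations. The name `_patch_merging_pad` is illustrative, not necessarily the helper TorchVision ended up with:

```python
import torch
import torch.fx
import torch.nn.functional as F

def _patch_merging_pad(x: torch.Tensor) -> torch.Tensor:
    # x has shape (..., H, W, C); pad H and W up to even sizes
    H, W = x.shape[-3], x.shape[-2]
    # under symbolic tracing, H % 2 and W % 2 would otherwise become graph
    # nodes whose outputs are proxies, breaking any control flow on them
    return F.pad(x, (0, 0, 0, W % 2, 0, H % 2))

torch.fx.wrap("_patch_merging_pad")  # trace the whole function as one leaf call
```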
@ain-soph The PR looks in the right direction. I've added a few comments for your attention, please have a look.
Note that I'll be on PTO the next 2 weeks so @YosuaMichael offered to provide assistance. He is also a Meta maintainer so you can schedule a call if needed to speed this up.
Please note that my comments don't address the validity of the changes you made on the ML side but mostly the idioms and coding practices we use at TorchVision. A deeper dive would be necessary to confirm validity. The first step, I think, is to take the original pre-trained weights from the paper and confirm that we can load them in your implementation (by making some conversion) and then reproduce the reported accuracy using our reference scripts. Once we verify this, the next step is to train the tiny variant to showcase that we can reproduce the accuracy. If you don't have access to a GPU cluster, we can help.
Again many thanks for your contribution and help on this.
cc @xiaohu2015 and @jdsgomes who worked on the original Swin implementation to see if they have any early feedback.
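For illustration, that first verification step could look roughly like the sketch below. The checkpoint filename, the key renaming, and the `swin_v2_t` builder name are all assumptions (the builder is what this PR is adding); the real checkpoint layout has to be inspected before writing the actual conversion:

```python
import torch
from torchvision.models import swin_v2_t  # assumes the builder this PR adds

official = torch.load("swinv2_tiny_patch4_window8_256.pth", map_location="cpu")
state_dict = official["model"] if "model" in official else official

# rename official keys to the torchvision layout; this mapping is purely
# hypothetical and must be derived by diffing the two state dicts
converted = {}
for key, value in state_dict.items():
    new_key = key.replace("layers.", "features.")  # illustrative rename only
    converted[new_key] = value

model = swin_v2_t()
model.load_state_dict(converted)  # strict by default; errors reveal leftover mismatches
```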
Hi @ain-soph, thanks a lot for the PR! Meanwhile, let me address some of the issues you raised:
```python
self.register_buffer("relative_position_index", relative_position_index)

def get_relative_position_bias(self) -> torch.Tensor:
    relative_position_bias_table: torch.Tensor = self.cpb_mlp(self.relative_coords_table).view(-1, self.num_heads)
```
Shall we use `flatten(end_dim=-2)` here instead of `view(-1, self.num_heads)`?
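The two are equivalent here, since `flatten(end_dim=-2)` collapses every dimension except the last; `flatten` also avoids hard-coding the shape and falls back to a copy when the tensor can't be viewed. A quick check:

```python
import torch

num_heads = 3
t = torch.randn(5, 7, num_heads)
# flatten(end_dim=-2) merges all leading dims, leaving the last one intact,
# which matches view(-1, num_heads) when the last dim is already num_heads
assert torch.equal(t.flatten(end_dim=-2), t.view(-1, num_heads))  # both (35, 3)
```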
I've launched another training run based on the most up-to-date code for the tiny architecture.
@ain-soph Could you please share the exact command you used to train it, to ensure we are not missing anything on our side?
Ooops, there are also memory issues on GPU, both on Linux and Windows. I know that a different test is actually failing, but this can happen: sometimes adding a big new model leads to issues in other models because the memory isn't cleared properly. Usually the way around this is to reduce the memory footprint of the test by passing smaller input sizes or disabling the particular test on the GPU.
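A generic sketch of those two mitigations (not the actual TorchVision test suite), assuming the `swin_t` builder tolerates smaller spatial sizes thanks to its internal padding:

```python
import pytest
import torch
import torchvision

def test_swin_small_input():
    # shrink the input well below the usual 224x224 to cut peak memory
    model = torchvision.models.swin_t().eval()
    with torch.no_grad():
        out = model(torch.rand(1, 3, 64, 64))
    assert out.shape == (1, 1000)

@pytest.mark.skipif(torch.cuda.is_available(), reason="known OOM on the CI GPU")
def test_swin_full_input_cpu_only():
    # the full-size forward only runs on CPU-only workers
    model = torchvision.models.swin_t().eval()
    with torch.no_grad():
        out = model(torch.rand(1, 3, 224, 224))
    assert out.shape == (1, 1000)
```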
Thank you for sharing the commands. Although you mentioned it previously, I missed that in v2 they used a different resolution. I launched new jobs to train all the variants with the correct resolution, and the results are looking better now.
@ain-soph thank you for your patience and great work! I managed to train all three variants and the results look good. My plan is to update this PR in the next couple of hours with the model weights.
Training commands:
Test commands and accuracies:
I guess the only final thing on our plate is to fix the memory issue. I'll work on it in the following week. Btw, should we offer ported model weights from the official repo as an alternative?
@ain-soph @jdsgomes Awesome work! Having SwinV2 in TorchVision is really awesome. Looking forward to using them.
I think, given we were able to reproduce the accuracy of the paper, there is no point offering both. What would be interesting in the future is to offer higher-accuracy weights by using newer training recipes.
Based on Yosua's previous review of the ML side and my replication of the results, approving. Thanks @ain-soph for your patience and great work.
As soon as the tests are green, we are good to merge.
I reviewed @jdsgomes's last 5 commits and everything LGTM. I would be OK to merge on green CI (if the memory issue persists on the CI, we might need to turn off some of the tests).
Summary:
* init submit
* fix typo
* support ufmt and mypy
* fix 2 unittest errors
* fix ufmt issue
* Apply suggestions from code review
* unify codes
* fix meshgrid indexing
* fix a bug
* fix type check
* add type_annotation
* add slow model
* fix device issue
* fix ufmt issue
* add expect pickle file
* fix jit script issue
* fix type check
* keep consistent argument order
* add support for pretrained_window_size
* avoid code duplication
* a better code reuse
* update window_size argument
* make permute and flatten operations modular
* add PatchMergingV2
* modify expect.pkl
* use None as default argument value
* fix type check
* fix indent
* fix window_size (temporarily)
* remove "v2_" related prefix and add v2 builder
* remove v2 builder
* keep default value consistent with official repo
* deprecate dropout
* deprecate pretrained_window_size
* fix dynamic padding edge case
* remove unused imports
* remove doc modification
* Revert "deprecate dropout" (reverts commit 8a13f93)
* Revert "fix dynamic padding edge case" (reverts commit 1c7579c)
* remove unused kwargs
* add downsample docs
* revert block default value
* revert argument order change
* explicitly specify start_dim
* add small and base variants
* add expect files and slow_models
* Add model weights and documentation for swin v2
* fix lint
* fix end of files line

Reviewed By: datumbox
Differential Revision: D38824237
fbshipit-source-id: d94082c210c26665e70bdf8967ef72cbe3ed4a8a
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Co-authored-by: Joao Gomes <jdsgomes@fb.com>
Fixes #6242