MaxVit model #6342
Conversation
@TeodorPoncu That was an unexpected surprise, thanks a lot for contributing this architecture.
I know that your PR is a draft, but I've added a few remarks, mostly related to our coding styles and practices. I haven't verified the ML side of things. Feel free to ignore this if you think my input is premature.
@TeodorPoncu It seems that in a recent commit, you accidentally updated all the expected files for all models. Could you please revert that?
This reverts commit c5b2839.
@datumbox Sorry about that, everything should be fine now.
Related discussion and pointers on generalizing fixed resolution for Swin: #6227. Also, I wonder whether more relative-attention-related modules can be reused from Swin.
torchvision/models/maxvit.py (outdated)

self.register_buffer("relative_position_index", get_relative_position_index(self.size, self.size))

# initialize with truncated normal the bias
self.positional_bias.data.normal_(mean=0, std=0.02)
shouldn't https://pytorch.org/docs/stable/nn.init.html?highlight=nn%20init#torch.nn.init.normal_ be used here?
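For reference, a minimal sketch of what that would look like (the shape below is an arbitrary placeholder, not the one used in the PR):

```python
import torch
import torch.nn as nn

bias = nn.Parameter(torch.zeros(169, 4))  # placeholder shape, for illustration only

# Same effect as bias.data.normal_(mean=0, std=0.02), but via the public init API:
nn.init.normal_(bias, mean=0.0, std=0.02)

# The inline comment above says "truncated normal"; if that is the intent,
# this is the closer match:
nn.init.trunc_normal_(bias, mean=0.0, std=0.02)
```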
torchvision/models/maxvit.py (outdated)

self.scale_factor = feat_dim**-0.5

self.merge = nn.Linear(self.head_dim * self.n_heads, feat_dim)
self.positional_bias = nn.parameter.Parameter(
If it's initialized with a normal distribution just below, shouldn't torch.empty be used here?
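Something along these lines, I think (names and shape are placeholders, not the PR's actual values):

```python
import torch
import torch.nn as nn

# torch.empty only allocates memory; since the values are overwritten by the
# normal initialization immediately afterwards, zero-filling first is wasted work.
positional_bias = nn.parameter.Parameter(torch.empty(169, 4))  # placeholder shape
nn.init.normal_(positional_bias, mean=0.0, std=0.02)
```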
torchvision/models/maxvit.py (outdated)

def get_relative_positional_bias(self) -> torch.Tensor:
    bias_index = self.relative_position_index.view(-1)  # type: ignore
    relative_bias = self.positional_bias[bias_index].view(self.max_seq_len, self.max_seq_len, -1)  # type: ignore
If it's just for flattening some end dimensions, .flatten(start_dim=2) can be used here.
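A quick standalone check of the equivalence (arbitrary example shape):

```python
import torch

x = torch.randn(49, 49, 2, 2)  # arbitrary example shape

# Both collapse every dimension from index 2 onwards into one;
# flatten(start_dim=2) just avoids spelling out the target shape by hand.
a = x.reshape(49, 49, -1)
b = x.flatten(start_dim=2)
assert torch.equal(a, b)
```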
self.b = b

def forward(self, x: torch.Tensor) -> torch.Tensor:
    res = torch.swapaxes(x, self.a, self.b)
swapaxes is a NumPy-compatibility alias for torch.transpose (https://pytorch.org/docs/stable/generated/torch.swapaxes.html#torch.swapaxes). If the rest is in Torch lingo, shouldn't it be the following (arg names chosen to match https://pytorch.org/docs/stable/generated/torch.transpose.html#torch.transpose):
import torch
from torch import nn

class SwapAxes(nn.Module):
    def __init__(self, dim0: int, dim1: int) -> None:
        super().__init__()
        self.dim0 = dim0
        self.dim1 = dim1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x.transpose(self.dim0, self.dim1)
Thanks for the PR @TeodorPoncu. I've added some initial comments on the reference script updates. Happy to chat more.
Running the deployed weights with the following command:
Yields the following results:
@TeodorPoncu pretty awesome work and top-quality code.
I've left a few nits here and there, just to make sure that the implementation code aligns with the idioms of the rest of TorchVision. The only comment worth considering a bit more is the one concerning the number of parameters you pass to the constructor. This comment is mostly about aligning MaxViT with all other models and making it easier in the future to make changes to all of them. In some cases you expose parameters that are indeed quite useful (number of input channels), but such API changes would be best introduced in all models, not just MaxViT.
Other than the above, I didn't validate the architecture part much but focused mainly on idioms and code style. I know that you've already reproduced the accuracy of the paper, which is great, but it's worth doing one scan with @jdsgomes prior to merging to confirm that there are no deviations from the original implementation (for example, on padding of inputs whose spatial dimensions are not divisible by p).
All CI failures seem unrelated. Other than these final validations and nits, I think the implementation is in awesome shape and we should be able to merge it soon.
from torchvision.models.maxvit import SwapAxes, WindowDepartition, WindowPartition


class MaxvitTester(unittest.TestCase):
I understand that here you are testing specific layers from MaxViT. This is not something we did previously, so perhaps it does need to be in a separate file.
@YosuaMichael any thoughts here?
Sorry for the pretty late response!
Currently I don't have any opinion on how we should test specific layers of the model, and I think this is okay. (I need more time to think and discuss whether we should do more of this kind of test or not.)
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Two more requests:
Thanks for all the changes, a few follow-ups:
Did a final pass to review the ML side and it looks good. Happy to approve after the pending requested changes are done. Really nice work!
…sion into BATERIES]-add-max-vit
Hey @TeodorPoncu! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
[transforms.RandomResizedCrop(crop_size, interpolation=interpolation)]
if center_crop
else [transforms.CenterCrop(crop_size)]
@TeodorPoncu I think this is a bug. I believe you meant to write:
[transforms.CenterCrop(crop_size)]
if center_crop
else [transforms.RandomResizedCrop(crop_size, interpolation=interpolation)]
Could you please confirm?
Edit: I issued a fix at #6642
Yes, sorry for that. You've correctly guessed what I wanted to write. Thanks for catching it. I think the --train-center-crop flag should be removed from the training command docs as well, to reflect the way the weights were trained.
@TeodorPoncu thanks for coming back to me. Does this mean that you didn't actually use the flag during training? Can we remove it?
Yes, the flag can be removed.
That's not what I see in the training log of the trained model. I see that train_center_crop=True. Do we have the right model available in the checkpoint area on AWS?
Yes, but given the fix, in order to replicate it one will have to run with train_center_crop=False, so as to have the same preprocessing behavior during training as the AWS weights had.
OK, so you suspect that this bug was introduced quite early, right? Do you happen to know roughly which git hash you used to train this? I can have a look for you if you give me a rough estimate or range of git hashes.
Yes, the bug was introduced and used when performing the training as in 1fddecc
Agreed. I checked all commits prior to f561edf (dated before the training) and all of them use RandomCrop. I'll remove the flag.
Summary:
* Added maxvit architecture and tests
* rebased + addresed comments
* Revert "rebased + addresed comments". This reverts commit c5b2839.
* Re-added model changes after revert
* aligned with partial original implementation
* removed submitit script fixed lint
* mypy fix for too many arguments
* updated old tests
* removed per batch lr scheduler and seed setting
* removed ontap
* added docs, validated weights
* fixed test expect, moved shape assertions in the begging for torch.fx compatibility
* mypy fix
* lint fix
* added legacy interface
* added weight link
* updated docs
* Update references/classification/train.py
* Update torchvision/models/maxvit.py
* adressed comments
* update ra_maginuted and augmix_severity default values
* adressed some comments
* remove input_channels parameter

Reviewed By: YosuaMichael
Differential Revision: D39885422
fbshipit-source-id: c51942974bf17f6473c3b3b08a4d16aad5812dc3
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
This PR is w.r.t. the Batteries Phase 3 proposal to add the MaxVit architecture. It is still a work in progress, as it has yet to be trained.
One caveat w.r.t. the way we would be exposing this model API to users is that the architecture is bound to the specific input size it was trained on (due to the usage of relative positional encodings).
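To illustrate the caveat, here is a rough standalone sketch (not the PR's actual code; names, shapes, and the single attention head are placeholders) of how a relative position bias table is sized from the window size used at training time, which is what ties the weights to one resolution:

```python
import torch

window_size = 7  # example value; everything below is sized from it

# One bias entry per possible relative offset (and per attention head; 1 head here).
num_rel_positions = (2 * window_size - 1) ** 2            # 13 * 13 = 169
bias_table = torch.zeros(num_rel_positions, 1)

# Pairwise relative coordinates between all positions inside one window.
coords = torch.stack(
    torch.meshgrid(torch.arange(window_size), torch.arange(window_size), indexing="ij")
).flatten(1)                                              # (2, 49)
relative = coords[:, :, None] - coords[:, None, :]        # (2, 49, 49)
relative = relative + (window_size - 1)                   # shift offsets to be >= 0
index = relative[0] * (2 * window_size - 1) + relative[1] # (49, 49) lookup indices

bias = bias_table[index.flatten()].view(window_size**2, window_size**2, -1)
print(bias.shape)  # torch.Size([49, 49, 1]) -- fixed by the training-time window size
```

A different input resolution changes this partition geometry, so the stored bias tables and index buffers no longer line up with the new sequence lengths.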
Running the command:
torchrun --nproc_per_node=1 train.py --test-only --prototype --weights MaxVit_T_Weights.IMAGENET1K_V1 --model maxvit_t -b 1
yields the following results:
Test: Acc@1 83.700 Acc@5 96.722