Adding the huge vision transformer from SWAG #5721
Conversation
…dle case where number of param differ from default due to different image size input
…accidentally deleted)
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
💊 CI failures summary and remediations

As of commit e4062f4 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
As discussed offline, let's measure how much extra time this adds to our CI. We should also check that, after adding it, our CI remains stable across runs. We previously tried to add this model (see #5210), but it was then reverted (see #5259) due to breakages at #5254. So if the tests pass now, we should make sure they are stable and won't break our CI again.
@datumbox I have rerun the test 5 times and all runs were successful, which indicates it is stable now.
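For anyone reproducing the 5x rerun locally, the loop can be sketched roughly like this. `run_test` is a placeholder; swap in the real invocation (likely something along the lines of `pytest test/test_models.py -k vit_h_14` — the exact node id is an assumption, not taken from this thread):

```shell
#!/bin/sh
# Placeholder for the real test command; replace with the actual pytest call.
run_test() { true; }

pass=0
for i in 1 2 3 4 5; do
    run_test || break
    pass=$((pass + 1))
done
echo "passed $pass/5 runs"
```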
Note: the additional time added on the other platforms is usually less than on the Windows platform. In my opinion this addition is still reasonable; in fact it is quite similar to the existing model regnet_y_128gf in terms of test time.
This is the validation script and final output:
@YosuaMichael Thanks for the analysis. Have you tried reducing the size of the input to speed up the execution? We do this for multiple models and even have a list of models that get their input reduced (you will need to update the expected file): Line 285 in 61f8266
Nevertheless, because ViT must be initialized for a specific size, you might need to add a record to the
"vit_h_14": {
    "image_size": 56,
    "input_shape": (1, 3, 56, 56),
},
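A rough sketch of how a per-model override record like the one above might be consumed by a test helper. The dict and function names here are hypothetical illustrations, not torchvision's actual test API:

```python
# Hypothetical per-model override table, mirroring the record above.
model_test_overrides = {
    "vit_h_14": {"image_size": 56, "input_shape": (1, 3, 56, 56)},
}

def get_test_kwargs(model_name, defaults):
    # Merge model-specific overrides on top of the default test kwargs.
    return {**defaults, **model_test_overrides.get(model_name, {})}

defaults = {"input_shape": (1, 3, 224, 224)}
print(get_test_kwargs("vit_h_14", defaults)["input_shape"])   # -> (1, 3, 56, 56)
print(get_test_kwargs("resnet18", defaults)["input_shape"])   # -> (1, 3, 224, 224)
```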
@datumbox this is according to your suggestion of changing the input image_size for the test of the vit_h_14 model to speed it up. The image_size needs to be a multiple of the patch_size, which is 14, hence we use an image_size of 56.
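The patch-size constraint can be expressed as a tiny helper. This is just an illustration of the rule, not code from the PR:

```python
def smallest_valid_image_size(patch_size: int, min_size: int = 56) -> int:
    """Round min_size up to the nearest multiple of patch_size."""
    return ((min_size + patch_size - 1) // patch_size) * patch_size

# vit_h_14 uses patch_size=14, so 56 (= 4 * 14) is a valid reduced size.
print(smallest_valid_image_size(14))      # -> 56
print(smallest_valid_image_size(16, 56))  # -> 64
```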
After reducing the image_size there is a speedup, although not a large one: around 1.5s-2s for each of the GPU and CPU tests of the model.
Very interesting. Does this mean that the majority of the time is spent on the model initialization or on the JIT-script parsing?
@datumbox I did a bit of profiling locally and here are the results:
[2022-04-04 16:48:41.768663] Before building the model
[2022-04-04 16:48:45.000341] After building model
[2022-04-04 16:48:45.002153] After model.eval().to(device=dev)
[2022-04-04 16:48:45.207033] After doing model(x)
[2022-04-04 16:48:45.208375] After assert expected
[2022-04-04 16:48:45.208385] After assert shape num_classes
[2022-04-04 16:48:50.452526] After check_jit_scriptable
[2022-04-04 16:48:50.667378] After check_fx_compatible
[2022-04-04 16:48:51.256744] Finish
From these timestamps, around 34% of the time (~3.2s of ~9.5s) is spent building the model, and another ~55% (~5.2s) on check_jit_scriptable.
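For reference, a per-stage breakdown like the one above could be produced with a small timing helper such as the following. This is a sketch, not the actual profiling code used in the PR; `time.sleep` stands in for the real test steps:

```python
import time
from contextlib import contextmanager

durations = {}

@contextmanager
def stage(name):
    # Record wall-clock time spent inside each named stage.
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[name] = time.perf_counter() - start

# Stand-ins for the real steps (building the model, JIT scripting, ...):
with stage("build_model"):
    time.sleep(0.02)
with stage("check_jit_scriptable"):
    time.sleep(0.03)

total = sum(durations.values())
for name, dt in durations.items():
    print(f"{name}: {dt:.3f}s ({100 * dt / total:.0f}%)")
```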
LGTM, thanks!
Summary:

* Add vit_b_16_swag
* Better handling idiom for image_size, edit test_extended_model to handle case where number of params differs from default due to different image size input
* Update the accuracy to the experiment result on torchvision model
* Fix typo missing underscore
* Raise exception instead of torch._assert, add back publication year (accidentally deleted)
* Add license information on meta and readme
* Improve wording and fix typo for pretrained model license in readme
* Add vit_l_16 weight
* Update README.rst
* Update the accuracy meta on vit_l_16_swag model to result from our experiment
* Add vit_h_14_swag model
* Add accuracy from experiments
* Add vit_h_16 model to hubconf.py
* Add docs and expected pkl file for test
* Remove legacy compatibility for ViT_H_14 model
* Test vit_h_14 with smaller image_size to speed up the test

(Note: this ignores all push blocking failures!)

Reviewed By: jdsgomes, NicolasHug

Differential Revision: D36095649

fbshipit-source-id: 639dab0577088e18e1bcfa06fd1f01be20c3fd44

Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Part of #5708
However, we separate this PR from #5714 because the huge model might break the CI.
The purpose of this PR is just to check whether or not the CI is okay with the huge model.