Adding the huge vision transformer from SWAG #5721
Conversation
…dle case where number of param differ from default due to different image size input
…accidentally deleted)
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
💊 CI failures summary and remediations

As of commit e4062f4 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
As discussed offline, let's measure how much extra time this adds to our CI. We should also check that, after adding it, our CI remains stable across runs. We previously tried to add this model (see #5210), but it was then reverted (see #5259) due to breakages at #5254. So if the tests pass now, we should make sure they are stable and won't break our CI again.
@datumbox I have rerun the test 5 times and all runs were successful, which indicates it is stable now.
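For anyone reproducing the 5x rerun locally, the loop can be sketched roughly like this. `run_test` is a placeholder; swap in the real invocation (likely something along the lines of `pytest test/test_models.py -k vit_h_14` — the exact node id is an assumption, not taken from this thread):

```shell
#!/bin/sh
# Placeholder for the real test command; replace with the actual pytest call.
run_test() { true; }

pass=0
for i in 1 2 3 4 5; do
    run_test || break
    pass=$((pass + 1))
done
echo "passed $pass/5 runs"
```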
Note: the additional time added on the other platforms is usually less than on the Windows platform. In my opinion this addition is still reasonable; in fact it is quite similar to the existing model regnet_y_128gf in terms of test time.
This is the validation script and final output:
@YosuaMichael Thanks for the analysis. Have you tried reducing the size of the input to speed up the execution? We do this for multiple models and even have a list of models that get their input reduced (you will need to update the expected file): Line 285 in 61f8266
Nevertheless, because ViT must be initialized for a specific size, you might need to add a record to the
"vit_h_14": {
    "image_size": 56,
    "input_shape": (1, 3, 56, 56),
},
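A rough sketch of how a per-model override record like the one above might be consumed by a test helper. The dict and function names here are hypothetical illustrations, not torchvision's actual test API:

```python
# Hypothetical per-model override table, mirroring the record above.
model_test_overrides = {
    "vit_h_14": {"image_size": 56, "input_shape": (1, 3, 56, 56)},
}

def get_test_kwargs(model_name, defaults):
    # Merge model-specific overrides on top of the default test kwargs.
    return {**defaults, **model_test_overrides.get(model_name, {})}

defaults = {"input_shape": (1, 3, 224, 224)}
print(get_test_kwargs("vit_h_14", defaults)["input_shape"])   # -> (1, 3, 56, 56)
print(get_test_kwargs("resnet18", defaults)["input_shape"])   # -> (1, 3, 224, 224)
```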
@datumbox this is according to your suggestion of changing the input image_size for the test of the vit_h_14 model to speed it up. The image_size needs to be a multiple of the patch_size, which is 14, hence we use an image_size of 56.
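The patch-size constraint can be expressed as a tiny helper. This is just an illustration of the rule, not code from the PR:

```python
def smallest_valid_image_size(patch_size: int, min_size: int = 56) -> int:
    """Round min_size up to the nearest multiple of patch_size."""
    return ((min_size + patch_size - 1) // patch_size) * patch_size

# vit_h_14 uses patch_size=14, so 56 (= 4 * 14) is a valid reduced size.
print(smallest_valid_image_size(14))      # -> 56
print(smallest_valid_image_size(16, 56))  # -> 64
```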
After reducing the image_size there is a speedup, although not a large one: around 1.5s-2s for each of the GPU and CPU tests of the model.
Very interesting. Does this mean that the majority of the time is spent on the model initialization or on the JIT-script parsing?
@datumbox I did a bit of profiling locally and here are the results:
[2022-04-04 16:48:41.768663] Before building the model
[2022-04-04 16:48:45.000341] After building model
[2022-04-04 16:48:45.002153] After model.eval().to(device=dev)
[2022-04-04 16:48:45.207033] After doing model(x)
[2022-04-04 16:48:45.208375] After assert expected
[2022-04-04 16:48:45.208385] After assert shape num_classes
[2022-04-04 16:48:50.452526] After check_jit_scriptable
[2022-04-04 16:48:50.667378] After check_fx_compatible
[2022-04-04 16:48:51.256744] Finish
From these timestamps, around 34% of the time (~3.2s of ~9.5s) is spent building the model, and another ~55% (~5.2s) on check_jit_scriptable.
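For reference, a per-stage breakdown like the one above could be produced with a small timing helper such as the following. This is a sketch, not the actual profiling code used in the PR; `time.sleep` stands in for the real test steps:

```python
import time
from contextlib import contextmanager

durations = {}

@contextmanager
def stage(name):
    # Record wall-clock time spent inside each named stage.
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[name] = time.perf_counter() - start

# Stand-ins for the real steps (building the model, JIT scripting, ...):
with stage("build_model"):
    time.sleep(0.02)
with stage("check_jit_scriptable"):
    time.sleep(0.03)

total = sum(durations.values())
for name, dt in durations.items():
    print(f"{name}: {dt:.3f}s ({100 * dt / total:.0f}%)")
```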
LGTM, thanks!
Summary:

* Add vit_b_16_swag
* Better handling idiom for image_size, edit test_extended_model to handle case where number of params differs from default due to different image size input
* Update the accuracy to the experiment result on torchvision model
* Fix typo missing underscore
* Raise exception instead of torch._assert, add back publication year (accidentally deleted)
* Add license information on meta and readme
* Improve wording and fix typo for pretrained model license in readme
* Add vit_l_16 weight
* Update README.rst
* Update the accuracy meta on vit_l_16_swag model to result from our experiment
* Add vit_h_14_swag model
* Add accuracy from experiments
* Add vit_h_16 model to hubconf.py
* Add docs and expected pkl file for test
* Remove legacy compatibility for ViT_H_14 model
* Test vit_h_14 with smaller image_size to speed up the test

(Note: this ignores all push blocking failures!)

Reviewed By: jdsgomes, NicolasHug

Differential Revision: D36095649

fbshipit-source-id: 639dab0577088e18e1bcfa06fd1f01be20c3fd44

Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Part of #5708
However, we separate this PR from #5714 because the huge model might break the CI.
The purpose of this PR is just to check whether or not the CI is okay with the huge model.