Fix swin backbone absolute pos_embed #8127
Codecov Report
@@           Coverage Diff           @@
##              dev    #8127   +/-   ##
=======================================
  Coverage   64.17%   64.17%
=======================================
  Files         361      361
  Lines       29525    29529    +4
  Branches     5019     5020    +1
=======================================
+ Hits        18947    18950    +3
- Misses       9575     9576    +1
  Partials     1003     1003
please add a unit test in |
This PR needs to be migrated to dev-3.x.
* Fix swin backbone absolute pos_embed resizing
* fix lint
* fix lint
* add unit test
* Update swin.py

Co-authored-by: Cedric Luo <luochunhua1996@outlook.com>
Hi @normster! First of all, we want to express our gratitude for your significant PR in the mmdetection project. Your contribution is highly appreciated, and we are grateful for your efforts in helping improve this open-source project during your personal time. We believe that many developers will benefit from your PR. We would also like to invite you to join our Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel and we will assist you. We look forward to seeing you there! Join us: https://discord.gg/raweFPmdzG If you have WeChat, you are welcome to join our community there. You can add our assistant: openmmlabwx. Please add "mmsig + Github ID" as a remark when adding friends :)
Motivation
When absolute positional embedding is enabled, the Swin Transformer backbone currently cannot process input images whose size differs from the pretraining resolution, because the forward pass does not resize the positional embedding dynamically.
Modification
When absolute positional embedding is enabled, the backbone now checks the input shape against the pos embed shape and, if they differ, resizes the pos embed to the input shape with bicubic interpolation. This is based on the original Mask2Former code: https://github.com/facebookresearch/Mask2Former/blob/main/mask2former/modeling/backbone/swin.py#L656.
The pos embed parameter's shape has also been changed from [B * (H*W) * D] to [B * D * H * W], which makes the model consistent with the checkpoint loading code, which expects the new shape.
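For illustration, here is a minimal sketch of the resizing step described above. It is not the code in swin.py: the class name `AbsPosEmbed`, its arguments, and the leading dimension of 1 on the parameter are assumptions made for this example. It only shows how a pos embed stored as [1, D, H, W] could be interpolated bicubically to the token grid of the current input and then flattened to match [B, H*W, D] tokens.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AbsPosEmbed(nn.Module):
    """Hypothetical helper for illustration; not the mmdetection implementation."""

    def __init__(self, embed_dims, pretrain_hw):
        super().__init__()
        h, w = pretrain_hw
        # Assumption: the parameter is stored channel-first as [1, D, H, W],
        # so it can be passed to F.interpolate without reshaping.
        self.pos_embed = nn.Parameter(torch.zeros(1, embed_dims, h, w))
        nn.init.trunc_normal_(self.pos_embed, std=0.02)

    def forward(self, x, hw_shape):
        # x: patch tokens of shape [B, H*W, D]
        # hw_shape: (H, W) of the token grid for the current input
        pos = self.pos_embed
        if tuple(pos.shape[2:]) != tuple(hw_shape):
            # Resize only when the input grid differs from the stored grid.
            pos = F.interpolate(
                pos, size=tuple(hw_shape), mode='bicubic', align_corners=False)
        # [1, D, H, W] -> [1, H*W, D], broadcast over the batch dimension.
        pos = pos.flatten(2).transpose(1, 2)
        return x + pos


# Usage: pos embed created for a 4x4 grid, applied to a 6x5 token grid.
embed = AbsPosEmbed(embed_dims=8, pretrain_hw=(4, 4))
tokens = torch.zeros(2, 6 * 5, 8)
out = embed(tokens, (6, 5))
print(out.shape)
```

Storing the parameter channel-first means the interpolation call needs no extra permute. When the input grid matches the stored grid, the interpolation is skipped entirely and the parameter is used as-is.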
BC-breaking (Optional)
This fix should not break backwards compatibility. No existing configs using the Swin backbone have absolute positional embedding enabled, or load from
Use cases (Optional)
Checklist