Just a quick update: I have started working on this. (Sadly, my technical knowledge needed a bit of a refresher, thanks to Java and other tech work.) I have read through the ViT and SwinTransformer papers. I will go through the video variant over the next 2 days, verify the implementation, and open a PR.
🚀 The feature
The main idea is to port the SwinTransformer3d model from torchmultimodal to torchvision.
We need to keep in mind the nuances and code structure of torchvision.
https://github.com/facebookresearch/multimodal/blob/main/torchmultimodal/modules/encoders/swin_transformer_3d_encoder.py
https://github.com/facebookresearch/multimodal/blob/main/examples/omnivore/LoadOriginalPretrainedWeightAndCompare.ipynb
We need to port the implementation as well as the weights.
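As a rough illustration of the weight-porting step, here is a minimal sketch of renaming checkpoint keys and checking them against the keys the target model expects. It works on any mapping, such as the one returned by a module's `state_dict()`. The helper names `remap_state_dict_keys` and `report_key_mismatch`, and every key and prefix in the example, are hypothetical and not taken from torchmultimodal or torchvision.

```python
from typing import Any, Dict, Iterable, List, Mapping, Tuple


def remap_state_dict_keys(
    state_dict: Mapping[str, Any], prefix_map: Mapping[str, str]
) -> Dict[str, Any]:
    """Return a copy of state_dict with key prefixes renamed per prefix_map."""
    # Try longer prefixes first so the most specific rule wins.
    rules = sorted(prefix_map.items(), key=lambda kv: -len(kv[0]))
    remapped: Dict[str, Any] = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in rules:
            if key.startswith(old):
                new_key = new + key[len(old):]
                break
        if new_key in remapped:
            raise ValueError(f"duplicate key after remapping: {new_key}")
        remapped[new_key] = value
    return remapped


def report_key_mismatch(
    ported_keys: Iterable[str], target_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) keys relative to the target model."""
    ported, target = set(ported_keys), set(target_keys)
    return sorted(target - ported), sorted(ported - target)


# Hypothetical keys and prefixes, for illustration only.
source = {
    "encoder.patch_embed.proj.weight": 1,
    "encoder.norm.weight": 2,
    "head.bias": 3,
}
prefix_map = {"encoder.patch_embed.": "patch_embed.", "encoder.": ""}
ported = remap_state_dict_keys(source, prefix_map)
missing, unexpected = report_key_mismatch(
    ported, ["patch_embed.proj.weight", "norm.weight", "head.weight"]
)
```

Once the keys line up, the usual next step is to load the remapped weights into the ported model and compare its outputs with the original model on the same input, for example with `torch.testing.assert_close`.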
Motivation, pitch
The idea is to first port SwinTransformer3dV1 along with its weights. Once that is done, we can think about adding SwinTransformer3dV2. There is no such paper or implementation, but it may bring benefits similar to those seen in the 2D case.
Alternatives
No response
Additional context
Additionally, in a discussion with @YosuaMichael, it came up that the paper mentions SwinTransformerV2 can be used for object detection tasks. If possible, we should explore that, but only after we finish the items above.