Unused cls_token in PatchEmbeddingBlock #3454
Comments
Thanks for raising the issue. Could you please help double-check it? Thanks in advance.
Hi @night-gale, thanks for your comments. If you use ViT for classification only, then the classification flag needs to be activated. Doing so will enable the use of […]. Originally, ViT was used as the segmentation backbone for UNETR, hence the application needs to be specified. Lastly, […]. Thanks
Hi @ahatamiz! I understand that the cls_token is an essential component of ViT and can be toggled off by passing classification as False. However, the redundant cls_token I found is in the PatchEmbeddingBlock. It is not referenced in the forward method and cannot be turned off by passing an argument. For now I have removed the cls_token in my local copy of MONAI, and everything works fine. It would be great if you could double-check the implementation of PatchEmbeddingBlock. Thanks!
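The local workaround described in this comment can be sketched roughly as follows. This is only an illustration: `ToyPatchEmbedding` is a hypothetical stand-in, not MONAI's PatchEmbeddingBlock, and instead of editing the library source it deletes the parameter from a single instance, which is a different technique from the one the reporter used.

```python
import torch
import torch.nn as nn


class ToyPatchEmbedding(nn.Module):
    """Hypothetical stand-in for a block with an unused parameter (not MONAI code)."""

    def __init__(self, in_dim: int = 16, hidden: int = 8, n_patches: int = 4):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden)
        self.position_embeddings = nn.Parameter(torch.zeros(1, n_patches, hidden))
        # Registered as a parameter but never read in forward().
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x) + self.position_embeddings


block = ToyPatchEmbedding()
names_before = [name for name, _ in block.named_parameters()]

# nn.Module.__delattr__ unregisters the parameter from this instance only.
del block.cls_token
names_after = [name for name, _ in block.named_parameters()]

# forward() never touched cls_token, so the block still runs unchanged.
out = block(torch.randn(2, 4, 16))
print(names_before, names_after, tuple(out.shape))
```

Deleting the attribute before wrapping the model in DistributedDataParallel means the parameter is never registered for gradient reduction, so it should not trigger the error reported in this issue. A checkpoint saved from the unmodified block would still contain a cls_token entry, so loading it afterwards would need strict=False.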
Hi @night-gale, thanks for pointing out the issue. I see that there is an unused cls_token in here. Thanks
Hi @Nic-Ma, thanks for the efforts. I would appreciate it if this could be addressed in a future PR. Thanks.
Hi @ahatamiz, OK, sure, I will fix it in a PR soon. Thanks.
Describe the bug
When I was training the ViT with torch DistributedDataParallel, during backward, torch raises an error and reports that […], which means that the cls_token did not participate in the backward process. I checked the implementation of ViT and PatchEmbeddingBlock and found the unused cls_token in monai.networks.blocks.patchembedding.py: PatchEmbeddingBlock.

To Reproduce
Steps to reproduce the behavior:
1. Set TORCH_DISTRIBUTED_DEBUG=INFO
2. Train ViT with torch DistributedDataParallel
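
One way to confirm which parameters "did not participate in the backward process" without launching a distributed job is to run a single forward/backward pass and list the trainable parameters whose gradient is still None. The sketch below assumes a hypothetical `ToyPatchEmbedding` module that mimics the reported pattern; it is not MONAI's implementation, and the helper name `parameters_without_grad` is made up for illustration.

```python
import torch
import torch.nn as nn


class ToyPatchEmbedding(nn.Module):
    """Hypothetical module with one parameter that forward() never uses."""

    def __init__(self, in_dim: int = 16, hidden: int = 8, n_patches: int = 4):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden)
        self.position_embeddings = nn.Parameter(torch.zeros(1, n_patches, hidden))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden))  # unused below

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x) + self.position_embeddings


def parameters_without_grad(model: nn.Module, example_input: torch.Tensor) -> list:
    """Run one forward/backward pass and return the names of trainable
    parameters whose .grad is still None afterwards."""
    model.zero_grad(set_to_none=True)
    model(example_input).sum().backward()
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


unused = parameters_without_grad(ToyPatchEmbedding(), torch.randn(2, 4, 16))
print(unused)
```

For this toy module the list contains only cls_token. DistributedDataParallel also accepts find_unused_parameters=True in its constructor, which tolerates such parameters at some extra cost per iteration; removing the unused parameter avoids the need for that flag.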