Weight tying is broken on TPUs leading to silent errors #2705
Comments
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!
@zcain117, might be worth checking. It is a subtle issue and users are likely going to miss it.
Most probably yes. It needs a small API change to support the …
@Borda will pick this up :) 👍
🐛 Bug
The PyTorch/XLA documentation mentions here that weight tying should happen after moving tensors to the XLA device; otherwise the tensors are copied. This is a silent error that can easily go undetected (thanks to @matt-peters for pointing it out), and it would be good if PL guarded the user against it. Note that weight tying is pretty common in today's models, not a corner case.
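A minimal sketch of the failure mode, assuming a torch_xla environment (the `TiedModel` below and its tying pattern are illustrative, not taken from the original report):

```python
import torch.nn as nn
import torch_xla.core.xla_model as xm


class TiedModel(nn.Module):
    """Toy language-model head with tied input/output embeddings."""

    def __init__(self, vocab_size=1000, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.Linear(hidden, vocab_size, bias=False)
        self.decoder.weight = self.embed.weight  # tie before the move


# Tying happened in __init__, i.e. before the move; on XLA the shared
# parameter can end up as two independent device tensors, and the two
# modules silently train separate weights.
model = TiedModel().to(xm.xla_device())

# Guard: re-establish the tie after moving to the XLA device.
model.decoder.weight = model.embed.weight
```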
Code sample
The following code snippet shows how to detect that this issue is happening and how to guard against it.
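The original snippet does not survive in this excerpt; below is a minimal sketch of one way to do the check, reusing the hypothetical `TiedModel` from above. It relies on the fact that `nn.Module.parameters()` yields each shared `Parameter` object only once, so a broken tie shows up as an increased parameter count after the move:

```python
def num_unique_params(module: nn.Module) -> int:
    # parameters() deduplicates shared Parameter objects, so two
    # modules holding the *same* tied weight contribute it only once.
    return sum(1 for _ in module.parameters())


model = TiedModel()
count_before = num_unique_params(model)

model = model.to(xm.xla_device())
if num_unique_params(model) != count_before:
    # The move copied the tied weight into separate device tensors;
    # guard against silent divergence by re-tying on the device.
    model.decoder.weight = model.embed.weight

assert num_unique_params(model) == count_before
```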