Fix SmoothQuant offload bug #978
Signed-off-by: Dipika <dipikasikka1@gmail.com>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.
@dsikka Apparently PyTorch does not throw errors for in-place operations on meta-device tensors.
a = torch.rand(10, device="meta")
# tensor(..., device='meta', size=(10,))
a.div_(6)
# no error
# tensor(..., device='meta', size=(10,))
Since the |
Yeah, same behaviour I saw. I'll update it for the actual case when smoothing is applied as well.
LGTM
Grepped for .weight, lgtm!
Once accelerate utilities land, we can replace these with align_module_device
* fix offload
* fix smoothquant offload bug
* remove logtime
Signed-off-by: Dipika <dipikasikka1@gmail.com>
SUMMARY:
We may not actually need this change.
The issue is caused by the following line:
i.e.
y / x
However, we use
y.div_(x)
when actually applying the smoothing scales, which operates in place and does not cause an issue. Not 100% sure why one is OK and the other is not.
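As a minimal sketch of the difference between the two forms discussed in this thread, the snippet below compares out-of-place and in-place division on meta-device tensors. The tensors `y` and `x` are stand-ins chosen for illustration, not the actual SmoothQuant weights or scales, and the snippet does not reproduce the offload bug itself. It only shows that `y / x` builds a new tensor while `y.div_(x)` mutates and returns `y`, and that neither raises on the meta device.

```python
import torch

# Meta tensors carry shape and dtype but no data.
# y and x are illustrative placeholders, not the real weights or scales.
y = torch.rand(10, device="meta")
x = torch.rand(10, device="meta")

out_of_place = y / x    # builds a new tensor; y is left untouched
in_place = y.div_(x)    # mutates y and returns y itself; no error on meta

print(out_of_place is y)  # False
print(in_place is y)      # True
print(y.device)           # meta
```

Because the in-place call succeeds silently on a meta tensor, a missing onload of offloaded weights would not surface as an exception at this point.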