Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SmoothQuant offload bug #978

Merged
merged 5 commits into from
Dec 18, 2024
Merged

Fix SmoothQuant offload bug #978

merged 5 commits into from
Dec 18, 2024

Conversation

dsikka
Copy link
Collaborator

@dsikka dsikka commented Dec 15, 2024

SUMMARY:

  • Offload/onload properly during SmoothQuant calibration

We may actually not need this change?
The issue is because of the following line:

scales = activation_scales.pow(self.smoothing_strength) / weight_scales.pow(
            1 - self.smoothing_strength
        )

i.e y/x

However, we use y.div_(x) when actually applying the smoothing scales, which does it in place and does not cause an issue.

Not 100% sure why one is ok and one is not

Signed-off-by: Dipika <dipikasikka1@gmail.com>
Copy link

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

@dsikka dsikka marked this pull request as draft December 15, 2024 18:24
@dsikka dsikka requested a review from kylesayrs December 16, 2024 16:38
@kylesayrs
Copy link
Collaborator

@dsikka Apparently pytorch does not throw errors for inplace operations on meta-device tensors.

a = torch.rand(10, device="meta")
# tensor(..., device='meta', size=(10,))

a.div_(6)
# no error
# tensor(..., device='meta', size=(10,))

Since the _apply_smoothing doesn't take this into account, it's likely that this function silently fails on offloaded weights

@dsikka
Copy link
Collaborator Author

dsikka commented Dec 16, 2024

@dsikka Apparently pytorch does not throw errors for inplace operations on meta-device tensors.

a = torch.rand(10, device="meta")
# tensor(..., device='meta', size=(10,))

a.div_(6)
# no error
# tensor(..., device='meta', size=(10,))

Since the _apply_smoothing doesn't take this into account, it's likely that this function silently fails on offloaded weights

Yeah same behaviour I saw. I'll update it for the actual case when smoothing is applied as well.

@dsikka dsikka marked this pull request as ready for review December 18, 2024 17:38
Copy link
Collaborator

@rahul-tuli rahul-tuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@kylesayrs kylesayrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grepped for .weight, lgtm!
Once accelerate utilities land, we can replace these with align_module_device

@dsikka dsikka merged commit 8ea9617 into main Dec 18, 2024
6 of 7 checks passed
@dsikka dsikka deleted the fix_smoothquant branch December 18, 2024 19:42
horheynm pushed a commit that referenced this pull request Dec 20, 2024
* fix offload

Signed-off-by: Dipika <dipikasikka1@gmail.com>

* fix smoothquant offload bug

* remove logtime

---------

Signed-off-by: Dipika <dipikasikka1@gmail.com>
horheynm pushed a commit that referenced this pull request Dec 20, 2024
* fix offload

Signed-off-by: Dipika <dipikasikka1@gmail.com>

* fix smoothquant offload bug

* remove logtime

---------

Signed-off-by: Dipika <dipikasikka1@gmail.com>
horheynm pushed a commit that referenced this pull request Dec 20, 2024
* fix offload

Signed-off-by: Dipika <dipikasikka1@gmail.com>

* fix smoothquant offload bug

* remove logtime

---------

Signed-off-by: Dipika <dipikasikka1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants