
GPTQ - move calibration of quantization params to after hessian calibration #25

Merged
bfineran merged 1 commit into main from gptq-delay-quantization on Jul 22, 2024

Conversation

bfineran
Contributor

Fixes a small bug in the GPTQ implementation where scales and zero points were calculated before the Hessian calibration passes. This is likely the cause of the small drift in results we noticed in the implementation.

Additionally, this change is important for the activation reordering feature: delaying the update until after the Hessian is computed lets us set the scales and zero points easily with respect to the permuted weight matrix.
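
For illustration, here is a minimal sketch of the reordered flow this PR describes: accumulate the Hessian from calibration passes first, optionally permute columns for activation reordering, and only then compute scales and zero points. The function and variable names (`gptq_quantize_layer`, the grouped min/max scale rule, etc.) are hypothetical and not the llm-compressor API.

```python
# Hypothetical sketch of the reordered GPTQ flow; names such as
# gptq_quantize_layer and the grouped min/max scale computation are
# illustrative assumptions, not the actual llm-compressor implementation.
import torch


def gptq_quantize_layer(weight: torch.Tensor, calibration_inputs,
                        num_bits: int = 4, group_size: int = 128,
                        act_order: bool = False):
    # 1) Run calibration forward passes first, accumulating the Hessian H ~ 2 * X^T X
    hessian = torch.zeros(weight.shape[1], weight.shape[1])
    n_samples = 0
    for x in calibration_inputs:            # x: (batch, in_features)
        x = x.reshape(-1, weight.shape[1])
        hessian += 2.0 * x.T @ x
        n_samples += x.shape[0]
    hessian /= max(n_samples, 1)

    # 2) Optionally permute columns by decreasing Hessian diagonal (activation reordering)
    if act_order:
        perm = torch.argsort(torch.diag(hessian), descending=True)
        weight = weight[:, perm]
        hessian = hessian[perm][:, perm]

    # 3) Only now compute quantization scales / zero points, with respect to the
    #    (possibly permuted) weight matrix -- the ordering this PR moves to.
    qmax = 2 ** num_bits - 1
    scales, zero_points = [], []
    for start in range(0, weight.shape[1], group_size):
        group = weight[:, start:start + group_size]
        w_min = group.min(dim=1, keepdim=True).values.clamp(max=0)
        w_max = group.max(dim=1, keepdim=True).values.clamp(min=0)
        scale = (w_max - w_min).clamp(min=1e-8) / qmax
        zero_points.append(torch.round(-w_min / scale))
        scales.append(scale)

    return hessian, scales, zero_points
```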

@bfineran bfineran requested review from Satrat and horheynm July 18, 2024 16:10
@bfineran bfineran self-assigned this Jul 18, 2024
Comment on lines +261 to +265

# quantization scales and zp are already initialized but we do not
# want to calibrate wrt to these
self.model.apply(disable_quantization)

Contributor


One thought here: in the case where we have activation quantization and GPTQ on the weights, we will end up running calibration twice, and I don't think we need to anymore. Since we aren't using the scale/zp during the Hessian calculation, maybe we should try to set them in the same calibration pass as the Hessian.

Not a requirement for now but just something to think about for the future.
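
For illustration, a minimal sketch of this fused-pass idea, assuming a simple min/max activation observer; the function name `single_pass_calibration` and the observer logic are hypothetical, not the llm-compressor API.

```python
# Hypothetical sketch of gathering activation-quantization statistics and the
# GPTQ Hessian in a single calibration pass instead of two separate ones.
import torch


def single_pass_calibration(layer: torch.nn.Linear, calibration_inputs):
    hessian = torch.zeros(layer.in_features, layer.in_features)
    act_min = torch.tensor(float("inf"))
    act_max = torch.tensor(float("-inf"))
    n_samples = 0

    for x in calibration_inputs:            # x: (batch, in_features)
        x = x.reshape(-1, layer.in_features)
        # Hessian statistics for GPTQ weight quantization
        hessian += 2.0 * x.T @ x
        n_samples += x.shape[0]
        # Activation-quantization statistics collected in the same pass
        act_min = torch.minimum(act_min, x.min())
        act_max = torch.maximum(act_max, x.max())

    hessian /= max(n_samples, 1)
    # assuming 8-bit asymmetric activation quantization
    act_scale = (act_max - act_min).clamp(min=1e-8) / 255
    act_zero_point = torch.round(-act_min / act_scale)
    return hessian, act_scale, act_zero_point
```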

@bfineran bfineran merged commit 476d1eb into main Jul 22, 2024
8 of 12 checks passed
@bfineran bfineran deleted the gptq-delay-quantization branch July 22, 2024 13:31
markmc pushed a commit to markmc/llm-compressor that referenced this pull request Nov 13, 2024