
GPTQ - move calibration of quantization params to after hessian calibration #25

Merged
bfineran merged 1 commit into main from gptq-delay-quantization on Jul 22, 2024

Conversation

bfineran
Contributor

Fixes a small bug in the GPTQ implementation where scales and zero points were calculated before the Hessian calibration passes. This is likely the cause of the small drift in results we noticed in the implementation.

Additionally, this change is important for the activation reordering feature: delaying the update until after the Hessian is computed lets us set the scales and zero points easily with respect to the permuted weight matrix.
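
For illustration, here is a minimal sketch of the reordered flow this PR describes: accumulate the Hessian from calibration passes first, optionally permute columns for activation reordering, and only then compute scales and zero points. The function and variable names (`gptq_quantize_layer`, the grouped min/max scale rule, etc.) are hypothetical and not the llm-compressor API.

```python
# Hypothetical sketch of the reordered GPTQ flow; names such as
# gptq_quantize_layer and the grouped min/max scale computation are
# illustrative assumptions, not the actual llm-compressor implementation.
import torch


def gptq_quantize_layer(weight: torch.Tensor, calibration_inputs,
                        num_bits: int = 4, group_size: int = 128,
                        act_order: bool = False):
    # 1) Run calibration forward passes first, accumulating the Hessian H ~ 2 * X^T X
    hessian = torch.zeros(weight.shape[1], weight.shape[1])
    n_samples = 0
    for x in calibration_inputs:            # x: (batch, in_features)
        x = x.reshape(-1, weight.shape[1])
        hessian += 2.0 * x.T @ x
        n_samples += x.shape[0]
    hessian /= max(n_samples, 1)

    # 2) Optionally permute columns by decreasing Hessian diagonal (activation reordering)
    if act_order:
        perm = torch.argsort(torch.diag(hessian), descending=True)
        weight = weight[:, perm]
        hessian = hessian[perm][:, perm]

    # 3) Only now compute quantization scales / zero points, with respect to the
    #    (possibly permuted) weight matrix -- the ordering this PR moves to.
    qmax = 2 ** num_bits - 1
    scales, zero_points = [], []
    for start in range(0, weight.shape[1], group_size):
        group = weight[:, start:start + group_size]
        w_min = group.min(dim=1, keepdim=True).values.clamp(max=0)
        w_max = group.max(dim=1, keepdim=True).values.clamp(min=0)
        scale = (w_max - w_min).clamp(min=1e-8) / qmax
        zero_points.append(torch.round(-w_min / scale))
        scales.append(scale)

    return hessian, scales, zero_points
```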

@bfineran bfineran requested review from Satrat and horheynm July 18, 2024 16:10
@bfineran bfineran self-assigned this Jul 18, 2024
Comment on lines +261 to +265

# quantization scales and zp are already initialized but we do not
# want to calibrate wrt to these
self.model.apply(disable_quantization)

Contributor


One thought here: in the case where we have activation quantization and GPTQ on the weights, we will end up running calibration twice, and I don't think we need to anymore. Since we aren't using the scale/zp during the Hessian calculation, maybe we should try to set them in the same calibration pass as the Hessian.

Not a requirement for now but just something to think about for the future.
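
For illustration, a minimal sketch of this fused-pass idea, assuming a simple min/max activation observer; the function name `single_pass_calibration` and the observer logic are hypothetical, not the llm-compressor API.

```python
# Hypothetical sketch of gathering activation-quantization statistics and the
# GPTQ Hessian in a single calibration pass instead of two separate ones.
import torch


def single_pass_calibration(layer: torch.nn.Linear, calibration_inputs):
    hessian = torch.zeros(layer.in_features, layer.in_features)
    act_min = torch.tensor(float("inf"))
    act_max = torch.tensor(float("-inf"))
    n_samples = 0

    for x in calibration_inputs:            # x: (batch, in_features)
        x = x.reshape(-1, layer.in_features)
        # Hessian statistics for GPTQ weight quantization
        hessian += 2.0 * x.T @ x
        n_samples += x.shape[0]
        # Activation-quantization statistics collected in the same pass
        act_min = torch.minimum(act_min, x.min())
        act_max = torch.maximum(act_max, x.max())

    hessian /= max(n_samples, 1)
    # assuming 8-bit asymmetric activation quantization
    act_scale = (act_max - act_min).clamp(min=1e-8) / 255
    act_zero_point = torch.round(-act_min / act_scale)
    return hessian, act_scale, act_zero_point
```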

@bfineran bfineran merged commit 476d1eb into main Jul 22, 2024
8 of 12 checks passed
@bfineran bfineran deleted the gptq-delay-quantization branch July 22, 2024 13:31
markmc pushed a commit to markmc/llm-compressor that referenced this pull request Nov 13, 2024