Phi-3 had its config.json changed by llm-compressor #81

Closed
Lin-K76 opened this issue Aug 12, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@Lin-K76
Collaborator

Lin-K76 commented Aug 12, 2024

Describe the bug

Using llm-compressor to quantize Phi-3-medium-128k-instruct modifies the model's config.json: under "rope_scaling", "type" is changed from "su" to "longrope". For some reason this is not a problem with Phi-3-mini-128k-instruct (even though its type is changed as well), but for Phi-3-medium it prevents evaluating the quantized model through lm-eval-harness.

To work around the issue, simply change "type" in config.json back to "su".
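For reference, a minimal sketch of that workaround (the output directory name is a placeholder, not from the original report):

import json

# Hypothetical path to the quantized model's config.json
config_path = "Phi-3-medium-128k-instruct-FP8-Dynamic/config.json"

with open(config_path) as f:
    config = json.load(f)

# llm-compressor rewrote this field to "longrope"; restore the original value
config["rope_scaling"]["type"] = "su"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)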

The recipe I use is below:

recipe = """
quant_stage:
    quant_modifiers:
        QuantizationModifier:
            ignore: ["lm_head"]
            config_groups:
                group_0:
                    weights:
                        num_bits: 8
                        type: float
                        strategy: channel
                        dynamic: false
                        symmetric: true
                    input_activations:
                        num_bits: 8
                        type: float
                        strategy: token
                        dynamic: true
                        symmetric: true
                    targets: ["Linear"]
"""

Expected behavior

Successful evaluation through lm eval harness.

Environment

  1. OS: Ubuntu 22.04.4
  2. Python version: 3.11.9
  3. LLM Compressor version or commit hash: v0.1.0
  4. ML framework version(s): torch 2.4.0, transformers 4.40.2
  5. Other Python package versions: lm_eval 0.4.3
  6. Other relevant environment information: CUDA 12.5

To Reproduce

Install lm-eval-harness, vllm, and llm-compressor. Use the recipe above to quantize Phi-3-medium-128k-instruct (https://huggingface.co/microsoft/Phi-3-medium-128k-instruct), following the big-model FP8 example (https://github.com/vllm-project/llm-compressor/blob/main/examples/big_model_offloading/big_model_fp8.py), then evaluate the quantized model with lm-eval-harness, as sketched below.
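For illustration, the evaluation step might look roughly like this through the lm-eval Python API (the task choice and model path are placeholders, not from the original report):

from lm_eval import simple_evaluate

# Evaluate the quantized checkpoint with the vLLM backend of lm-eval-harness.
# The pretrained path and task below are illustrative placeholders.
results = simple_evaluate(
    model="vllm",
    model_args="pretrained=Phi-3-medium-128k-instruct-FP8-Dynamic,trust_remote_code=True",
    tasks=["gsm8k"],
)
print(results["results"])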

Additional context

Tagging @robertgshaw2-neuralmagic

Lin-K76 added the bug label Aug 12, 2024
@robertgshaw2-neuralmagic
Collaborator

@horheynm can you take a look at this?

horheynm added a commit that referenced this issue Aug 20, 2024
* fix

* set default trust_remote_code to False

* compatible w recent changes

* update multi gpu code

* lint
@robertgshaw2-neuralmagic
Collaborator

@horheynm can this be closed?

markmc pushed a commit to markmc/llm-compressor that referenced this issue Nov 13, 2024
* fix serialization

* unit test fix