Describe the bug
Using llm-compressor to quantize Phi-3-medium-128k-instruct changes the config.json. Specifically, "rope_scaling" is changed from "type": "su" to "type": "longrope". This is not a problem with Phi-3-mini-128k-instruct for some reason (although the type is also changed), but for Phi-3-medium, this prevents evaluation of the model through lm eval harness.
To fix this issue, simply change the "type" in config back to "su".
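As a workaround until this is fixed upstream, the config can be patched back after quantization. The sketch below is illustrative, not part of llm-compressor's API: the helper name and the config path are assumptions, and it only rewrites the one field described above.

```python
import json
from pathlib import Path

def restore_rope_scaling_type(config_path, rope_type="su"):
    """Restore rope_scaling "type" in a saved model's config.json.

    llm-compressor rewrites "type": "su" to "type": "longrope" when saving
    Phi-3-medium-128k-instruct; restoring the original value lets lm eval
    harness load the model again. Helper name and path are hypothetical.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    if config.get("rope_scaling", {}).get("type") == "longrope":
        config["rope_scaling"]["type"] = rope_type
        path.write_text(json.dumps(config, indent=2))
    return config
```

Run it once against the quantized model directory (e.g. `restore_rope_scaling_type("Phi-3-medium-128k-instruct-FP8/config.json")`) before evaluation.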
The recipe I use is the big model FP8 example script linked in the reproduction steps below.
Expected behavior
Successful evaluation through lm eval harness.
Environment
To Reproduce
Exact steps to reproduce the behavior:
Install lm eval harness, vllm, and llm-compressor. Use the linked recipe (https://github.com/vllm-project/llm-compressor/blob/main/examples/big_model_offloading/big_model_fp8.py) to quantize Phi-3-medium-128k-instruct (https://huggingface.co/microsoft/Phi-3-medium-128k-instruct).
Errors
Additional context
Tagging @robertgshaw2-neuralmagic