GPTQModel v1.4.1

@Qubitium released this 13 Dec 16:52

What's Changed

⚡ Added Qwen2-VL model support.
⚡ MSE quantization control is now exposed in QuantizeConfig (see the sketch after this list).
⚡ New GPTQModel.patch_hf() monkey-patch API to allow Transformers/Optimum/Peft to use GPTQModel while upstream PRs are pending.
⚡ New GPTQModel.patch_vllm() monkey-patch API to allow vLLM to correctly load dynamic/mixed GPTQ quantized models (see the sketch after this list).
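
A minimal sketch of enabling the new MSE control during quantization. The exact field name (`mse`), its value semantics, the placeholder model id, and the `load`/`quantize`/`save` flow are assumptions based on the library's typical usage, not details stated in this release note:

```python
from gptqmodel import GPTQModel, QuantizeConfig

# Assumed: a non-zero `mse` value switches the weight clipping search to an
# MSE-based criterion; 0 keeps the default behavior.
quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    mse=2.4,  # assumption: exact name/range may differ by version
)

# Placeholder model id and calibration text.
model = GPTQModel.load("Qwen/Qwen2-VL-2B-Instruct", quant_config)
model.quantize(["Calibration sample one.", "Calibration sample two."])
model.save("Qwen2-VL-2B-Instruct-gptq-4bit")
```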
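A sketch of how the monkey-patch APIs might be used. Calling them with no arguments before loading, and the downstream Transformers/vLLM calls with a placeholder model id, are assumptions; only the method names come from this release:

```python
from gptqmodel import GPTQModel

# Assumed no-argument call: patch Transformers/Optimum/Peft so their GPTQ
# paths route through GPTQModel while upstream PRs are pending.
GPTQModel.patch_hf()

from transformers import AutoModelForCausalLM

hf_model = AutoModelForCausalLM.from_pretrained(
    "ModelCloud/example-gptq-model",  # placeholder model id
    device_map="auto",
)

# Assumed no-argument call: patch vLLM before building the engine so
# dynamic/mixed GPTQ checkpoints load correctly.
GPTQModel.patch_vllm()

from vllm import LLM

llm = LLM(model="ModelCloud/example-gptq-model")  # placeholder model id
```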

Full Changelog: v1.4.0...v1.4.1