GPTQModel v1.0.5
What's Changed
Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 quantizes the text layers (the layers responsible for text generation) only; vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model quantized in 4-bit mode will shrink to roughly 50% of its original size. Once vision-layer support lands, the size should drop to the expected ~1/4. A hedged usage sketch follows below.
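For reference, quantizing the text layers uses the standard GPTQModel workflow. The sketch below is a minimal example only: it assumes the usual `QuantizeConfig` / `from_pretrained` / `quantize` / `save_quantized` flow, and the model id, calibration texts, and output path are illustrative rather than taken from this release.

```python
# Minimal sketch, assuming the standard GPTQModel quantization API.
# The calibration texts and output directory below are placeholders.
from transformers import AutoTokenizer
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# 4-bit quantization config; in v1.0.5 only the text-generation layers are quantized.
quant_config = QuantizeConfig(bits=4, group_size=128)

# Tiny illustrative calibration set; use a real calibration dataset in practice.
tokenizer = AutoTokenizer.from_pretrained(model_id)
calibration_data = [
    tokenizer(text)
    for text in [
        "GPTQModel quantizes large language models to 4-bit with minimal accuracy loss.",
        "Llama 3.2 Vision pairs a vision encoder with a text-generation decoder.",
    ]
]

model = GPTQModel.from_pretrained(model_id, quant_config)
model.quantize(calibration_data)
model.save_quantized("Llama-3.2-11B-Vision-Instruct-gptq-4bit")
```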
- [MODEL] Add Llama 3.2 Vision (mllama) support by @LRL-ModelCloud in #401
Full Changelog: v1.0.4...v1.0.5