GPTQModel v1.0.5
What's Changed
Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 quantizes the text layers (the layers responsible for text generation) only; vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model quantized in 4-bit mode will shrink to roughly 50% of its original size. Once vision-layer support lands, the size should drop to the expected ~1/4. A hedged usage sketch follows below.
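For reference, quantizing the text layers uses the standard GPTQModel workflow. The sketch below is a minimal example only: it assumes the usual `QuantizeConfig` / `from_pretrained` / `quantize` / `save_quantized` flow, and the model id, calibration texts, and output path are illustrative rather than taken from this release.

```python
# Minimal sketch, assuming the standard GPTQModel quantization API.
# The calibration texts and output directory below are placeholders.
from transformers import AutoTokenizer
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# 4-bit quantization config; in v1.0.5 only the text-generation layers are quantized.
quant_config = QuantizeConfig(bits=4, group_size=128)

# Tiny illustrative calibration set; use a real calibration dataset in practice.
tokenizer = AutoTokenizer.from_pretrained(model_id)
calibration_data = [
    tokenizer(text)
    for text in [
        "GPTQModel quantizes large language models to 4-bit with minimal accuracy loss.",
        "Llama 3.2 Vision pairs a vision encoder with a text-generation decoder.",
    ]
]

model = GPTQModel.from_pretrained(model_id, quant_config)
model.quantize(calibration_data)
model.save_quantized("Llama-3.2-11B-Vision-Instruct-gptq-4bit")
```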
- [MODEL] Add Llama 3.2 Vision (mllama) support by @LRL-ModelCloud in #401
Full Changelog: v1.0.4...v1.0.5