
GPTQModel v1.0.5

Released by @Qubitium on 26 Sep 10:54 · 91 commits to main since this release · commit 4921d68

What's Changed

Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 allows quantization of the text layers (the layers responsible for text generation) only; vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model quantizes to about 50% of its original size in 4-bit mode. Once vision-layer support is added, the size will shrink to the expected ~1/4.
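A minimal sketch of what text-layer-only quantization might look like with this release. The API names used here (QuantizeConfig, GPTQModel.from_pretrained, quantize, save_quantized), the bits/group_size values, and the calibration strings are assumptions based on GPTQModel's AutoGPTQ-style interface, not taken from this release note; consult the project README for the exact signatures.

```python
# Hedged sketch: quantizing only the text layers of Llama 3.2 11B Vision Instruct
# with GPTQModel v1.0.5. Vision layers are left untouched in this release.
from gptqmodel import GPTQModel, QuantizeConfig  # assumed import path

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# 4-bit quantization with the common GPTQ group size of 128 (assumed defaults)
quant_config = QuantizeConfig(bits=4, group_size=128)

# Small text-only calibration set; replace with your own representative data
calibration_data = [
    "GPTQModel quantizes the text-generation layers of Llama 3.2 Vision.",
    "Vision layers remain in their original precision in v1.0.5.",
]

model = GPTQModel.from_pretrained(model_id, quant_config)
model.quantize(calibration_data)  # in v1.0.5 this only quantizes the text layers
model.save_quantized("Llama-3.2-11B-Vision-Instruct-gptq-4bit")
```

Because only the text layers are quantized, the resulting checkpoint lands at roughly half the original size; the remaining reduction to ~1/4 depends on the upcoming vision-layer support.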

Full Changelog: v1.0.4...v1.0.5