I don't think there is any real blocker to supporting them apart from a bit of time and coding.
That being said, from what I understood, at 6-bit and 8-bit, simpler quantization algorithms (such as integer quantization) perform pretty well, and the need for a codebook is relatively lower. That was the reason they weren't supported initially. Happy to hear your thoughts, though!
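For intuition, here's a minimal NumPy sketch (not FLUTE code, just an illustration) of symmetric integer quantization, showing how reconstruction error shrinks as the bitwidth grows — by 6–8 bits a plain uniform grid is already quite accurate, which is the point above:

```python
import numpy as np

def int_quantize(w, bits):
    """Symmetric per-tensor integer quantization (illustrative sketch)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    # Round to the nearest integer level, then map back to floats.
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

w = np.random.default_rng(0).normal(size=4096).astype(np.float32)
for bits in (3, 6, 8):
    err = np.abs(int_quantize(w, bits) - w).mean()
    print(f"{bits}-bit mean abs error: {err:.5f}")
```

The error drops roughly by half per extra bit, so the accuracy gap between a uniform grid and a learned codebook narrows quickly at higher bitwidths.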
In short, we're conducting research on dynamic bitwidth quantization (per-layer bitwidth and such) and we're trying to come up with some sort of unified theory for a certain class of codebook quantization.
We need higher bitwidth support because some layers really need ~6bpw even when the average is around 3bpw.
And having a unified entry point for efficient inference of said grids could be extremely handy.
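As a toy illustration of that bit-budget arithmetic (the layer names and sizes below are made up, not from our actual setup), a model can keep a ~3bpw average even when one layer is held at 6 bits:

```python
# Hypothetical per-layer bitwidth assignment: (param_count, bits) per layer.
layers = {
    "attn":     (4 * 4096 * 4096, 2),   # most layers tolerate low bitwidths
    "mlp.up":   (4096 * 11008, 3),
    "mlp.gate": (4096 * 11008, 3),
    "mlp.down": (11008 * 4096, 6),      # sensitive layer kept at ~6bpw
}
total_bits = sum(n * b for n, b in layers.values())
total_params = sum(n for n, _ in layers.values())
print(f"average bpw: {total_bits / total_params:.2f}")  # ~3.3 despite the 6-bit layer
```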
On a related note, Dan and I have been discussing codebook/vector quantization. FLUTE doesn't support this out of the box, but we have a vectorized LUT operation loosely similar to what you might be looking into. Let me know if you need clarification on this as well!
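For concreteness, dequantization with a vectorized LUT boils down to a gather over a table of d-dimensional codewords. A hedged NumPy sketch (shapes and names are illustrative, not FLUTE's API):

```python
import numpy as np

def vq_dequant(indices, codebook):
    """Dequantize vector-quantized weights via table lookup.

    indices:  (num_groups,) int array of codebook entry ids
    codebook: (2**bits, d) float array; each row is a d-dim codeword
    returns:  flattened weights of shape (num_groups * d,)
    """
    return codebook[indices].reshape(-1)

bits, d = 8, 2  # an 8-bit codebook of 2-dim codewords (256 x 2 table)
codebook = np.random.default_rng(1).normal(size=(2**bits, d)).astype(np.float32)
idx = np.random.default_rng(2).integers(0, 2**bits, size=16)
w = vq_dequant(idx, codebook)
print(w.shape)  # (32,)
```

Note the table size grows as `2**bits * d`, which is why 6-bit and 8-bit codebooks stress shared memory / LUT storage more than the lower-bit cases.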
Hi!
Would it be possible to support 6-bit and 8-bit codebooks, or are there hard limitations on codebook size that would prevent it?