[FEATURE REQUEST] Support for 6-bit and 8-bit codebooks #9

Open
BlackSamorez opened this issue Sep 4, 2024 · 3 comments

@BlackSamorez
Contributor

Hi!

Would it be possible to support 6-bit and 8-bit codebooks, or are there any hard limitations on codebook size that wouldn't allow it?

@HanGuo97
Owner

HanGuo97 commented Sep 6, 2024

Hi, thanks for the suggestion!

I don't think there is any real blocker to support them apart from a bit of time + coding.

That being said, from what I understand, at 6-bit and 8-bit, simpler quantization algorithms (such as integer quantization) perform pretty well, and the need for a codebook is relatively lower. This was the reason they weren't supported initially. Happy to hear your thoughts though!
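
For readers following along, here is a rough sketch of how integer quantization relates to codebook (lookup-table) quantization: a uniform integer grid can be seen as the special case of a codebook whose 2^bits entries are evenly spaced. The function names and the symmetric grid are assumptions made for illustration only, not FLUTE's API or kernels.

```python
def uniform_codebook(bits, scale=1.0):
    """Build a symmetric integer grid as a lookup table with 2**bits entries.

    Hypothetical helper: a learned codebook would hold arbitrary values
    instead of evenly spaced ones.
    """
    levels = 2 ** bits
    half = levels // 2
    return [scale * (i - half) for i in range(levels)]


def quantize(weights, codebook):
    """Map each weight to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))
        for w in weights
    ]


def dequantize(indices, codebook):
    """Recover approximate weights with a plain table lookup."""
    return [codebook[i] for i in indices]


# A 6-bit table has 64 entries and an 8-bit table has 256.
table = uniform_codebook(4)
approx = dequantize(quantize([0.0, 1.2, -3.7], table), table)
```

The table size grows exponentially with the bitwidth, which is one plausible reason larger codebooks take extra engineering effort even when nothing blocks them in principle.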

@BlackSamorez
Contributor Author

In short, we're conducting research on dynamic bitwidth quantization (per-layer bitwidth and such) and we're trying to come up with some sort of unified theory for a certain class of codebook quantization.
We need higher bitwidth support because some layers really need ~6bpw even when the average is around 3bpw.
And having a unified entry point for efficient inference of said grids could be extremely handy.
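
As a small illustration of the point about averages, the sketch below computes the average bits per weight (bpw) for a model with per-layer bitwidths. The layer sizes are made up, and the calculation ignores codebook and scale storage overhead, which is an assumption of this sketch.

```python
def average_bpw(layers):
    """Average bits per weight over (num_weights, bitwidth) pairs.

    Ignores the storage cost of codebooks and scales (a simplification).
    """
    total_bits = sum(n * b for n, b in layers)
    total_weights = sum(n for n, _ in layers)
    return total_bits / total_weights


# Hypothetical model: a small sensitive layer at 6 bits, the rest at 3 bits.
layers = [(1_000, 6), (9_000, 3)]
avg = average_bpw(layers)  # stays close to 3 despite the 6-bit layer
```

With these made-up sizes, a 6-bit layer holding 10% of the weights only raises the average to about 3.3 bpw, so an inference path that handles 6-bit layers does not conflict with a low overall budget.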

@HanGuo97
Owner

HanGuo97 commented Sep 9, 2024

Ah got it, thanks for the context!

On a related note, Dan and I have been discussing codebook/vector quantization. FLUTE doesn't support this out of the box, but we have a vectorized LUT operation loosely similar to what you might be looking into. Let me know if you'd like clarification on that too!
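
For context, a generic sketch of vector quantization decoding follows: each stored index selects a whole vector of weights from the codebook rather than a single scalar. This is a textbook-style illustration under assumed names, and it does not describe FLUTE's vectorized LUT operation.

```python
import math


def vq_dequantize(indices, codebook):
    """Expand each index into its codebook vector and concatenate them."""
    out = []
    for i in indices:
        out.extend(codebook[i])
    return out


def bits_per_weight(num_entries, group_size):
    """Index cost per weight when one index encodes `group_size` weights."""
    return math.log2(num_entries) / group_size


# Hypothetical 2-entry codebook of 2-dimensional vectors.
codebook = [[0.0, 0.5], [1.0, -1.0]]
decoded = vq_dequantize([1, 0], codebook)
```

Under this scheme a 256-entry codebook of 2-dimensional vectors costs 8 bits per index but only 4 bits per weight, which is why codebook size and effective bitwidth need to be tracked separately.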
