
[New method] VPTQ Vector Post-Training Quantization Support #1204

Open
YangWang92 opened this issue Oct 31, 2024 · 2 comments

Comments

@YangWang92

Hi all,

We've recently open-sourced a new quantization method. VPTQ (Vector Post-Training Quantization) is a novel post-training quantization method that leverages vector quantization to achieve high accuracy on Large Language Models (LLMs) at extremely low bit-widths (<2-bit). VPTQ can compress models with 70 billion parameters, and even up to 405 billion parameters, to 1-2 bits without retraining while maintaining high accuracy.
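To make the core idea concrete, here is a minimal sketch of vector quantization applied to a weight matrix: split the weights into short sub-vectors, learn a small codebook with plain k-means, and store only per-vector codebook indices. This is only an illustration of the storage/accuracy trade-off, not the VPTQ algorithm itself (which adds Hessian-weighted optimization, residual codebooks, and other refinements); all names below are made up for the example.

```python
import numpy as np

def vq_quantize(W, vec_len=4, codebook_bits=8, iters=10, seed=0):
    """Quantize W by running k-means over its length-`vec_len` sub-vectors."""
    rng = np.random.default_rng(seed)
    vecs = W.reshape(-1, vec_len)                     # (num_vectors, vec_len)
    k = 2 ** codebook_bits                            # codebook size, e.g. 256
    codebook = vecs[rng.choice(len(vecs), k, replace=False)].copy()
    for _ in range(iters):                            # plain Lloyd iterations
        dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        idx = dists.argmin(axis=1)                    # nearest centroid per vector
        for c in range(k):
            members = vecs[idx == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook, idx

def vq_dequantize(codebook, idx, shape):
    """Reconstruct the (lossy) weight matrix from indices + codebook."""
    return codebook[idx].reshape(shape)

W = np.random.default_rng(1).standard_normal((64, 64)).astype(np.float32)
codebook, idx = vq_quantize(W)
W_hat = vq_dequantize(codebook, idx, W.shape)
```

With an 8-bit codebook over length-4 vectors, each weight effectively costs 8/4 = 2 bits (plus the amortized codebook storage), which is how vector quantization reaches bit-widths that scalar quantization cannot.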

For more details, you can check the code and documentation here: https://github.com/microsoft/VPTQ

And the model can be accessed here: https://huggingface.co/VPTQ-community

I am currently attempting to integrate VPTQ into AO. Does anyone have suggestions or best practices for this kind of integration? What should I be particularly aware of?

Thanks!
Yang

@msaroufim
Member

On phone so apologies for brevity. This seems awesome, and we've been trying to push the boundaries with less-than-4-bit quantization.

You can check out #391 and refer to some of the more recent contributions, like auto round, as an example of what to do.

@jerryzh168 is also hoping to simplify some of this work.

@jerryzh168
Contributor

#1184 and #1195 might be helpful as well.
