Technical details about quantization #1694

Closed
Tunaaaaa opened this issue Jun 5, 2023 · 1 comment

Tunaaaaa commented Jun 5, 2023

Hi ggerganov,

Good morning!

I'm confused about the details of "q4_0", "q4_1", and "GPTQ".
I've read the code in ./examples/quantize/quantize.cpp and found that neither q4_0 nor q4_1 uses the Hessian-based technique. They seem to just use the min/max of the values to do the quantization.
However, at the end of the YouTube video, a method using H^-1 for optimization is described.
So I'm wondering: what is the difference between "q4_0", "q4_1", "GPTQ", and the method in that YouTube video?

Thanks a lot.
Cheers,

TTTuna
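
To make the "just use min/max" observation above concrete, here is a minimal sketch of round-to-nearest 4-bit block quantization in two flavours: one scale per block (in the spirit of q4_0) and one scale plus one minimum per block (in the spirit of q4_1). The function names, the choice of code range, and the plain-list block layout are assumptions made for illustration. This is not the code in llama.cpp, which packs bits and picks scales differently. No Hessian or calibration data appears anywhere in it.

```python
# Simplified round-to-nearest 4-bit block quantization.
# Illustrative only: all names are hypothetical, not llama.cpp's API.

def quantize_symmetric(block):
    """q4_0-style sketch: a single scale per block, no offset."""
    amax = max(abs(x) for x in block)
    scale = amax / 7.0 if amax > 0 else 1.0
    # Signed integer codes in [-7, 7], which fit in 4 bits.
    codes = [max(-7, min(7, round(x / scale))) for x in block]
    return scale, codes

def dequantize_symmetric(scale, codes):
    return [scale * q for q in codes]

def quantize_minmax(block):
    """q4_1-style sketch: a scale and a minimum per block."""
    lo, hi = min(block), max(block)
    scale = (hi - lo) / 15.0 if hi > lo else 1.0
    # Unsigned integer codes in [0, 15], measured from the block minimum.
    codes = [max(0, min(15, round((x - lo) / scale))) for x in block]
    return scale, lo, codes

def dequantize_minmax(scale, lo, codes):
    return [lo + scale * q for q in codes]

def max_abs_error(a, b):
    """Largest element-wise reconstruction error between two lists."""
    return max(abs(x - y) for x, y in zip(a, b))
```

Calling `quantize_minmax` on a block of weights and then `dequantize_minmax` on the result gives values within half a quantization step of the originals. Each weight is rounded independently using only statistics of its own block. A Hessian-based method such as GPTQ differs in that it uses second-order information from calibration inputs to adjust the remaining weights after each one is rounded.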

github-actions bot added the stale label Mar 25, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
