Hi ggerganov,
Good morning!
I'm confused about the details of "q4_0", "q4_1", "GPTQ".
I've read the code of ./examples/quantize/quantize.cpp and found that neither q4_0 nor q4_1 uses any Hessian-based technique. They seem to simply use the min/max of the weights to do the quantization.
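To make clear what I mean by "just use min/max", here is a rough Python sketch of my understanding. It is not the actual ggml code: the block size of 32, the function names, and the exact scale choices (max |x| / 7 for the q4_0-like variant, (max - min) / 15 for the q4_1-like variant) are my assumptions, and the real code also packs two 4-bit values per byte, which I skip here.

```python
# Simplified sketch of min/max block quantization (assumed, not the real ggml code).
BLOCK_SIZE = 32  # assumed number of weights per block


def quantize_q4_0_like(block):
    """Symmetric variant: one scale per block, integer levels in [-7, 7]."""
    amax = max(abs(x) for x in block)
    scale = amax / 7 if amax > 0 else 0.0
    inv = 1.0 / scale if scale else 0.0
    levels = [max(-7, min(7, round(x * inv))) for x in block]
    return scale, levels


def dequantize_q4_0_like(scale, levels):
    return [scale * q for q in levels]


def quantize_q4_1_like(block):
    """Asymmetric variant: scale plus minimum per block, levels in [0, 15]."""
    lo, hi = min(block), max(block)
    scale = (hi - lo) / 15 if hi > lo else 0.0
    inv = 1.0 / scale if scale else 0.0
    levels = [max(0, min(15, round((x - lo) * inv))) for x in block]
    return scale, lo, levels


def dequantize_q4_1_like(scale, lo, levels):
    return [lo + scale * q for q in levels]


def max_abs_error(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # One block of made-up weights, roughly in [-0.10, 0.21].
    block = [0.01 * (i - 10) for i in range(BLOCK_SIZE)]

    s0, l0 = quantize_q4_0_like(block)
    s1, m1, l1 = quantize_q4_1_like(block)

    print("q4_0-like max error:", max_abs_error(block, dequantize_q4_0_like(s0, l0)))
    print("q4_1-like max error:", max_abs_error(block, dequantize_q4_1_like(s1, m1, l1)))
```

In both cases each weight is rounded independently using only statistics of its own block. Nothing here looks at the layer inputs or at a Hessian.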
However, at the end of a YouTube video, the presenter describes a method that uses H^-1 (the inverse Hessian) for optimization.
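For reference, the kind of H^-1 update I have in mind is the one from the GPTQ / Optimal Brain Quantization papers. I am assuming the video describes the same idea, so please correct me if not. With X the layer inputs from some calibration data, w_q the weight currently being quantized, and δ the correction applied to the remaining not-yet-quantized weights of the same row:

```latex
H = 2 X X^{\top},
\qquad
\delta = -\,\frac{w_q - \operatorname{quant}(w_q)}{[H^{-1}]_{qq}} \,\bigl(H^{-1}\bigr)_{:,q}
```

So unlike the min/max approach, the rounding error of one weight is compensated by adjusting the other weights, and the adjustment depends on the input data through H.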
So I'm wondering: what is the difference between "q4_0", "q4_1", "GPTQ", and the method in that YouTube video?
Thanks a lot.
Cheers,
TTTuna