I had the idea of smoothing outliers before quantization, apparently it's a VERY BAD idea #1707
KerfuffleV2 started this conversation in Ideas · 1 comment
-
Another interesting thing is that the outliers are mainly clustered in the same general area of the tensor.
-
Outliers are something quantization struggles with, so why not just clamp values more than, I don't know, 3 standard deviations above/below the mean to that cutoff? Let's just say that's a fun way to see perplexity values over 12,000.
Turns out you have to increase that to 20+ standard deviations before it doesn't just absolutely destroy the model.
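To be explicit about what I mean by clamping, with $\mu$ and $\sigma$ the tensor's mean and standard deviation and $k$ the cutoff in standard deviations:

$$
\tilde{x}_i = \min\bigl(\max(x_i,\ \mu - k\sigma),\ \mu + k\sigma\bigr)
$$

so anything more than $k$ standard deviations from the mean just gets replaced by the cutoff value itself.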
The base there is requantizing a 7B q8_0 LLaMA to q6_K with a hacked version of #1691 and no clamping of values; same thing for the others, just giving it a bit of the old clamps.

Here is some output from running quantization while trying to clamp to 18 standard deviations. Some tensors don't get changed at all. At most a couple hundred outlier values get changed out of 10+ million, but it has a huge effect.
Perhaps there's something wrong with my calculations? Relative to the pull I listed, right after joining the workers in `llama_convert_tensor_internal`, I added the clamping pass (roughly along the lines of the sketch below). Is this idea just a complete dead end?
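For reference, a minimal sketch of the kind of clamping pass I mean. This is not the actual patch: the function name, the `data`/`n`/`k` parameters, and the toy `main` are illustrative, and in the real code it would operate on the f32 buffer right after the conversion workers are joined.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>

// Sketch of an outlier-clamping pass over a tensor converted to f32.
// Clamps anything more than k standard deviations from the mean back to the
// cutoff and returns how many values were touched.
static size_t clamp_outliers(float * data, size_t n, float k) {
    if (n == 0) {
        return 0;
    }

    // Pass 1: mean over the whole tensor.
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    const double mean = sum / (double) n;

    // Pass 2: (population) standard deviation, outliers included.
    double sqsum = 0.0;
    for (size_t i = 0; i < n; i++) {
        const double d = data[i] - mean;
        sqsum += d * d;
    }
    const double sd = std::sqrt(sqsum / (double) n);

    const float lo = (float) (mean - k * sd);
    const float hi = (float) (mean + k * sd);

    // Pass 3: clamp everything outside [mean - k*sd, mean + k*sd] to the cutoff.
    size_t clamped = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] < lo) {
            data[i] = lo;
            clamped++;
        } else if (data[i] > hi) {
            data[i] = hi;
            clamped++;
        }
    }
    return clamped;
}

int main() {
    // Toy demo: one huge outlier in an otherwise small-valued "tensor".
    float demo[8] = { 0.1f, -0.2f, 0.05f, 0.3f, -0.1f, 0.2f, 12.0f, -0.15f };
    const size_t n = clamp_outliers(demo, 8, 2.0f);
    std::printf("clamped %zu value(s), outlier is now %f\n", n, (double) demo[6]);
    return 0;
}
```

The statistics here are taken over the whole tensor, outliers included, which matches the "N standard deviations from the mean" description above; per-row or per-block statistics would be a different experiment.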