Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ScalarQuantizer to use full bucket range #4074

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ddrcoder
Copy link
Contributor

@ddrcoder ddrcoder commented Dec 7, 2024

Summary:
Scalar quantizers have an off-by-one bug. Even when the quantizer is trained to cover the full trained data range, it only ends up encoding the maximum quantized value for exact matches on the maximum value instead of having the upper bound correspond to the upper bound of the last bucket. This means a 4-bit quantizer often only uses 15 / 16 possible values, effectively 3.9 bits.

Existing tests show significant movements for 4-bit quantization results; +9% recall in one of the unit tests.

The fix is to use 2^n - eps everywhere 2^n - 1 is used.

This would break existing stored indices, though, so a backwards compatibility fix is included; ranges are adjusted at deserialization time to simulate old behavior.

Differential Revision: D66909688

Summary: `TestScalarQuantizer.test_4variants_ivf` shouldn't mix IVF probing misses in with its evaluation. As is, it probes 4/64 centroids, so FP16 has only 73% recall. This fixes it to *still exercise residual encoding* and the resulting distributions, but exhaustively scan the index.

Differential Revision: D66909687
Summary:
Scalar quantizers have an off-by-one bug. Even when the quantizer is trained to cover the full trained data range, it only ends up encoding the maximum quantized value for exact matches on the maximum value instead of having the upper bound correspond to the upper bound of the last bucket. This means a 4-bit quantizer often only uses 15 / 16 possible values, effectively 3.9 bits. 

Existing tests show significant movements for 4-bit quantization results; +9% recall in one of the unit tests.

The fix is to use `2^n - eps` everywhere `2^n - 1` is used.

This would break existing stored indices, though, so a backwards compatibility fix is included; ranges are adjusted at deserialization time to simulate old behavior.

Differential Revision: D66909688
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D66909688

@alexanderguzhva
Copy link
Contributor

@ddrcoder this causes multiple backward compatibility problems. Add a new quantization type?

@mdouze
Copy link
Contributor

mdouze commented Dec 16, 2024

@ddrcoder this causes multiple backward compatibility problems. Add a new quantization type?

I agree that the current bucket allocation is suboptimal.

The bucket training uses parameters RangeStat here
https://github.com/facebookresearch/faiss/blob/main/faiss/impl/ScalarQuantizer.h#L54

Could you define a new RangeStat entry rather than a new quantization QuantizationType?

@ddrcoder
Copy link
Contributor Author

@ddrcoder this causes multiple backward compatibility problems. Add a new quantization type?

That's addressed; the ranges are adapted to produce the same values as before. I do still need to prove that with a unit test, though.

@mdouze
Copy link
Contributor

mdouze commented Jan 8, 2025

Please address my comment above.
Only the training code should be need to be adapted, there is no reason to change the file format or the distance computation code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants