-
Notifications
You must be signed in to change notification settings - Fork 236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: improve PQ computing distances #3150
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3150 +/- ##
==========================================
- Coverage 77.95% 77.94% -0.01%
==========================================
Files 242 242
Lines 81904 81910 +6
Branches 81904 81910 +6
==========================================
- Hits 63848 63846 -2
- Misses 14890 14892 +2
- Partials 3166 3172 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
FixedSizeListArray::try_new_from_values(codebook, DIM as i32).unwrap(), | ||
DistanceType::Cosine, | ||
); | ||
c.bench_function( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just run a loop over, i.e. for dt in [DistanceType::L2, Cosine, Dot]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
const PQ: usize = 96; | ||
const DIM: usize = 1536; | ||
const TOTAL: usize = 5 * 1024 * 1024; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets all use more realistic numbers?, i.e., 8K - 16K, so that we can measure other piece i.e., table construction, and code transpose more realistically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense, done
); | ||
c.bench_function( | ||
format!("{},L2,4bitPQ={},DIM={}", TOTAL, PQ, DIM).as_str(), | ||
|b| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very curios what is this number compares to the 8 bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's slightly slower than 8bit
@@ -80,9 +80,11 @@ pub(super) fn compute_l2_distance( | |||
// so code[i * num_vectors + j] is the code of i-th sub-vector of the j-th vector. | |||
let num_vectors = code.len() / num_sub_vectors; | |||
let mut distances = vec![0.0_f32; num_vectors]; | |||
let num_centroids = 2_usize.pow(num_bits); | |||
// it must be 8 | |||
const NUM_CENTROIDS: usize = 2_usize.pow(8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: seems like this would be cleaner if we refactor compute_l2_distance
to a trait ProductQuantizedL2Distance<const BITS: usize>
. We can probably avoid branching in pq dist calculator this way too. Just a ticket for now is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i was thinking of this way as well, but 4bit PQ impl is diff from 8bit, so still 2 methods
another qq: would SQ benefit from the same optimization? Let's try it? |
no, distance computing for SQ is not with the same problem |
this is done by make the compiler know the size of distance table slice