-
Notifications
You must be signed in to change notification settings - Fork 535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to use cuVS for vector search #6085
Conversation
This reverts commit cbb79ec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took an early peek at the packaging side of this change, since this will probably be coming down close to freeze. Ping if you need help with any build issues / etc.
We don't currently have cosine distance for ivf-pq (see rapidsai/cuvs#346) and we also don't have correlation distance support at all. re-add the metricprocessor code to handle this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes and everything looks good to me, pending returning raft to branch-24.12
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really great! I'm going to pre-approve just to get this in for the release, but don't forget to revert the explicit cmake pinning.
@@ -14,6 +14,7 @@ dependencies: | |||
- cudatoolkit | |||
- cudf==24.10.*,>=0.0.0a0 | |||
- cupy>=12.0.0 | |||
- cuvs==24.10.*,>=0.0.0a0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think cuml needs a dependency on cuvs python today. We could keep this for the release, though, just to avoid additional changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are correct here (we only need cuvs
for the requirements/pyproject outputs for wheels). Let's file a follow-up PR once this one merges so we don't lose track. It can target 24.12.
Thankfully, cuvs
is only ~350 kB: https://anaconda.org/rapidsai-nightly/cuvs/files
@@ -18,7 +18,7 @@ | |||
|
|||
#include <cuml/common/log_levels.hpp> | |||
|
|||
#include <raft/cluster/kmeans_types.hpp> | |||
#include <cuvs/cluster/kmeans.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
@@ -24,8 +24,8 @@ | |||
#include <cuml/cluster/hdbscan.hpp> | |||
#include <cuml/common/logger.hpp> | |||
|
|||
#include <raft/cluster/detail/agglomerative.cuh> | |||
#include <raft/cluster/detail/mst.cuh> | |||
#include <raft/cluster/detail/agglomerative.cuh> // build_dendrogram_host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh boy- detail APIs. Well need to make sure we add this to the tracker list. The single-linkage implementation from RAFT has been migrated to cuVS so we should be able to use that. No need to do that for 24.10, though.
@@ -1,818 +0,0 @@ | |||
/* | |||
* Copyright (c) 2020-2024, NVIDIA CORPORATION. | |||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful to see this go. Eventually, we should start building distributed Python implementations right in cuVS for things like cuvs-dask (and cuvs-ray, etc...).
// fully serial execution. | ||
raft::resource::sync_stream_pool(handle); | ||
|
||
if (input.size() > 1 || translations != nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the most ideal case, we should be able to formalize an "id_mapper" API at some point and have it used consistently for all index types in cuVS itself. That's a pretty trivial gather operation that could be accelerated.
readd the pin to check to see if changes in https://github.com/rapidsai/cuvs/compare/branch-24.10...benfred:static_lib2?expand=1 resolve wheel build issues.
@@ -14,6 +14,7 @@ dependencies: | |||
- cudatoolkit | |||
- cudf==24.10.*,>=0.0.0a0 | |||
- cupy>=12.0.0 | |||
- cuvs==24.10.*,>=0.0.0a0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are correct here (we only need cuvs
for the requirements/pyproject outputs for wheels). Let's file a follow-up PR once this one merges so we don't lose track. It can target 24.12.
Thankfully, cuvs
is only ~350 kB: https://anaconda.org/rapidsai-nightly/cuvs/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I previously reviewed all the build/packaging changes, and am approving the new diff since my last look. I did not review the C++ code. Thanks for the hard work on this!
We need to do some planning on all the follow-up work that is needed to simplify cuVS and cuML. The build/packaging side of things is getting messy in both repositories with all the short term fixes.
/merge |
This PR updates to use cuVS instead of RAFT for vector search, pairwise distances and clustering. This is required for us to deprecate the vector search functionality in RAFT, in favour of the code in cuVS.
Because some code hasn't been migrated over to cuvs yet, we will continue to use the version in RAFT - but with RAFT in header only mode. In particular this functionality will be used in RAFT header only mode:
Because sparse KNN in RAFT uses the DistanceType in RAFT, we can't fully move over to use the DistanceType code in cuVS with this PR. (Also the DistanceType code in RAFT has a
Precomputed
option that isn't available in cuvs - but is needed by cuml for dbscan.) This means that we have both the raft and cuvs DistanceType enum's in use with this change, with conversions between them.