DBSCAN utilize rbc eps_neighbors #5728

Merged
merged 29 commits into rapidsai:branch-24.04 from rbc_eps_neighbors
Mar 11, 2024

Conversation

mfoerste4
Contributor

@mfoerste4 mfoerste4 commented Jan 23, 2024

This PR enables rbc (random ball cover) eps-neighbor computation via RAFT. The resulting adjacency matrix is sparse, which makes it possible to skip the implicit conversion.
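
For readers unfamiliar with the data layout, here is a minimal NumPy sketch of an eps-neighborhood adjacency stored in CSR form (`indptr`, `indices`). It is a brute-force illustration only, not the cuML or RAFT implementation: the function name `eps_neighbors_csr` is hypothetical, and the dense boolean matrix it builds first is exactly the intermediate step that a sparse result avoids.

```python
import numpy as np


def eps_neighbors_csr(X, eps):
    """Brute-force eps-neighborhood, returned as CSR (indptr, indices).

    Hypothetical helper for illustration; not part of cuML or RAFT.
    """
    # Pairwise squared Euclidean distances (dense, O(n^2) memory).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    # Dense boolean adjacency; every point is its own neighbor.
    adj = d2 <= eps * eps

    # Row lengths (vertex degrees) give the CSR row pointer.
    counts = adj.sum(axis=1)
    indptr = np.concatenate(([0], np.cumsum(counts)))

    # np.nonzero walks the matrix in row-major order, so the column
    # indices come out already grouped by row.
    indices = np.nonzero(adj)[1]
    return indptr, indices


X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
indptr, indices = eps_neighbors_csr(X, eps=1.5)
print(indptr, indices)
```

The neighbors of row `i` are `indices[indptr[i]:indptr[i + 1]]`.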

Notes:

  • the `algorithm` parameter was added to the DBSCAN init signature so the user can choose the neighbor search (the default is 'brute'; 'rbc' is optional)
  • memory management is still very conservative: it assumes a dense adjacency matrix and therefore selects comparatively small batches
  • if the maximum row length of a batch is sufficiently small, the CSR structure can be computed in a single pass

CC @tfeher

@github-actions github-actions bot added the Cython / Python and CUDA/C++ labels Jan 23, 2024
@mfoerste4
Contributor Author

rerun tests

@mfoerste4 mfoerste4 changed the title from "[Draft] DBSCAN utilize rbc eps_neighbors" to "DBSCAN utilize rbc eps_neighbors" Jan 30, 2024
@mfoerste4 mfoerste4 marked this pull request as ready for review January 30, 2024 19:28
@mfoerste4 mfoerste4 requested review from a team as code owners January 30, 2024 19:28
@mfoerste4 mfoerste4 changed the base branch from branch-24.02 to branch-24.04 February 6, 2024 22:37
Contributor

@tfeher tfeher left a comment


Thanks Malte for the PR! Looks great, I have just a few small suggestions for changes.

python/cuml/cluster/dbscan.pyx (outdated, resolved)
python/cuml/cluster/dbscan_mg.pyx (outdated, resolved)
python/cuml/cluster/dbscan_mg.pyx (resolved)
Contributor

@tfeher tfeher left a comment


Thanks Malte for the update, LGTM.

@mfoerste4
Contributor Author

@tfeher, thanks for reviewing. I just added a small correction and re-triggered the CI.

@tfeher tfeher added the improvement and non-breaking labels Feb 28, 2024
@mfoerste4
Contributor Author

/test

@mfoerste4
Contributor Author

rerun tests

@dantegd
Member

dantegd commented Mar 11, 2024

/merge

@rapids-bot rapids-bot bot merged commit a6c0478 into rapidsai:branch-24.04 Mar 11, 2024
59 checks passed
@mfoerste4 mfoerste4 deleted the rbc_eps_neighbors branch March 12, 2024 16:43
Labels
CUDA/C++; Cython / Python (Cython or Python issue); improvement (Improvement / enhancement to an existing function); non-breaking (Non-breaking change)

3 participants