Avoid extra memory copy when using cp.concatenate in cuml.dask kmeans #5937

Merged: 4 commits into rapidsai:branch-24.08 on Jul 8, 2024

Conversation

@dantegd (Member) commented on Jun 15, 2024

Partial solution for #5936

The issue was that concatenating when a worker held only a single array was causing a memory copy (not always, but often enough). This PR skips the concatenation when a worker has a single partition of data, as sketched below.
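Roughly, the change amounts to a guard like the following (a minimal sketch of the idea, not the actual diff; the function and variable names are illustrative):

import cupy as cp

def _to_single_array(parts):
    # parts: the list of CuPy array partitions held by one worker.
    # With exactly one partition, return it as-is instead of calling
    # cp.concatenate on a one-element list, which can trigger a copy.
    if len(parts) == 1:
        return parts[0]
    return cp.concatenate(parts)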

This stems from CuPy behavior: some testing reveals that it sometimes creates an extra allocation when concatenating a list that consists of a single array:

>>> import cupy as cp
>>> a = cp.random.rand(2000000, 250).astype(cp.float32)  # memory occupied: 5936 MB
>>> b = [a]
>>> c = cp.concatenate(b)  # memory occupied: 5936 MB <- no memory copy

>>> import cupy as cp
>>> a = cp.random.rand(1000000, 250)  # memory occupied: 2120 MB
>>> b = [a]
>>> c = cp.concatenate(b)  # memory occupied: 4028 MB <- a memory copy was performed!

I'm not sure what exact rules CuPy follows here (we could check), but avoiding the concatenate when we have a single partition is an easy fix that does not depend on behavior outside of cuML's code.
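If we ever want to verify CuPy's behavior directly, one quick way to tell whether concatenate reused the buffer or made a copy is to compare device pointers (a sketch, assuming a recent CuPy that provides cp.shares_memory):

>>> import cupy as cp
>>> a = cp.random.rand(1000, 250).astype(cp.float32)
>>> c = cp.concatenate([a])
>>> a.data.ptr == c.data.ptr  # True means the same device buffer was reused
>>> cp.shares_memory(a, c)    # False would indicate a copy was made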

cc @tfeher @cjnolet

@github-actions bot added the Cython / Python label on Jun 15, 2024
@dantegd changed the title from "void extra memory copy when using cp.concatenate" to "Avoid extra memory copy when using cp.concatenate in cuml.dask kmeans" on Jun 15, 2024
@dantegd added the bug and non-breaking labels on Jun 15, 2024
@dantegd marked this pull request as ready for review on Jun 15, 2024 at 23:07
@dantegd requested a review from a team as a code owner on Jun 15, 2024 at 23:07
@achirkin (Contributor) left a comment
Thanks, @dantegd, LGTM. This fixes one problem on the cuml side, but I believe it does not fully close #5936, because a double allocation could still happen when the data is being distributed among the workers. That, however, is potentially a problem in dask / zict rather than in cuml.

@achirkin (Contributor) left a follow-up comment

(I meant to approve, not comment, in my previous message.)

@tfeher (Contributor) left a comment

Thanks @dantegd for the fix, LGTM!

@dantegd (Member, Author) commented on Jun 24, 2024

/merge

@dantegd (Member, Author) commented on Jul 8, 2024

/merge

@rapids-bot merged commit 50ec050 into rapidsai:branch-24.08 on Jul 8, 2024
60 of 62 checks passed
Labels: bug (Something isn't working), Cython / Python (Cython or Python issue), non-breaking (Non-breaking change)
Projects: None yet
4 participants