Acquire spill lock in to/from_arrow #13646

Merged


shwina
Contributor

@shwina shwina commented Jun 29, 2023

Description

Currently, calling to_arrow renders a cuDF Series/DataFrame unspillable. I believe it's just an oversight that we don't call to_arrow/from_arrow in an acquire_spill_lock context: neither method should expose the device pointer of a cuDF object externally.

cc: @madsbk @vyasr -- could you take a look please?

In [1]: import cudf
In [2]: s = cudf.Series([1, 2, 3])
In [3]: cudf.core.buffer.spill_manager.get_global_manager()
Out[3]: <SpillManager spill_on_demand=True device_memory_limit=N/A | 0B spilled | 24B (0%) unspilled (unspillable)>
In [4]: s.to_arrow()
Out[4]:
<pyarrow.lib.Int64Array object at 0x7f31e8f7ba60>
[
  1,
  2,
  3
]
In [5]: cudf.core.buffer.spill_manager.get_global_manager()
Out[5]: <SpillManager spill_on_demand=True device_memory_limit=N/A | 0B spilled | 24B (100%) unspilled (unspillable)>
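
To make the intent concrete, here is a toy model of the behaviour, written in plain Python so it runs without a GPU. It is only a sketch: ToyBuffer, ToySpillLock, toy_acquire_spill_lock and the two toy_to_arrow_* helpers are hypothetical names invented for illustration and are not the cuDF API. The assumption being modelled is that taking a device pointer outside a spill-lock context marks a buffer permanently unspillable, while taking it inside the context only holds the buffer for the duration of the call.

```python
import contextlib
import threading

# Toy model only: every name here is hypothetical and is NOT the cuDF API.
_state = threading.local()


class ToySpillLock:
    """Keeps buffers unspillable only while the lock is held."""

    def __init__(self):
        self._held = []

    def hold(self, buf):
        buf.active_locks += 1
        self._held.append(buf)

    def release(self):
        for buf in self._held:
            buf.active_locks -= 1
        self._held.clear()


class ToyBuffer:
    def __init__(self, nbytes):
        self.nbytes = nbytes
        self.exposed = False   # permanently unspillable once True
        self.active_locks = 0  # temporary holds by spill locks

    @property
    def spillable(self):
        return not self.exposed and self.active_locks == 0

    def get_ptr(self):
        lock = getattr(_state, "lock", None)
        if lock is None:
            # No lock in scope: the pointer may escape, so assume the worst.
            self.exposed = True
        else:
            # Lock in scope: pin the buffer only until the lock is released.
            lock.hold(self)
        return id(self)  # stand-in for a device pointer


@contextlib.contextmanager
def toy_acquire_spill_lock():
    previous = getattr(_state, "lock", None)
    lock = previous or ToySpillLock()  # nested contexts share the outer lock
    _state.lock = lock
    try:
        yield lock
    finally:
        _state.lock = previous
        if previous is None:
            lock.release()


def toy_to_arrow_unlocked(buf):
    buf.get_ptr()     # pointer taken with no lock in scope
    return [1, 2, 3]  # stand-in for the host-side copy


def toy_to_arrow_locked(buf):
    with toy_acquire_spill_lock():
        buf.get_ptr()     # pointer taken under the lock
        return [1, 2, 3]  # the host copy outlives the lock


before = ToyBuffer(24)
toy_to_arrow_unlocked(before)

after = ToyBuffer(24)
toy_to_arrow_locked(after)

print(before.spillable, after.spillable)
```

In this model the first buffer stays unspillable after the conversion, which mirrors the session above, and the second becomes spillable again as soon as the conversion returns.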

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the Python Affects Python cuDF API. label Jun 29, 2023
Contributor

@vyasr vyasr left a comment

Yeah I think this is just an oversight. Adding more acquire_spill_lock calls was always intended to be an ongoing task IIRC, although @madsbk can correct me if I've forgotten something.

Member

@madsbk madsbk left a comment

Right, since cudf::to_arrow returns an arrow::Table that owns the data, we can spill.
Good catch @shwina

@madsbk madsbk added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Jun 30, 2023
Member

@pentschev pentschev left a comment

Although this resolves a real problem that needs fixing, I just wanted to report here that it doesn't resolve the P2P shuffle issue we've seen; we still end up with errors such as the one below:

MemoryError: std::bad_alloc: CUDA error at: /datasets/pentschev/miniconda3/envs/cudf-invalid-address-src/include/rmm/mr/device/cuda_async_view_memory_resource.hpp:121: cudaErrorIllegalAddress an illegal memory access was encountered

With that said, the illegal address errors are probably due to something else.

@pentschev
Member

cc @quasiben @rjzamora @VibhuJawa for awareness: this doesn't resolve the illegal address errors, as I mentioned in #13646 (review).

@shwina shwina marked this pull request as ready for review June 30, 2023 19:12
@shwina shwina requested a review from a team as a code owner June 30, 2023 19:12
@shwina
Contributor Author

shwina commented Jun 30, 2023

/merge

@rapids-bot rapids-bot bot merged commit 2952f79 into rapidsai:branch-23.08 Jun 30, 2023
53 checks passed
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.

4 participants