[REVIEW] Improve Cython Lifetime Management by Adding References in DeviceBuffer
#661
We should remove this test entirely. The overhead added to the result of
The addition of Python attributes to My question is: does this open the possibility of Python's GC destroying the If we're sure that typical usage patterns won't result in reference cycles, then maybe this is not worth worrying about :)
Sounds good. Using
That's a great question. The Cython docs have this section on how to handle that scenario. I couldn't completely rule out the possibility so I had considered adding
I am going to defer to you on this one. I couldn't think of a situation, but I am hardly an advanced user. Additionally, it's worth considering whether this may come up in the future given the rmm team's current plans.
Not sure when I mentioned this to you. However, it's not that you shouldn't hold a reference to a Python stream object for a stream you don't own. It's just that if RMM doesn't own the underlying Does this PR overlap #650?
LGTM, thanks @mdemoret-nv
@@ -81,6 +84,10 @@ cdef class DeviceBuffer:
        if stream.c_is_default():
            stream.c_synchronize()

        # Save a reference to the MR and stream used for allocation
        self.mr = get_current_device_resource()
Do we want to allow passing a MR to the constructor?
For now, I'd say let's not do that. I'd much rather we support multiple MRs with a context manager. That is, I prefer the first approach below to the second:
mr = MemoryResource(...)
with using_memory_resource(mr):
    dbuf_1 = DeviceBuffer(...)
    dbuf_2 = DeviceBuffer(...)
vs.
mr = MemoryResource(...)
dbuf_1 = DeviceBuffer(..., mr=mr)
dbuf_2 = DeviceBuffer(..., mr=mr)
There are arguments in favour of both though.
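To make the first option concrete, here is a minimal pure-Python sketch of how such a context manager could behave. Everything in it is an assumption for illustration: `using_memory_resource` is only the name proposed above, not an existing function, and `get_current_resource`, `set_current_resource` and the toy `Buffer` class are hypothetical stand-ins rather than the rmm API.

```python
from contextlib import contextmanager

# Hypothetical stand-in for the library's "current resource" slot.
# A single module-level variable is assumed; per-device bookkeeping is not modelled.
_current_resource = "default"


def get_current_resource():
    return _current_resource


def set_current_resource(mr):
    global _current_resource
    _current_resource = mr


@contextmanager
def using_memory_resource(mr):
    """Hypothetical helper: make `mr` the current resource inside the block."""
    previous = get_current_resource()
    set_current_resource(mr)
    try:
        yield mr
    finally:
        # Restore the previous resource even if the body raises.
        set_current_resource(previous)


class Buffer:
    """Toy stand-in for DeviceBuffer: remembers the resource current at creation."""

    def __init__(self):
        self.mr = get_current_resource()


with using_memory_resource("pool"):
    dbuf_1 = Buffer()
    dbuf_2 = Buffer()
dbuf_3 = Buffer()  # created after the block, so it sees the restored resource
```

Because each buffer records the resource that was current when it was created, the buffers made inside the block keep pointing at `"pool"` after the block exits, which is the same idea as saving `self.mr` in the diff above.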
The context manager approach also follows the "One -- and preferably only one -- obvious way to do it" Zen.
If I remember correctly, Dask will call
@mdemoret-nv your current PRs seem to have stalled. Today is code freeze for 0.18 so I'm going to push this to 0.19
Apologies, my GitHub notifications significantly increased in 0.18 and these PRs slipped through the cracks while I worked on cuml PRs. I think it would be better to get these PRs in a release cycle early anyways, just to ensure there are no unintended side effects. I will wrap these up once 0.18 has shipped.
PR is ready for re-review/merging.
Would like to let @shwina review again before merging
rerun tests
Going to merge this as the changes were minimal. Thanks @mdemoret-nv!
@gpucibot merge
#662)
Depends on #661
While working on PR #661, it looked like it was possible to remove the "owning" C++ memory resource adaptors in `memory_resource_adaptors.hpp`. This PR is a quick implementation of what that would look like to run through CI and get feedback.
The main driving factor of this PR is to eliminate the need for 2 layers of wrappers around every memory resource in the library. When adding new memory resources, C++ wrappers must be created in `memory_resource_adaptors.hpp` and Cython wrappers must be created in `memory_resource.pyx`, for any property/function that needs to be exposed at the Python level. This removes the C++ wrappers in favor of using Python's reference counting for lifetime management.
A few notes:
1. `MemoryResource` was renamed `DeviceMemoryResource` to more closely match the C++ class names. This can easily be changed back.
1. Upstream MRs are kept alive by a base class `UpstreamResourceAdaptor` that stores a single property `upstream_mr`. Any MR that has an upstream needs to inherit from this class.
1. Once the `UpstreamResourceAdaptor` was created, most of the work/changes were updating the Cython imports to use the C++ classes instead of the C++ wrappers.
1. This should make it easier to expose more methods/properties at the Python layer in the future.
Would appreciate any feedback.
Authors:
- Michael Demoret (@mdemoret-nv)
Approvers:
- Christopher Harris (@cwharris)
- Keith Kraus (@kkraus14)
- Ashwin Srinath (@shwina)
URL: #662
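The keep-alive idea described in that commit message can be illustrated with a small pure-Python sketch. The `Resource` and `UpstreamAdaptor` classes below are toy stand-ins invented for this example, not the rmm classes; the sketch assumes CPython, where an object is freed as soon as its last strong reference goes away.

```python
import gc
import weakref


class Resource:
    """Toy stand-in for a memory resource (not the rmm class)."""


class UpstreamAdaptor(Resource):
    """Toy adaptor that stores a strong reference to its upstream resource."""

    def __init__(self, upstream_mr):
        self.upstream_mr = upstream_mr


base = Resource()
base_ref = weakref.ref(base)  # weak reference: observes `base` without keeping it alive
adaptor = UpstreamAdaptor(base)

del base  # drop our own name; the adaptor still holds the upstream
gc.collect()
alive_while_wrapped = base_ref() is not None

del adaptor  # the last holder of the upstream goes away
gc.collect()
alive_after_adaptor_deleted = base_ref() is not None

print(alive_while_wrapped, alive_after_adaptor_deleted)
```

The upstream object survives for as long as the adaptor does, without any extra owning wrapper, because the stored attribute is an ordinary strong reference.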
This PR adds support for CuPy streams in `rmm_cupy_allocator`. It works by getting CuPy's current stream and passing that to the `DeviceBuffer` constructor. There's also a fix for the casting of CuPy/Numba streams to `cudaStream_t`: the stream needs to be cast to `uintptr_t` first; without that, the resulting pointer would be wrong and result in a segfault.
Depends on #661
Authors:
- Peter Andreas Entschev (@pentschev)
- Keith Kraus (@kkraus14)
Approvers:
- Keith Kraus (@kkraus14)
- @jakirkham
- Mark Harris (@harrism)
URL: #654
As discussed with @shwina, @harrism and @kkraus14, this PR adds 2 properties to `DeviceBuffer` to allow for automatic reference counting of `MemoryResource` and `Stream` objects. This will prevent any `MemoryResource` from being destructed while any `DeviceBuffer` that needs the MR for deallocation is still alive.
There are a few outstanding issues I could use input on:
`test_rmm_device_buffer` is failing due to the line: `sys.getsizeof(b) == b.size`. Need input on the best way forward.
`DeviceBuffer` is now involved in GC. Python automatically adds the GC memory overhead to `__size__` (see here) which makes it difficult to continue working the same way it has before.
`@cython.no_gc` which is very risky.
`Stream` objects the same. @harrism mentioned only streams owned by RMM should be tracked this way but I am not sure if that's necessary or how to distinguish them at this point.
Other than the above items, all tests are passing and I ran this through the cuML test suite without any issues. Thanks for your help.
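The `sys.getsizeof` behaviour behind the failing test can be reproduced without rmm. The sketch below assumes CPython, where `sys.getsizeof` adds the garbage-collector header to whatever `__sizeof__` returns for GC-tracked objects; the `Tracked` class and its size of 100 are made up for illustration.

```python
import sys


class Tracked:
    """Ordinary Python class; its instances are managed by the garbage collector."""

    def __sizeof__(self):
        # Made-up payload size, standing in for a buffer's reported size.
        return 100


t = Tracked()
reported = sys.getsizeof(t)              # __sizeof__() plus the GC header
gc_overhead = reported - t.__sizeof__()  # extra bytes added by sys.getsizeof

# Integers are not GC-tracked, so nothing is added for them.
int_overhead = sys.getsizeof(1) - (1).__sizeof__()

print(gc_overhead, int_overhead)
```

An equality check like the one in the test can therefore only hold for objects that are not GC-tracked, which matches the observation that it started failing once `DeviceBuffer` gained Python object attributes.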