Use stream in mul_add if given and allocator in subset_sum #438
subset_sum(): use allocator if given.
Thanks! A couple nits, then this should be good to go.
pycuda/gpuarray.py
Outdated
@@ -2087,7 +2087,8 @@ def subset_sum(subset, a, dtype=None, stream=None, allocator=None):
     from pycuda.reduction import get_subset_sum_kernel

     krnl = get_subset_sum_kernel(dtype, subset.dtype, a.dtype)
-    return krnl(subset, a, stream=stream)
+    return krnl(subset, a, stream=stream,
+            allocator=drv.mem_alloc if allocator is None else allocator)
This should try to get the allocator off of one of the two arrays.
It could use `a`'s allocator.
On the other hand, looking at the other reduction functions (`sum`, `all`, `any`, ...), they all just pass `allocator` to the kernel, even if it's `None`. So maybe we should do that here for consistency.
(I actually added the `if allocator is None...` by mistake: the kernel call works with `allocator=None`; it's only some functions like `to_gpu` which need a real allocator.)
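The difference between the two variants discussed in this thread can be sketched with stand-in objects. This is a minimal illustration, not PyCUDA code: `FakeArray`, `fake_kernel`, `default_alloc`, `pool_alloc`, and both `subset_sum_*` helpers are hypothetical names. The sketch assumes, as the comment above says, that the kernel call accepts `allocator=None`, and it further assumes the kernel then falls back to the allocator of an input array.

```python
# Stand-in sketch; none of these names are PyCUDA APIs.

def default_alloc(nbytes):
    """Hypothetical default allocator (plays the role of a driver-level alloc)."""
    return ("default", nbytes)


def pool_alloc(nbytes):
    """Hypothetical pooled allocator attached to an array."""
    return ("pool", nbytes)


class FakeArray:
    """Minimal array stand-in that remembers the allocator it was created with."""

    def __init__(self, data, allocator=default_alloc):
        self.data = list(data)
        self.allocator = allocator


def fake_kernel(subset, a, stream=None, allocator=None):
    """Stand-in reduction kernel, assumed to tolerate allocator=None."""
    if allocator is None:
        # Assumed fallback: reuse the allocator of an input array.
        allocator = a.allocator
    result = sum(a.data[i] for i in subset.data)
    return FakeArray([result], allocator=allocator)


def subset_sum_passthrough(subset, a, stream=None, allocator=None):
    # Variant agreed on in the thread: hand the argument through unchanged.
    return fake_kernel(subset, a, stream=stream, allocator=allocator)


def subset_sum_forced_default(subset, a, stream=None, allocator=None):
    # Variant from the outdated diff: None is replaced before the kernel
    # sees it, so the array's own allocator is never considered.
    return fake_kernel(
        subset, a, stream=stream,
        allocator=default_alloc if allocator is None else allocator)
```

With an input array created from `pool_alloc`, the pass-through variant returns a result tied to `pool_alloc`, while the forced-default variant always ends up with `default_alloc` unless the caller passes an allocator explicitly.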
That sounds good. Thanks!
Co-authored-by: Andreas Klöckner <inform@tiker.net>
Hi Andreas,
Somehow gpuarrays' `mul_add` did not use the supplied `stream` parameter. Similarly, the `allocator` parameter of `subset_sum` was not used either.
This PR should fix that, and also adds an optional `out` parameter to `mul_add`.
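The pattern described here (forward the caller's `stream`, and write into an optional `out` buffer) can be sketched without a GPU. This is only an illustration under stated assumptions: `FakeStream`, `DEFAULT_STREAM`, `launch_axpbyz`, and this list-based `mul_add` are hypothetical stand-ins, and the argument order is not meant to match PyCUDA's actual signature.

```python
# Stand-in sketch; FakeStream and this mul_add are hypothetical, not PyCUDA's API.

class FakeStream:
    """Records the names of operations enqueued on it."""

    def __init__(self):
        self.ops = []


DEFAULT_STREAM = FakeStream()


def launch_axpbyz(fa, a, fb, b, out, stream=None):
    """Stand-in elementwise launch: out[i] = fa*a[i] + fb*b[i]."""
    target = DEFAULT_STREAM if stream is None else stream
    target.ops.append("axpbyz")
    for i in range(len(a)):
        out[i] = fa * a[i] + fb * b[i]


def mul_add(a, fa, b, fb, out=None, stream=None):
    """Compute fa*a + fb*b, optionally into a caller-supplied buffer."""
    if out is None:
        # No buffer given: allocate a fresh result.
        out = [0] * len(a)
    # Forward the caller's stream instead of silently dropping it.
    launch_axpbyz(fa, a, fb, b, out, stream=stream)
    return out
```

Calling `mul_add(x, 2, y, 3, out=buf, stream=s)` in this sketch returns `buf` itself and records the launch on `s` rather than on the default stream.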