Add blockwise add and count in CUDA and SYCL #610
Nothing against this, I just fear that without any actual usage for the new functions in this PR, we wouldn't know if they work as they should or not. 🤔 I.e. I would've added these new functions in the PR in which they first get used.
I like the tests in general, just have a lot of technical comments about them. 😛
tests/cuda/test_barrier.cu
    testBarrierCount<<<1, 1024>>>(out);

    ASSERT_EQ(cudaPeekAtLastError(), cudaSuccess);
On a synchronous kernel execution you might as well do:
    ASSERT_EQ(cudaPeekAtLastError(), cudaSuccess);
    ASSERT_EQ(cudaDeviceSynchronize(), cudaSuccess);
Maybe in addition to your call. Though for "synchronous kernels" I've rather been using `cudaGetLastError()` instead. 🤔 For instance:
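A minimal sketch of the check being discussed, not the reviewer's original snippet. The kernel `writeIndex` and the helper `launchChecked` are hypothetical names made up for this illustration. It assumes a single block is launched and that the caller wants both launch-time and execution-time errors reported.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel, used only for this illustration.
__global__ void writeIndex(unsigned int* out) {
    out[threadIdx.x] = threadIdx.x;
}

// Launches the kernel on a single block and returns the first error seen.
cudaError_t launchChecked(unsigned int nThreads) {
    unsigned int* out = nullptr;
    cudaError_t err = cudaMalloc(&out, nThreads * sizeof(unsigned int));
    if (err != cudaSuccess) {
        return err;
    }

    writeIndex<<<1, nThreads>>>(out);

    // Launch-time errors (e.g. an invalid configuration) show up here.
    // Unlike cudaPeekAtLastError(), this call also resets the error state.
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        // Errors raised while the kernel runs are only reported once the
        // host has waited for the kernel to finish.
        err = cudaDeviceSynchronize();
    }

    cudaFree(out);
    return err;
}
```

In a GoogleTest-based test this could be used as `ASSERT_EQ(launchChecked(1024), cudaSuccess);`. Resetting the error state with `cudaGetLastError()` keeps a failure in one test from leaking into the next one, which is one possible reason to prefer it over `cudaPeekAtLastError()` here.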
Oh sorry I only pushed these so I could run them through act on pcadp04, I'll turn this into a draft and finalize.
This commit extends our barrier types with the `blockAnd` and `blockCount` methods, the latter of which is a ballot vote across a work group. Implemented in both CUDA and SYCL.
This commit adds tests for the AND, OR, and COUNT methods on the barrier types for the CUDA and SYCL platforms.
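For readers unfamiliar with these block-wide votes, here is a rough sketch of what such methods can look like on the CUDA side. `demo_barrier`, `voteOnEvenThreads` and `runVotes` are hypothetical names for this illustration and are not the project's actual classes or tests. The sketch assumes the methods are thin wrappers around the CUDA intrinsics `__syncthreads_and`, `__syncthreads_or` and `__syncthreads_count`.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for a barrier type; not the project's class.
struct demo_barrier {
    // Plain barrier across the thread block.
    __device__ void blockBarrier() const { __syncthreads(); }
    // True if the predicate holds for every thread in the block.
    __device__ bool blockAnd(bool p) const { return __syncthreads_and(p); }
    // True if the predicate holds for at least one thread in the block.
    __device__ bool blockOr(bool p) const { return __syncthreads_or(p); }
    // Number of threads in the block for which the predicate holds.
    __device__ int blockCount(bool p) const { return __syncthreads_count(p); }
};

// Every thread votes "true" if its index is even; thread 0 stores results.
__global__ void voteOnEvenThreads(int* out) {
    demo_barrier bar;
    const bool even = (threadIdx.x % 2 == 0);
    const int count = bar.blockCount(even);
    const bool all = bar.blockAnd(even);
    const bool any = bar.blockOr(even);
    if (threadIdx.x == 0) {
        out[0] = count;
        out[1] = all ? 1 : 0;
        out[2] = any ? 1 : 0;
    }
}

// Runs the kernel on one block and copies {count, all, any} to the host.
bool runVotes(unsigned int nThreads, int (&result)[3]) {
    int* dev = nullptr;
    if (cudaMalloc(&dev, 3 * sizeof(int)) != cudaSuccess) {
        return false;
    }
    voteOnEvenThreads<<<1, nThreads>>>(dev);
    bool ok = (cudaGetLastError() == cudaSuccess);
    // cudaMemcpy waits for the kernel to finish before copying.
    ok = ok && (cudaMemcpy(result, dev, 3 * sizeof(int),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(dev);
    return ok;
}
```

With 1024 threads in the block, half of the thread indices are even, so the count is 512, the AND vote is false and the OR vote is true. All three intrinsics also act as a barrier, so every thread of the block has to reach each call.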
fcbeff9 to fbe62d8
Okay, this is ready to go now. 👍
I'm overall very happy with how this looks.
The one thing that could be good to do for consistency is to move `traccc::cuda::barrier` into a private header as well, since I very much agree with how you moved the SYCL header.
This fixes an embarrassing bug in acts-project#610 where I had apparently mass-changed `barrierCount` into `barrierOr`. Oops!