Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add blockwise add and count in CUDA and SYCL #610

Merged
merged 2 commits into from
Jun 18, 2024

Conversation

stephenswat
Copy link
Member

This commit extends our barrier types with the blockAnd and blockCount methods, the latter of which is a ballot vote across a work group. Implemented in both CUDA and SYCL.

@stephenswat stephenswat added feature New feature or request cuda Changes related to CUDA sycl Changes related to SYCL labels Jun 11, 2024
Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing against this, I just fear that without any actual usage for the new functions in this PR, we wouldn't know if they work as they should or not. 🤔 I.e. I would've added these new functions in the PR in which they first get used.

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the tests in general, just have a lot of technical comments about them. 😛


testBarrierCount<<<1, 1024>>>(out);

ASSERT_EQ(cudaPeekAtLastError(), cudaSuccess);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a synchronous kernel execution you might as well do:

Suggested change
ASSERT_EQ(cudaPeekAtLastError(), cudaSuccess);
ASSERT_EQ(cudaDeviceSynchronize(), cudaSuccess);

Maybe in addition to your call. Though for "synchronous kernels" I've rather been using cudaGetLastError() instead. 🤔 For instance:

https://github.com/acts-project/vecmem/blob/main/tests/cuda/test_cuda_containers_kernels.cu#L59-L60

@stephenswat
Copy link
Member Author

stephenswat commented Jun 13, 2024 via email

This commit extends our barrier types with the `blockAnd` and
`blockCount` methods, the latter of which is a ballot vote across a work
group. Implemented in both CUDA and SYCL.
This commit adds tests for the AND, OR, and COUNT methods on the barrier
types for the CUDA and SYCL platforms.
@stephenswat stephenswat force-pushed the feat/barrier_and_count branch from fcbeff9 to fbe62d8 Compare June 18, 2024 12:39
@stephenswat
Copy link
Member Author

Okay, this is ready to go now. 👍

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm overall very happy with how this looks.

The one thing that could be good to do for consistency is to move traccc::cuda::barrier into a private header as well. Since I very much agree with how you moved the SYCL header.

@stephenswat stephenswat enabled auto-merge June 18, 2024 13:18
@stephenswat stephenswat merged commit 7cd882e into acts-project:main Jun 18, 2024
19 of 21 checks passed
stephenswat added a commit to stephenswat/traccc that referenced this pull request Jun 18, 2024
This fixes an embarassing bug in acts-project#610 where I had apparently
mass-changed `barrierCount` into `barrierOr`. Oops!
stephenswat added a commit to stephenswat/traccc that referenced this pull request Jun 18, 2024
This fixes an embarassing bug in acts-project#610 where I had apparently
mass-changed `barrierCount` into `barrierOr`. Oops!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda Changes related to CUDA feature New feature or request sycl Changes related to SYCL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants