Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] atomic_memory_order_capabilities query for device and context #8517

Merged

Conversation

andylshort
Copy link

@andylshort andylshort commented Mar 2, 2023

This patch implements the atomic_memory_order_capabilities query in the OpenCL and Level Zero backends/plugins for device and context

Specifically:

  • OpenCL <2.0 returns the minimum required capability set (relaxed) defined in Section 4.2 of the OpenCL 3.0 specification.
  • OpenCL <3.0 and Level Zero backends return all memory order capabilities.
  • OpenCL >=3.0 queries the actual device to get the supported memory order capabilities.

E2E test have also been updated to reflect these changes: intel/llvm-test-suite#1627

Lamzed-Short, Andrew added 3 commits February 28, 2023 07:07
This satsifies llvm-test-suite/SYCL/AtomicRef/atomic-memory-order.cpp if
you remove the "UNSUPPORTED OpenCL" test line, but doesn't implement the
other capabilities this could return. Further discussion is required
for further work to continue.
Minimal implementaton for now that calls similar query for each device
and combines them. Currently checks for memory_order::relaxed, others
coming soon.
sycl/plugins/opencl/pi_opencl.cpp Outdated Show resolved Hide resolved
sycl/plugins/opencl/pi_opencl.cpp Outdated Show resolved Hide resolved
sycl/plugins/opencl/pi_opencl.cpp Outdated Show resolved Hide resolved
sycl/plugins/opencl/pi_opencl.cpp Outdated Show resolved Hide resolved
@andylshort andylshort temporarily deployed to aws March 4, 2023 12:24 — with GitHub Actions Inactive
Lamzed-Short, Andrew added 4 commits March 8, 2023 08:45
Covering all current OpenCL versions
Return full set of them as in OCL 2.x
Instead of resolving it as its own plugin call with PI code, just call the
related one for each of its devices and intersect the results together.
@andylshort andylshort temporarily deployed to aws March 9, 2023 12:54 — with GitHub Actions Inactive
Lamzed-Short, Andrew added 2 commits March 9, 2023 05:13
Unsure whether ACQUIRE and RELEASE need removing as current pi.h expects
5 bits for memory order info but the OCL headers say only 3 are necessary
in 3.0 as the values are RELAXED, ACQ_REL, and SEQ_CST.
@andylshort andylshort temporarily deployed to aws March 9, 2023 14:09 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 9, 2023 14:42 — with GitHub Actions Inactive
ACQ_REL infers ACQUIRE and RELEASE, so adding them to the return value.
@andylshort
Copy link
Author

This will need to wait for #8595 to be merged first to resolve potential merge conflicts.

Lamzed-Short, Andrew added 2 commits March 10, 2023 08:00
Moved mask location, added bitwise operations to convert CL to PI enum
values, masked off relevant bits, and improved handling of out parameters.
@andylshort andylshort temporarily deployed to aws March 10, 2023 16:03 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 10, 2023 16:13 — with GitHub Actions Inactive
@andylshort andylshort force-pushed the alamzeds/atomic_mem_order_caps_fix branch from 40a96e4 to 58c53ad Compare March 10, 2023 16:43
@andylshort andylshort temporarily deployed to aws March 10, 2023 16:46 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 10, 2023 17:32 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 16, 2023 11:21 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 16, 2023 11:51 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 17, 2023 18:46 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 17, 2023 19:36 — with GitHub Actions Inactive
@andylshort andylshort marked this pull request as ready for review March 20, 2023 14:46
@andylshort andylshort requested review from a team as code owners March 20, 2023 14:46
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. LGTM!

@andylshort
Copy link
Author

I'm just testing a patch that addresses this comment from #8586 (comment) as there now are UR equivalents of the PI flags I needed, so just need to hold fire on the merge for a short while.

@andylshort andylshort temporarily deployed to aws March 23, 2023 11:31 — with GitHub Actions Inactive
@andylshort andylshort temporarily deployed to aws March 23, 2023 12:01 — with GitHub Actions Inactive
@AlexeySachkov AlexeySachkov dismissed their stale review March 23, 2023 13:40

My comments were resolved, dismissing that review

@AlexeySachkov AlexeySachkov merged commit b18e6ea into intel:sycl Mar 23, 2023
veselypeta pushed a commit to veselypeta/llvm that referenced this pull request Sep 21, 2023
…ntel#8517)

This patch implements the `atomic_memory_order_capabilities` query in
the OpenCL and Level Zero backends/plugins for `device` and `context`

Specifically: 
- OpenCL <2.0 returns the minimum required capability set (`relaxed`)
defined in [Section 4.2 of the OpenCL 3.0
specification](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES).
- OpenCL <3.0 and Level Zero backends return all memory order
capabilities.
- OpenCL >=3.0 queries the actual device to get the supported memory
order capabilities.

E2E test have also been updated to reflect these changes:
intel/llvm-test-suite#1627
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants