[SYCL][CUDA] Add sub-group barrier #2606

Pennycook · 2020-10-07T17:50:18Z

Uses __nvvm_bar_warp_sync, which is equivalent to CUDA __syncwarp().
Because sub-group functions must always be called in converged control flow,
the membermask is always set to represent all active work-items in the warp.

Enabling this functionality requires that we switch to PTX 6.4, which is
consistent with the existing requirement to use CUDA 10.1.

Signed-off-by: John Pennycook <john.pennycook@intel.com>

Pennycook · 2020-10-07T17:55:42Z

Thanks to @Naghasan and @bader for their help in getting this working.

Also, a note to reviewers: I had some trouble getting CMake to handle the additional PTX flags correctly. I'm not a CMake expert, and would welcome any suggestions regarding how to improve what I've committed here. The issue as I understand it is that the list of compilation options constructed in libclc/CMakeLists.txt is passed to two functions in AddLibclc.cmake, but each function consumes those options differently.

One passes the options to add_target_options, which unhelpfully strips the second -Xclang option if it isn't prefixed with SHELL:. The other passes the options directly to add_custom_command unmodified, leaving SHELL: in the command line. The best solution I could find was to write the options assuming that the SHELL: prefix was required, then strip the prefix wherever it wasn't necessary.
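A minimal sketch of the approach described above, written as a standalone script that can be run with `cmake -P`. It assumes the de-duplication in question is the one CMake documents for `target_compile_options`, where the `SHELL:` prefix keeps grouped options such as `-Xclang <arg>` together. The variable name `plain_flags` is hypothetical, and the space-splitting step is an assumption added here so that each token becomes its own list element; the PR itself only mentions stripping the prefix.

```cmake
cmake_minimum_required( VERSION 3.12 )

# Flags written once, in the form that survives option de-duplication.
set( flags "SHELL:-Xclang -target-feature" "SHELL:-Xclang +ptx64" )

# Hypothetical copy for a hand-built command line: drop the prefix...
string( REPLACE "SHELL:" "" plain_flags "${flags}" )
# ...then (assumption, not from the PR) split each group into separate tokens.
string( REPLACE " " ";" plain_flags "${plain_flags}" )

message( STATUS "prefixed form:   ${flags}" )
message( STATUS "stripped form:   ${plain_flags}" )
```

The prefixed list would go to the compile-options path and the stripped list to `add_custom_command`, so the flags are still defined in one place.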

Naghasan · 2020-10-08T09:30:24Z

libclc/CMakeLists.txt

 				FILES generic/libspirv/sycldevice-binding.cpp)
 		endif()

 		add_libclc_builtin_set(libspirv-${arch_suffix}
 			TRIPLE ${t}
 			TARGET_ENV libspirv
-			COMPILE_OPT ${mcpu}
+			COMPILE_OPT ${flags}


COMPILE_OPT is a multi-value option, so you should be able to add the extra flags directly.

A longer-term solution might be to define the flags per arch_suffix (so they can be accessed later), but that can wait, I guess.
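To illustrate what "multi-value option" means here, the sketch below uses `cmake_parse_arguments` in a hypothetical stand-in function, `demo_builtin_set`. It is not the real `add_libclc_builtin_set`, whose body is not shown in this thread, and the triple and `-mcpu` values are placeholders. It only shows that several flags can follow a single multi-value keyword.

```cmake
cmake_minimum_required( VERSION 3.12 )

# Hypothetical stand-in: parses the same keywords that appear in the diff above.
function( demo_builtin_set name )
  cmake_parse_arguments( ARG "" "TRIPLE;TARGET_ENV" "COMPILE_OPT;FILES" ${ARGN} )
  # Every value after COMPILE_OPT, up to the next keyword, lands in one list.
  list( LENGTH ARG_COMPILE_OPT count )
  message( STATUS "${name}: ${count} compile option(s): ${ARG_COMPILE_OPT}" )
endfunction()

# Placeholder triple and -mcpu value, for illustration only.
demo_builtin_set( libspirv-demo
  TRIPLE nvptx64--nvidiacl
  TARGET_ENV libspirv
  COMPILE_OPT -mcpu=sm_50 "SHELL:-Xclang -target-feature" "SHELL:-Xclang +ptx64" )
```

Run with `cmake -P`, the function receives three separate compile options, each quoted `SHELL:` group staying intact as one list element.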

Naghasan · 2020-10-08T09:32:34Z

libclc/CMakeLists.txt

-			set( mcpu )
+			# FIXME: Ideally we would not be tied to a specific PTX ISA version
+			if( ${ARCH} STREQUAL nvptx OR ${ARCH} STREQUAL nvptx64 )
+				set( flags "SHELL:-Xclang -target-feature" "SHELL:-Xclang +ptx64")


Why is the "SHELL:" prefix needed here, together with the string( REGEX REPLACE "SHELL:" later on?

Pennycook · 2020-10-08

add_target_options only works if the SHELL: prefix is there, but add_custom_command only works if the SHELL: prefix is not there.

This is definitely a bit of a hack, but it seemed less error-prone than defining the same set of flags twice. If there's a more standard way to do this, please let me know and I'll fix it.

Naghasan · 2020-10-08

Makes sense. I'm no CMake expert, so I'm not quite sure how to make it better.

Pennycook added 2 commits October 7, 2020 13:44

[SYCL][CUDA] Enable sub-group barrier test

3b8dba1

Signed-off-by: John Pennycook <john.pennycook@intel.com>

Pennycook added the enhancement, spec extension, and cuda labels Oct 7, 2020

Pennycook requested review from bader and a team as code owners October 7, 2020 17:50

Pennycook requested a review from againull October 7, 2020 17:50

bader approved these changes Oct 7, 2020

againull approved these changes Oct 7, 2020

bader merged commit 551d706 into intel:sycl Oct 8, 2020

Naghasan reviewed Oct 8, 2020

kbenzie added a commit to kbenzie/intel-llvm that referenced this pull request Feb 17, 2025

Merge pull request intel#2606 from Bensuo/cmd-buf_enqueue_refactor

cc60d08

Rename urCommandBufferEnqueueExp to urEnqueueCommandBufferExp

kbenzie added a commit to kbenzie/intel-llvm that referenced this pull request Feb 17, 2025

Merge pull request intel#2688 from Bensuo/revert_2606

98756a2

Revert "Merge pull request intel#2606 from Bensuo/cmd-buf_enqueue_refactor"

Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025

Merge pull request #2606 from Bensuo/cmd-buf_enqueue_refactor

05dd502

Rename urCommandBufferEnqueueExp to urEnqueueCommandBufferExp

Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025

Merge pull request #2688 from Bensuo/revert_2606

ea8bbf6

Revert "Merge pull request #2606 from Bensuo/cmd-buf_enqueue_refactor"
