[EXP][CUDA] Enable Large cluster sizes and fix cluster dimensions being set for dimensions less than 3#1765
AD2605 wants to merge 11 commits into oneapi-src:main
…luster launch is used
| if (workDim == 3) { | ||
| launch_attribute[i].value.clusterDim.x = |
Could use some help here.
I was not able to figure out where this flipping of the dimension order happens.
I can see the values being set in setKernelParams, but I could not work out how the order gets flipped.
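For readers following the question above, here is a minimal sketch of what such a flip can look like. It assumes the common convention that the last index of a SYCL range varies fastest and therefore corresponds to CUDA's x dimension. The names Dim3 and toCudaOrder are hypothetical and are not taken from this PR or from setKernelParams, and the sketch makes no claim about where the runtime actually performs the reversal.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a CUDA-style {x, y, z} triple.
struct Dim3 {
  unsigned x, y, z;
};

// Hypothetical helper (an assumption, not code from the PR): maps a range
// given slowest-varying index first to {x, y, z}, where x varies fastest.
// Dimensions beyond workDim are left at 1 rather than read from the input.
Dim3 toCudaOrder(const std::array<unsigned, 3> &range, std::size_t workDim) {
  Dim3 out{1, 1, 1};
  if (workDim >= 1)
    out.x = range[workDim - 1]; // last used index becomes x
  if (workDim >= 2)
    out.y = range[workDim - 2];
  if (workDim >= 3)
    out.z = range[workDim - 3]; // first index becomes z
  return out;
}
```

Under this assumption a 3-dimensional range {2, 4, 8} maps to x = 8, y = 4, z = 2, while a 2-dimensional range {4, 8} maps to x = 8, y = 4 and leaves z at 1.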
|
@oneapi-src/unified-runtime-cuda-write gentle ping re: this PR.
|
lit.py: /home/test-user/actions-runners/01/_work/unified-runtime/unified-runtime/sycl-repo/sycl/test-e2e/lit.cfg.py:718: error: Cannot detect device aspect for cuda:gpu
stdout:
Platforms: 0
default_selector() : No device of requested type available. -1 (PI_ERRO...
accelerator_selector() : No device of requested type available. -1 (PI_ERRO...
cpu_selector() : No device of requested type available. -1 (PI_ERRO...
gpu_selector() : No device of requested type available. -1 (PI_ERRO...
custom_selector(gpu) : No device of requested type available. -1 (PI_ERRO...
custom_selector(cpu) : No device of requested type available. -1 (PI_ERRO...
custom_selector(acc) : No device of requested type available. -1 (PI_ERRO...
Maybe re-triggering the job might help? |
|
Yeah, CI seems to be broken currently, same here: https://github.com/oneapi-src/unified-runtime/actions/runs/9584385509/job/26430113727?pr=1774 |
|
Hi, thanks for your patch. |
…ABLE_CLUSTER_SIZE_ALLOWED flag being added
So the test cases that are being added are in this PR: intel/llvm#14113, which would test this PR fully. I do not think I can add a test that checks the ordering of the cluster dimensions in this UR PR. However, I can increase the cluster size in the tests added in #1643 so that they cover the larger cluster sizes. Note, however, that this runs on SM_90 only, which I do not think is available on the CI. If you prefer, I can add a log of the test run on an H100? |
|
I've removed the ready-to-merge label since intel/llvm#14113 isn't passing CI and also has the abi-break label - it will need to wait for the ABI breaking window to open before it can be merged. |
Thanks, that would be great! |
|
This was included in #1804 which has now merged |
Fixes the ordering of the cluster dimensions so that it matches the grid dimensions.
Also adds a call to set
CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED to allow cluster sizes greater than 8.
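As a rough illustration of the limit mentioned in the description, the sketch below checks whether a requested cluster shape exceeds the portable maximum of 8 thread blocks per cluster. The helper name needsNonPortableClusterSize and the constant name are hypothetical. Whether the PR sets the attribute conditionally or unconditionally is not stated here, so the check is only an assumption about how one might decide. In the CUDA driver API the opt-in itself is done with cuFuncSetAttribute, passing CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED and a value of 1. That call is left out so the sketch compiles without the CUDA toolkit.

```cpp
// Portable upper bound on thread blocks per cluster; larger clusters
// require an explicit per-kernel opt-in and depend on the hardware.
constexpr unsigned kMaxPortableClusterSize = 8;

// Hypothetical helper (an assumption, not code from the PR): returns true
// when the total number of blocks in the cluster exceeds the portable limit.
// The product is computed in 64 bits to avoid overflow of unsigned.
bool needsNonPortableClusterSize(unsigned x, unsigned y, unsigned z) {
  const unsigned long long total =
      static_cast<unsigned long long>(x) * y * z;
  return total > kMaxPortableClusterSize;
}
```

For example, a 2 x 2 x 2 cluster has 8 blocks and stays within the portable limit, whereas a 4 x 4 x 1 cluster has 16 blocks and would need the opt-in.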