[EXP][CUDA] Add initial version of (kernel) Launch Properties extension.#1643
kbenzie merged 22 commits into oneapi-src:main
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
- Move new function to enqueue.cpp. - Fix impl/tests. - Add device extension string. - Clean up documentation. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
add match files. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
1db3eee to 5059b50
FYI this is highest priority.
source/adapters/cuda/enqueue.cpp
Outdated
  }

  // Preconditions
  UR_ASSERT(hQueue->getContext() == hKernel->getContext(),
Nit: I'd be more concerned that hQueue->getDevice() == hKernel->getProgram()->getDevice()
Thanks, I forgot about the multi-context patch. I've now made this consistent with it.
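The device check suggested in this thread could look roughly like the sketch below. The `Device`, `Program`, `Kernel` and `Queue` structs are hypothetical stand-ins, not the adapter's real handle types; only the shape of the comparison comes from the review comment.

```cpp
#include <cassert>

// Hypothetical stand-ins for the adapter's handle types (illustration only).
struct Device {
  int id;
};

struct Program {
  const Device *device;
  const Device *getDevice() const { return device; }
};

struct Kernel {
  const Program *program;
  const Program *getProgram() const { return program; }
};

struct Queue {
  const Device *device;
  const Device *getDevice() const { return device; }
};

// True when the kernel's program was built for the device the queue submits to.
bool launchTargetsMatch(const Queue &queue, const Kernel &kernel) {
  return queue.getDevice() == kernel.getProgram()->getDevice();
}
```

With several devices in one context, a context comparison alone would pass even when the queue and the kernel's program target different devices, which is the case this comparison is meant to catch.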
source/adapters/cuda/enqueue.cpp
Outdated
UR_APIEXPORT ur_result_t UR_APICALL urEnqueueKernelLaunchCustomExp(
    ur_queue_handle_t hQueue, ur_kernel_handle_t hKernel, uint32_t workDim,
    const size_t *pGlobalWorkSize, const size_t *pLocalWorkSize,
    uint32_t numAttrsInLaunchAttrList,
I think property lists in general are done with a linked list sort of approach, meaning you don't need numAttrsInLaunchAttrList. See here:
Thanks, yeah I think I should change this to make it consistent with the other properties.
After some test implementations and discussion, we decided that although a consistent properties protocol would be nice, the existing approach is the best option for now, because:
- The most common way of passing properties, a linked list of descriptors and properties, would work, but a couple of small things make it a less than natural fit for this interface/property type combination. Existing UR interfaces also pass properties in other ways.
- The existing approach is the most natural mapping to how CUDA uses these properties (attributes).
- This is experimental and we could update it later: once we know about other backends' requirements we will probably have to update some aspects of the extension anyway.
However, I have changed the naming from attributes -> properties, because in UR "attributes" is used for things like device queries, while "properties" is consistent with UR naming for passing properties to functions.
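For readers comparing the two options discussed above, here is a minimal sketch of a counted array next to a linked list. All names (`PropId`, `ArrayProp`, `ChainedProp`, and both helpers) are hypothetical and not part of UR, and it assumes an "ignore" id marks an entry that should be skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical property ids, for illustration only.
enum PropId : uint32_t { PROP_IGNORE = 0, PROP_COOPERATIVE = 1 };

// Style 1: counted array. The caller passes a pointer plus an element count.
struct ArrayProp {
  PropId id;
  int value;
};

// Style 2: linked list. Each node points at the next one; nullptr ends it.
struct ChainedProp {
  PropId id;
  int value;
  const ChainedProp *pNext;
};

// Counts entries that are not marked "ignore" in a counted array.
size_t countActiveArray(const ArrayProp *props, uint32_t numProps) {
  size_t n = 0;
  for (uint32_t i = 0; i < numProps; ++i)
    if (props[i].id != PROP_IGNORE)
      ++n;
  return n;
}

// Counts entries that are not marked "ignore" in a linked list.
size_t countActiveChain(const ChainedProp *head) {
  size_t n = 0;
  for (const ChainedProp *p = head; p != nullptr; p = p->pNext)
    if (p->id != PROP_IGNORE)
      ++n;
  return n;
}
```

The array form needs the extra count parameter that the review comment questioned, but it can be handed to an API that itself takes a pointer and a count without rebuilding the list first.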
hdelan
left a comment
CUDA adapter LGTM. Thanks for changes
I just have to run the script again with main merged, and I also forgot to update the test directory/file names: attribute -> property.
Use ${x} prefixes for ur types in .rst
Co-authored-by: Kenneth Benzie (Benie) <k.benzie83@gmail.com>
remove monospace for ${X} usage.
Co-authored-by: Kenneth Benzie (Benie) <k.benzie83@gmail.com>
@kbenzie could this be merged asap?
Will do, there's one PR already in flight so will be after that.
This is a superficial update missed previously. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Introduces an extendable extension for kernel launch properties and a full CUDA implementation of the initial extension spec. Currently three launch properties are supported, on the CUDA backend only:
Note
UR_EXP_LAUNCH_PROPERTY_ID_IGNORE and UR_EXP_LAUNCH_PROPERTY_ID_COOPERATIVE can also be supported on l0 and hip in a later PR.
For more information you can read the accompanying extension .rst. Also see my comments here: #1610
Compared to the proposal from #1610 this PR uses a simpler implementation where we only introduce one new UR function.
I also switched from using opaque UR types to using UR properties that are defined for all backends (but wouldn't have to be supported for all backends).
I made this decision because even if we use opaque types, I think programmers will need different code versions for each backend if the underlying property types don't match. As it stands, future backend-specific properties could be added to the UR struct/enum property types.
I decided that this would be clearer and easier, and it means we only need one new UR function instead of two. We could always change this back if people have a different opinion, or if other backends introduce new properties that make an opaque ur_exp_launch_property_t type the better choice.
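To illustrate the design choice described above, here is a minimal sketch of a non-opaque property type (an id enum plus a value union) and of how an adapter might consume it. Every name here (`LaunchPropId`, `LaunchPropValue`, `LaunchProp`, `Result`, `LaunchConfig`, `applyProps`) is hypothetical and does not reproduce the UR definitions; the `FutureBackendSpecific` id stands in for a property that some other backend might add later.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical property ids shared by all backends (illustration only).
enum class LaunchPropId : uint32_t { Ignore, Cooperative, FutureBackendSpecific };

// Hypothetical value union: one member per property that carries a value.
union LaunchPropValue {
  int cooperative;
  uint32_t reserved[3];
};

struct LaunchProp {
  LaunchPropId id;
  LaunchPropValue value;
};

enum class Result { Success, UnsupportedFeature };

// Hypothetical backend-side launch settings filled in from the properties.
struct LaunchConfig {
  bool cooperative = false;
};

// A backend handles the ids it knows and reports the rest as unsupported.
Result applyProps(const LaunchProp *props, uint32_t numProps,
                  LaunchConfig &cfg) {
  for (uint32_t i = 0; i < numProps; ++i) {
    switch (props[i].id) {
    case LaunchPropId::Ignore:
      break; // Entry carries no request; skip it.
    case LaunchPropId::Cooperative:
      cfg.cooperative = props[i].value.cooperative != 0;
      break;
    default:
      return Result::UnsupportedFeature;
    }
  }
  return Result::Success;
}
```

Because the ids and value types are visible to every backend, adding a property means extending the enum and union, and each adapter decides independently whether to accept it.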