Skip to content

[EXP][CUDA] Add initial version of (kernel) Launch Properties extension.#1643

Merged
kbenzie merged 22 commits intooneapi-src:mainfrom
JackAKirk:test-kernel-launch-exp
May 28, 2024
Merged

[EXP][CUDA] Add initial version of (kernel) Launch Properties extension.#1643
kbenzie merged 22 commits intooneapi-src:mainfrom
JackAKirk:test-kernel-launch-exp

Conversation

@JackAKirk
Copy link
Contributor

@JackAKirk JackAKirk commented May 21, 2024

Introduces an extendable extension for kernel launch properties and a full cuda implementation of the initial extension spec. Currently only three Launch Properties are supported on the cuda backend only:

  • UR_EXP_LAUNCH_PROPERTY_ID_CLUSTER_DIMENSION
  • UR_EXP_LAUNCH_PROPERTY_ID_COOPERATIVE
  • UR_EXP_LAUNCH_PROPERTY_ID_IGNORE (does nothing, but useful for testing).

Note UR_EXP_LAUNCH_PROPERTY_ID_IGNORE and UR_EXP_LAUNCH_PROPERTY_ID_COOPERATIVE can also be supported on l0 and hip in a later PR.

For more information you can read the accompanying extension .rst. Also see my comments here: #1610

Compared to the proposal from #1610 this PR uses a simpler implementation where we only introduce one new UR function.
I also switched from using opaque UR types to using UR properties that are defined for all backends (but wouldn't have to be supported for all backends).
I made this decision because even if we use opaque types, I think programmers will have to have different code versions for each backend if their underlying property types don't match. The way it is now future backend specific properties could be be added to the UR struct/enum property types.
I decided that this would be clearer and easier, and means we only need one new UR function instead of two, but we could always change this back if people have a different opinion, or if it turns out if other backends introduce new properties that it is better to use an opaque ur_exp_launch_property_t type.

JackAKirk added 5 commits May 20, 2024 18:57
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@github-actions github-actions bot added loader Loader related feature/bug conformance Conformance test suite issues. specification Changes or additions to the specification experimental Experimental feature additions/changes/specification cuda CUDA adapter specific issues labels May 21, 2024
JackAKirk added 3 commits May 22, 2024 11:57
- Move new function to enqueue.cpp.
- Fix impl/tests.
- Add device extension string.
- Clean up documentation.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
add match files.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk force-pushed the test-kernel-launch-exp branch from 1db3eee to 5059b50 Compare May 22, 2024 14:26
@JackAKirk JackAKirk changed the title [draft][exp] Test kernel attributes launch extension. [EXP][CUDA] Add kernel launch attributes extension. May 22, 2024
@JackAKirk JackAKirk marked this pull request as ready for review May 22, 2024 14:36
@JackAKirk JackAKirk requested review from a team as code owners May 22, 2024 14:36
@JackAKirk JackAKirk requested a review from steffenlarsen May 22, 2024 14:36
@JackAKirk
Copy link
Contributor Author

FYI this is highest priority.

Copy link
Contributor

@hdelan hdelan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM just a few points

}

// Preconditions
UR_ASSERT(hQueue->getContext() == hKernel->getContext(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd be more concerned that hQueue->getDevice() == hKernel->getProgram()->getDevice()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I forgot about the multi-context patch. I've now made this consistent with it.

UR_APIEXPORT ur_result_t UR_APICALL urEnqueueKernelLaunchCustomExp(
ur_queue_handle_t hQueue, ur_kernel_handle_t hKernel, uint32_t workDim,
const size_t *pGlobalWorkSize, const size_t *pLocalWorkSize,
uint32_t numAttrsInLaunchAttrList,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think property lists in general are done with a linked list sort of approach, meaning you don't need numAttrsInLaunchAttrList. See here:

https://github.com/oneapi-src/unified-runtime/pull/1645/files#diff-dbbf71699d9e5cef43dc017eca1e433dc7b7ae229fcbfcb07662e3c63ea08426R9294

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yeah I think I should change this to make it consistent with the other properties.

Copy link
Contributor Author

@JackAKirk JackAKirk May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some test implementations and discussion we decided that whilst it would be nice to have a consistent properties protocol, the existing way is the best option atm, because:

  • the most common way of passing properties via a linked list of descriptor and properties would work, but there are a couple of small things that makes it not exactly natural for this interface/property type combination. There are existing interfaces in ur that also pass properties in different ways
  • The existing way is the most natural mapping to how cuda uses these properties (attributes)
  • This is experimental and we could potentially update it later: Once we know about other backend requirements we will probably have to update some aspects of the extension anyway.

However I have changed the naming from attributes -> properties, because in UR attributes is used for things like device queries, and "properties" is consistent with UR naming for passing in properties to functions.

JackAKirk added 6 commits May 22, 2024 17:00
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk changed the title [EXP][CUDA] Add kernel launch attributes extension. [EXP][CUDA] Add initial version of (kernel) Launch Properties extension. May 24, 2024
Copy link
Contributor

@hdelan hdelan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CUDA adapter LGTM. Thanks for changes

@JackAKirk
Copy link
Contributor Author

JackAKirk commented May 24, 2024

I just have to run the script again with main merged, and also I forgot to update a test directory/file names: attribute -> property

JackAKirk added 3 commits May 24, 2024 15:12
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
JackAKirk and others added 2 commits May 24, 2024 16:24
Use ${x} prefixes for ur types in .rst

Co-authored-by: Kenneth Benzie (Benie) <k.benzie83@gmail.com>
remove monospace for ${X} usage.

Co-authored-by: Kenneth Benzie (Benie) <k.benzie83@gmail.com>
@JackAKirk
Copy link
Contributor Author

JackAKirk commented May 27, 2024

@kbenzie could this be merged asap?
I've tested it on sm_90 now for the optional feature and it works.
Thanks

@JackAKirk JackAKirk added the ready to merge Added to PR's which are ready to merge label May 27, 2024
@kbenzie
Copy link
Contributor

kbenzie commented May 27, 2024

@kbenzie could this be merged asap?
I've tested it on sm_90 now for the optional feature and it works.
Thanks

Will do, there's one PR already in flight so will be after that.

@JackAKirk JackAKirk marked this pull request as draft May 27, 2024 14:24
@JackAKirk JackAKirk removed the ready to merge Added to PR's which are ready to merge label May 27, 2024
This is a superficial update missed previously.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk added the ready to merge Added to PR's which are ready to merge label May 27, 2024
@JackAKirk JackAKirk marked this pull request as ready for review May 27, 2024 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

conformance Conformance test suite issues. cuda CUDA adapter specific issues experimental Experimental feature additions/changes/specification loader Loader related feature/bug ready to merge Added to PR's which are ready to merge specification Changes or additions to the specification

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments