Load ukernel bitcode as executable_object at the time of lowering to ukernels. #19323

Merged: 3 commits merged into main from users/bjacob/execobj on Dec 3, 2024

@bjacob bjacob (Contributor) commented Nov 27, 2024

  1. Moves the loading of ukernel bitcode from serializeExecutable to the GPULowerToUKernels pass.
  2. Whether an op can lower to a ukernel is now determined by whether the expected bitcode file is found. This allows removing several utility functions that implemented similar logic in different places.
  3. The GPULowerToUKernels pass first searches for existing bitcode in a hal.executable.objects attribute and loads the embedded ukernel bitcode only if none is found. In either case, it ensures that the resulting ukernel op has a hal.executable.objects attribute containing the necessary IR. This has several nice implications:
    • The IR becomes completely self-contained: a ukernel op is no longer an opaque interface to some bitcode at a distance.
    • Users can contribute their own bitcode from the outside by writing their own hal.executable.objects.
    • De-duplication of bitcode is handled by the HoistExecutableObjects pass.
    • Linking bitcode is handled by the generic linker code that links executable objects.
      • The only useful custom handling of ukernel symbols was adding AlwaysInline function attributes. This PR moves these attributes into the ukernel source code as [[clang::always_inline]]. I verified that these result in the expected alwaysinline in the bitcode.
  4. The ukernel bitcode is part of the ROCM plugin. The serializeExecutable implementation, which was the consumer of that data, is also in the ROCM plugin, but the GPULowerToUKernels pass, which is the new consumer, is outside of it. This required a mechanism to export such embedded data files from the ROCM plugin to the outside, which the new EmbeddedDataDirectory utility provides.
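
The lookup order described in items 2 to 4 can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: `ExecutableObject`, `EmbeddedDataRegistry`, and `resolveUKernelObject` are hypothetical names, and plain standard-library types stand in for the MLIR attributes and for the actual EmbeddedDataDirectory utility, whose API is not shown here.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a hal.executable.objects attribute.
struct ExecutableObject {
  std::string path;
  std::string data;
};

// Hypothetical stand-in for an embedded-data directory: a plugin registers
// its files here, and code outside of the plugin looks them up by name.
class EmbeddedDataRegistry {
public:
  // Returns false (and keeps the old contents) if `filename` already exists.
  bool addFile(const std::string &filename, const std::string &contents) {
    return files.emplace(filename, contents).second;
  }

  std::optional<std::string> getFile(const std::string &filename) const {
    auto it = files.find(filename);
    if (it == files.end())
      return std::nullopt;
    return it->second;
  }

private:
  std::map<std::string, std::string> files;
};

// Returns the object to attach to the ukernel op. A user-provided object
// takes precedence over the embedded one. An empty result means that no
// bitcode was found, so the op should not be lowered to a ukernel.
std::optional<ExecutableObject>
resolveUKernelObject(const std::vector<ExecutableObject> &existingObjects,
                     const EmbeddedDataRegistry &embedded,
                     const std::string &filename) {
  assert(!filename.empty() && "expected a bitcode file name");
  // 1. Prefer bitcode that is already present on the op.
  for (const ExecutableObject &object : existingObjects)
    if (object.path == filename)
      return object;
  // 2. Otherwise fall back to the bitcode embedded in the compiler.
  if (std::optional<std::string> contents = embedded.getFile(filename))
    return ExecutableObject{filename, *contents};
  // 3. No bitcode anywhere: the caller leaves the op unlowered.
  return std::nullopt;
}
```

In this sketch a single return value answers both questions from the description: whether the op can lower to a ukernel, and which object ends up attached to it.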

@bjacob bjacob force-pushed the users/bjacob/execobj branch 7 times, most recently from 0f950fc to 8cba1a0 Compare December 2, 2024 17:25
@bjacob bjacob requested a review from kuhar December 2, 2024 18:58
@bjacob bjacob marked this pull request as ready for review December 2, 2024 18:58
@bjacob bjacob requested a review from raikonenfnu December 2, 2024 19:09
@kuhar kuhar (Member) left a comment

LGTM % nits

@raikonenfnu raikonenfnu (Collaborator) left a comment

Overall, the refactoring looks great! Just a minor question.

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
@bjacob bjacob force-pushed the users/bjacob/execobj branch from 402b9c1 to ea29fad Compare December 3, 2024 20:24
@bjacob bjacob requested a review from ScottTodd as a code owner December 3, 2024 20:24
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
@bjacob bjacob requested a review from benvanik December 3, 2024 20:28
@bjacob bjacob merged commit cbb11f2 into main Dec 3, 2024
41 checks passed
@bjacob bjacob deleted the users/bjacob/execobj branch December 3, 2024 20:50
giacs-epic pushed a commit to giacs-epic/iree that referenced this pull request Dec 4, 2024
…o ukernels. (iree-org#19323)

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
Signed-off-by: Giacomo Serafini <179146510+giacs-epic@users.noreply.github.com>
4 participants