@avik-pal (Collaborator) commented Dec 6, 2025

No description provided.

@avik-pal avik-pal force-pushed the ap/stablehlo_raising branch from 02ecfaa to 7f2a3d2 Compare December 6, 2025 18:06
@avik-pal avik-pal changed the title docs: add tutorial on raising loops [skip ci] docs: add tutorial on raising Dec 6, 2025
@avik-pal (Collaborator, Author) commented Dec 7, 2025

needs EnzymeAD/Enzyme-JAX#1668 + corresponding jll


## Raising GPU Kernels

<!-- TODO: write this section -->
Collaborator

I have already written this section and can complete it with a tutorial: 5fe6e01

Member

Yeah, for ease, @Pangoraw, mind just pushing that commit to this branch?

@avik-pal avik-pal force-pushed the ap/stablehlo_raising branch from 4624977 to 5a47dd6 Compare December 12, 2025 04:02
- Running the raised compute kernel on hardware that the original kernel was not designed for (_e.g._ running a CUDA kernel on a TPU).
- Enabling further optimizations: since the raised kernel is now indistinguishable from the rest of the program, it can be optimized together with it. For example, two sequential kernel launches, where one operates on the result of the other, can be fused if both are raised, resulting in a single kernel launch in the final optimized StableHLO program.
- Lastly, automatic differentiation in Reactant is currently not supported for GPU kernels. Raising kernels enables Enzyme to differentiate the raised kernel. For this to work, one must use the `raise_first` compilation option to make sure the kernels are raised before Enzyme performs automatic differentiation on the program.
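
The fusion point in the second bullet can be illustrated with a small conceptual sketch. This is plain Python, not Reactant, Enzyme, or StableHLO code: the `launch` helper, the `square` and `add_one` "kernels", and the launch counter are all hypothetical stand-ins chosen for illustration. It only shows why merging two sequential elementwise kernels into one preserves the result while halving the number of launches.

```python
# Conceptual sketch only: plain Python stand-ins for GPU kernels.
# None of these names come from Reactant or StableHLO.

launches = 0  # counts simulated kernel launches


def launch(kernel, data):
    """Simulate one kernel launch: apply `kernel` to every element."""
    global launches
    launches += 1
    return [kernel(v) for v in data]


def square(v):
    # First hypothetical elementwise kernel.
    return v * v


def add_one(v):
    # Second hypothetical elementwise kernel, consuming the first's output.
    return v + 1


def fused(v):
    # What an optimizer could produce once both kernels are visible to it.
    return add_one(square(v))


data = [1.0, 2.0, 3.0]

# Two sequential launches: the second operates on the result of the first.
unfused_result = launch(add_one, launch(square, data))
unfused_launches = launches

# One launch of the fused kernel.
launches = 0
fused_result = launch(fused, data)
fused_launches = launches
```

In this sketch both paths compute the same values, but the fused path needs one launch instead of two and never materializes the intermediate array. In the actual pipeline this kind of merging would be done by the compiler on the raised program rather than by hand.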

Member

Could we also add an example here, so folks don't get confused and think the scalar loop examples are part of GPU kernel raising?

@avik-pal avik-pal marked this pull request as ready for review December 12, 2025 04:14
@avik-pal avik-pal force-pushed the ap/stablehlo_raising branch from 739f1bf to db94b19 Compare December 12, 2025 04:14

4 participants