Skip to content

Commit 4624977

Browse files
authored
Update raising.md
1 parent 7f2a3d2 commit 4624977

File tree

1 file changed

+10
-1
lines changed

1 file changed

+10
-1
lines changed

docs/src/tutorials/raising.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,16 @@
22

33
## Raising GPU Kernels
44

5-
<!-- TODO: write this section -->
5+
Kernel raising refer to Reactant's ability to transform a program written in a GPU kernel style. That is, kernel functions which are evaluated in a grid of blocks and threads where operations are done at the scalar level. The transformation raises the program to a tensor style function (in the StableHLO dialect) where operations are broadcasted.
6+
7+
This transformation enables several features:
8+
9+
- Running the raised compute kernel on hardware where the original kernel was not designed to run on (_i.e._ running a CUDA kernel on a TPU).
10+
- Enabling further optimizations, since the raised kernel is now indiscernible from the rest of the program, it can be optimized with it. For example, two sequential kernel launches operating on the result of each others can be fused if they are both raised. Resulting in a single kernel launch, in the final optimized StableHLO program.
11+
- Lastly, automatic-differentiation in Reactant is currently not supported for GPU kernels. Raising kernels enables Enzyme to differentiate the raised kernel. For this to function, one must use the `raise_first` compilation option to make sure the kernel are raised before Enzyme performs automatic-differentiation on the program.
12+
13+
!!! note
14+
Not all classes of kernels are currently raisable to StableHLO. If your kernel encounters an error while being raised, please open an issue on [the Reactant.jl repository](https://github.com/EnzymeAD/Reactant.jl/issues/new?labels=raising).
615

716
## Raising Scalar Loops to Tensor IR
817

0 commit comments

Comments
 (0)