-
Notifications
You must be signed in to change notification settings - Fork 794
[SYCL][Fusion] API for kernel fusion library #7465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
be9e0fa to
0ff42c4
Compare
|
The design document for the overall implementation of kernel fusion can be found here. |
deebaf4 to
523c851
Compare
|
Since this is a new library we may need to determine who would be the best reviewers going forward for these. I'm going to add a few people. |
steffenlarsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple initial comments. I think I'll need a good cup of coffee for this one, so I will return in a couple of days.
sycl-fusion/common/include/Kernel.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to stay in sync with the definition in kernel_desc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, only Accessor (for internalization) and StdLayout (for constant propagation) receive special treatment by the JIT compilation process. The interface logic to this library in the SYCL runtime (part of a later PR) uses the SYCL RT internal ArgDesc directly, so additional members for this enum in kernel_desc would not cause errors in the JIT.
AlexeySachkov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a brief review, I haven't been able to grasp it in details yet
sycl-fusion/jit-compiler/lib/translation/SPIRVLLVMTranslation.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that there is an ongoing effort to allow less than 3 operands in reqd_work_group_size metadata, see #7450 (@steffenlarsen can link PRs to other components)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, I wasn't aware of that. We should be able to adapt our processing of the attributes once that PR is merged, to match the new behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this particular case KhronosGroup/SPIRV-LLVM-Translator#1726 is also relevant. Note that in the reverse-translator case you will always get 3 operands back, but I believe you also care about generated LLVM IR that hasn't gone through the translator, in which case you may need to adjust for optional parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint!
Right now, we only care about consuming SPIR-V. Untranslated LLVM IR will become relevant for targets that the DPC++ frontend currently does not compile to SPIR-V, e.g., targeting CUDA or AMD GPUs. Will keep this in mind for when we work on that.
sycl-fusion/jit-compiler/lib/translation/SPIRVLLVMTranslation.cpp
Outdated
Show resolved
Hide resolved
sycl-fusion/jit-compiler/lib/translation/SPIRVLLVMTranslation.cpp
Outdated
Show resolved
Hide resolved
First components for the JIT compiler used for SYCL kernel fusion: * API definitions * Input translation from SPIR-V to LLVM IR * Insertion of fused kernel function stub and metadata * Supporting infrastructure such as compiler options. Co-authored-by: Lukas Sommer <lukas.sommer@codeplay.com> Co-authored-by: Victor Perez <victor.perez@codeplay.com> Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
523c851 to
962c922
Compare
|
Thank you for the first reviews @steffenlarsen and @AlexeySachkov!
@pvchupin: I've added the Codeplay team that has been working on this implementation/library to |
steffenlarsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple more comments, but I think it looks good overall.
Note that the comment about the use of auto should probably also apply to a few other files. It doesn't have to be all uses of auto, for example X::create should be clear enough about the type being X, but for those that aren't necessarily clear about exactly what type it uses would be nice to have explicit.
Also, what are the plans for testing this? I realize much of this is just the skeleton with more to come, but is there any of this that might make sense to unit-test now or does it make more sense to wait?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this particular case KhronosGroup/SPIRV-LLVM-Translator#1726 is also relevant. Note that in the reverse-translator case you will always get 3 operands back, but I believe you also care about generated LLVM IR that hasn't gone through the translator, in which case you may need to adjust for optional parameters.
Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
|
Thanks again for the feedback @steffenlarsen.
The
|
steffenlarsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @sommerlukas! LGTM!
|
Thanks for the approvals! Can either of you please merge this PR, I'm not authorized to merge this, @steffenlarsen @pvchupin? |
This is the third patch in a series of patches to add an implementation of the [kernel fusion extension](#7098). We have split the implementation into multiple patches to make them more easy to review. This patch integrates the kernel fusion extension into the SYCL runtime scheduler. Next to collecting the kernels submitted while in fusion mode in the fusion list associated with the queue, the integration into the scheduler is also responsible for detecting the synchronization scenarios. Various scenarios, such as buffer destruction or event wait, require fusion to be aborted early. The full list of scenarios is available in the [extension proposal](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_codeplay_kernel_fusion.asciidoc#synchronization-in-the-sycl-application). A high-level description of the integration into the scheduler can be found in the [design document](#7204). This PR can be reviewed and merged independently of #7465. Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com> Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
| SPIRVBinary &emplaceSPIRVBinary(std::string Binary); | ||
|
|
||
| private: | ||
| // FIXME: Change this to std::shared_mutex after switching to C++17. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have switched to C++17 as a minimal required version mid 2022.
This is the second patch in a series of patches to add an implementation of thekernel fusion extension. We have split the implementation into multiple patches to make them more easy to review.
This patch can be reviewed and merged independently of #7416.
This patch adds the first components for the JIT compiler used for implementation of kernel fusion at runtime , concretely:
CMake logic to link and code to invoke this JIT from the SYCL runtime will follow in a later patch.
Co-authored-by: Lukas Sommer lukas.sommer@codeplay.com
Co-authored-by: Victor Perez victor.perez@codeplay.com
Signed-off-by: Lukas Sommer lukas.sommer@codeplay.com