Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ESIMD] Add set_kernel_properties API and use_double_grf property. #6182

Merged
merged 15 commits into from
Jun 17, 2022

Conversation

kbobrovs
Copy link
Contributor

@kbobrovs kbobrovs commented May 23, 2022

This patch:

  1. Adds esimd::set_kernel_properties API with the single supported property
    esimd::kernel_properties::use_double_grf, which lets compiler know that
    the calling kernel needs run in "double GRF" mode - more registers per
    thread at the expense of fewer H/W threads.
    This is temporary API until generic SYCL support for kernel properties
    is implemented:
    https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc
  2. Provides "lowering" of this API by the new LowerESIMDKernelProps pass, which marks such
    kernels with "esimd-double-grf" function attribute, and invoke it from the sycl-post-link as a part
    of ESIMD lowering.
  3. Implements new "dimension" of device code splitting in sycl-post-link:
    functions with and without "esimd-double-grf" attribute go to different
    modules. Device binary images resulting from "double-grf" modules are
    assigned the "isDoubleGRFEsimdImage" property
  4. Updates runtime to add "-doubleGRF" option when JITting SPIRV binaries
    with the "isDoubleGRFEsimdImage" property.
  5. Fixes sycl-post-link bug in ModuleSplitter.cpp:extractSubModule, where
    Function objects in the entry point list were not replaced with new
    Function objects in the cloned Module. This lead to corrupted symbol file in
    some cases.
  6. Misc refactoring:
  • factor out call graph traversal from LowerESIMD into ESIMDUtils to use from multiple sources
  • fix entry group and module properties handling in the light of multi-dimensional splitting
  • improve internal interfaces to pass single ModuleDesc instead of Module + entry points + properties
  • limit entry points of a ModuleDesc to the ModuleDesc it was split from

AOT compilation support is TBD.
Complementary E2E test PR intel/llvm-test-suite#1033

Signed-off-by: Konstantin S Bobrovsky konstantin.s.bobrovsky@intel.com

@kbobrovs kbobrovs requested review from a team as code owners May 23, 2022 05:02
@kbobrovs kbobrovs requested a review from againull May 23, 2022 05:02
@kbobrovs
Copy link
Contributor Author

@gmlueck, could you please review the interface part - https://github.com/intel/llvm/pull/6182/files#diff-7efdaf033502de5f8cc1ae48436f1f8b86b3b6f6ee5a6484c619053cb4753207 (review of other parts would also be appreciated).

// 3) This code (or the code in FE) must verify that slm_init or other such
// intrinsic is not called from another module because kernels in that other
// module would not get updated meta data attributes.
struct UpdateUint64MetaDataToMaxValue {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactors the original updateGenXMDNodes:

  1. Call graph traversal is factored out into traverseCallgraphUp above. This functor represents call graph action.
  2. It is slightly optimized to pre-select candidate nodes for fewer actions in the node action function.

This patch:
1) Adds esimd::set_kernel_properties API with the single supported property
   esimd::kernel_properties::use_double_grf, which lets compiler know that
   the calling kernel needs run in "double GRF" mode - more registers per
   thread at the expense of fewer H/W threads.
   This is temporary API until generic SYCL support for kernel properties
   is implemented:
   https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc
2) Provides "lowering" of this API by the LowerESIMD.cpp, which marks such
   kernels with "esimd-double-grf" function attribute.
3) Implements new "dimension" of device code splitting in sycl-post-link:
   functions with and without "esimd-double-grf" attribute go to different
   modules. Device binary images resulting from "double-grf" modules are
   assigned the "isDoubleGRFEsimdImage" property
4) Updates runtime to add "-doubleGRF" option when JITting SPIRV binaries
   with the "isDoubleGRFEsimdImage" property.
5) Fixes sycl-post-link bug in ModuleSplitter.cpp:extractSubModule, where
   Function objects in the entry point list were not replaced with new
   Function objects in the cloned Module. This lead to corrupted symbol file in
   some cases.

Signed-off-by: Konstantin S Bobrovsky <konstantin.s.bobrovsky@intel.com>
// https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc

template <class... KernelProps>
void set_kernel_properties(KernelProps... props) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no extension document for this API, I assume it is an unsupported API that we can remove without any deprecation period. Is that your intent?

What is the __ESIMD_ENS namespace? Will users know that APIs in this namespace are unsupported? Should we instead define the API in a namespace named experimental?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no extension document for this API, I assume it is an unsupported API that we can remove without any deprecation period. Is that your intent?

We usually allow some deprecation period for ESIMD APIs being removed for users to adapt, even though it is experimental.

What is the __ESIMD_ENS namespace?

this is an alias for sycl::ext::intel::experimental::esimd

Will users know that APIs in this namespace are unsupported?

This is a good question. The main source of information about particular APIs for users should be the API docs doxygen https://intel.github.io/llvm-docs/doxygen/group__sycl__esimd.html. But it turns out it does not expand aliases, I we need to to replace aliases with normal nested namespace declaration style. Thanks for bringing this up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmlueck, this is fixed in #6197 (doxygen has an option to expand)

module_split::ModuleDesc ResMDesc = MSplit->nextSplit();
Module &ResM = ResMDesc.getModule();

bool SpecConstsMet = processSpecConstants(ResM);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part (lines 575-624) is now nested inside iteration over "double GRF" split w/o modifications (only renaming variables) - new lines 592-643

template <class... KernelProps>
void set_kernel_properties(KernelProps... props) {
// TODO check for duplicates
using Props = __MP11_NS::mp_list<KernelProps...>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the first time we use boost to implement something in ESIMD?
If Yes, then it introduces the dependency on it, which probably, is not good without a good reason. IMO, it would be better to re-write this code without boost to not introduce that dependency for such simple thing as walk through variadic pack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Boost was imported quite a while ago to facilitate invoke_simd implementation and avoid spending resources on what boost already does. This is believed to be major productivity enhancement for SYCL API development. So this is first time for ESIMD, but not the first time for SYCL, so no new dependence is introduced. Note that boosl::mp11 appears as sycl::detail::boost::mp11 in SYCL API sources (it is imported and refactored automatically during the build)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #5791

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to what @v-klochkov suggested. Pure C++ solution looks simple enough: https://godbolt.org/z/fxv4jxPbb and if doesn't satisfy what needs to be done, then a comment explaining what and why would be very desirable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think pure C++ solution should be chosen over mp11-based one just because it does not use mp11. Code clarity and maintainability is more important here, and mp11 makes it better - list + for_each is the most clear possible, yet adding new features to the code like duplicate detection would be way easier with mp11 as well.

In general, I believe it is much more practical and safer to use mp11 for all routine tasks where SYCL headers conceptually use templates as usual data structures as in this case, even though in some cases simpler C++ is possible.

Adding @rolandschulz for possible opinion on mp11 usage direction.

Copy link
Contributor

@v-klochkov v-klochkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
I believe this PR will create conflicts with the other one that splits SYCL and ESIMD call-graphs, and thus require conflicts resolution + re-review/approval after that.

llvm/lib/SYCLLowerIR/ESIMD/LowerESIMD.cpp Outdated Show resolved Hide resolved
llvm/lib/SYCLLowerIR/ESIMD/LowerESIMD.cpp Outdated Show resolved Hide resolved
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Jun 1, 2022

@asudarsa, @v-klochkov, @againull - please review

v-klochkov
v-klochkov previously approved these changes Jun 1, 2022
@kbobrovs kbobrovs requested a review from v-klochkov June 9, 2022 09:16
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Jun 9, 2022

@v-klochkov, @asudarsa, @againull - please review

v-klochkov
v-klochkov previously approved these changes Jun 9, 2022
@asudarsa
Copy link
Contributor

No major issues. But it will be great if comments can be addressed. thanks

Copy link
Contributor

@againull againull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very sorry for delayed review:
sycl/source/detail/program_manager/program_manager.cpp looks good to me.

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl-post-link and other tools related changes look ok to me. Thanks

@againull againull merged commit 9a55da5 into intel:sycl Jun 17, 2022
@kbobrovs kbobrovs deleted the double_grf branch July 18, 2022 03:40
steffenlarsen pushed a commit that referenced this pull request Oct 13, 2022
This change extends Konst's work from
#6182 to work for any SYCL kernel, not
just ESIMD kernels

Basic summary of changes:
1) Move SYCL library set_kernel_properties function and related detail
header out of esimd code into generic SYCL code
2) Generalize SYCLLowerESIMDKernelPropsPass to make it work for SYCL
kernels
3) Change sycl-post-link module splitting to split non-ESIMD modules
that have any number of double GRF kernels
4) Change program loader to add the "-ze-opt-large-register-file" option
if the double GRF property is set. We do this instead of -doubleGRF
because -doubleGRF only works for the VC backend, while
-ze-opt-large-register-file works for both VC and scalar backends

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants