[DAE][SYCL] Enable DAE in SYCL kernel functions #2226

DenisBakhvalov · 2020-07-30T23:19:26Z

We allow eliminating dead arguments in SYCL kernel functions
even if they have external linkage.

This patch also updates information about kernel arguments
in the integration header file.

DenisBakhvalov · 2020-07-30T23:27:30Z

I think we can ignore the clang-format-check failures: applying the suggested change would break the existing formatting of the surrounding lines, and the result would look odd:

        (void) llvm::createDeadArgEliminationPass();
 -      (void) llvm::createDeadArgEliminationSYCLPass();
 +      (void)llvm::createDeadArgEliminationSYCLPass();
        (void) llvm::createDeadCodeEliminationPass();

clin111 · 2020-07-31T01:57:57Z

llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp

+//   TODO: batch changes to multiple SYCL kernels and do one bulk update.
+constexpr StringLiteral OMIT_TABLE_BEGIN("// OMIT_TABLE_BEGIN");
+constexpr StringLiteral OMIT_TABLE_END("// OMIT_TABLE_END");
+static void updateIntegrationHeader(StringRef SyclKernelName,


Looks OK...hopefully the header file is short and/or the number of kernels is small, as we are writing the whole file on each update. Otherwise we have to go with a "F F F F" kind of table that can be modified in-place without changing the size.

I don't think it will make a difference in practice. But I'm working on a patch for doing a bulk update of int-header. I would like to do this change incrementally, i.e. in a separate patch.
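
To make the in-place alternative concrete, here is a minimal sketch of the fixed-width idea. It assumes a hypothetical table layout in which each kernel argument is a single character ('F' for kept, 'T' for omitted) separated by spaces on the line after the begin marker. That layout and the helper name `markArgOmitted` are illustrative assumptions, not the format this patch emits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical layout (an assumption for illustration only):
//   // OMIT_TABLE_BEGIN
//   F F F F
//   // OMIT_TABLE_END
// One flag per kernel argument: 'F' = kept, 'T' = omitted.
// Flipping a flag overwrites exactly one character, so the size of the
// header never changes and nothing after the table has to be rewritten.
bool markArgOmitted(std::string &Header, std::size_t ArgIdx) {
  const std::string Begin = "// OMIT_TABLE_BEGIN\n";
  const std::string End = "// OMIT_TABLE_END";

  std::size_t TableStart = Header.find(Begin);
  if (TableStart == std::string::npos)
    return false;
  TableStart += Begin.size();

  std::size_t TableEnd = Header.find(End, TableStart);
  if (TableEnd == std::string::npos)
    return false;

  // Each entry occupies two characters ("F "), so entry i starts at 2 * i.
  std::size_t Pos = TableStart + 2 * ArgIdx;
  if (Pos >= TableEnd || (Header[Pos] != 'F' && Header[Pos] != 'T'))
    return false;

  Header[Pos] = 'T';
  return true;
}
```

The sketch edits an in-memory string; against a real file the same offset could be used to seek and overwrite a single byte instead of rewriting the whole header.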

llvm/include/llvm/Transforms/IPO/DeadArgumentElimination.h

llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp

bader · 2020-08-03T11:10:45Z

llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp

+  if (!IntHeaderBuffer)
+    report_fatal_error("unable to read integration header file '" +
+                       IntegrationHeaderFileName +
+                       "': " + IntHeaderBuffer.getError().message());


Are you going to replace this with assert(s)?
Another option is to use LLVM_DEBUG and just skip the optimization.
A hard fail like this is probably not the best approach for an optimization.

I think it is a hard fail. If we did not update int header it will result in a later runtime fail.

If we did not update int header it will result in a later runtime fail.

If we keep all arguments, it should be okay. I mean if anything is wrong with pre-requisites, this pass can just do nothing instead of crashing and it won't break anything.

Introduced an early exit if IntegrationHeaderFileName is not provided.

Great. I think we can replace all error checking with asserts in this file.
This way we can "hard fail" on builds with assertions enabled, and there is no overhead for internal consistency checking on builds without assertions.

if (!<cond>) report_fatal_error(<msg>);

->

assert(<cond> && <msg>);

I don't think we should do thorough runtime checking of the integration header format. This file is auto-generated by the compiler, so it should be validated in the scope of #2236. We can validate that the file format is correct with asserts.

Feel free to address this in a separate PR.
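
For reference, the `assert(<cond> && <msg>)` form works because a string literal is a non-null pointer and therefore always converts to true; the message only appears in the failure text. Below is a minimal sketch that combines this idiom with the early exit discussed above. The function `tryEliminateKernelArgs` and its parameters are hypothetical stand-ins, not the code from this patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the pass entry point (not the actual LLVM code).
// Returns true if the optimization ran, false if it was skipped.
bool tryEliminateKernelArgs(const std::string &IntegrationHeaderFileName,
                            bool HeaderHasOmitTable) {
  // Missing prerequisite: keep all arguments and do nothing, which is
  // always safe for an optimization.
  if (IntegrationHeaderFileName.empty())
    return false;

  // Internal consistency check. It aborts with the message in builds with
  // assertions enabled and compiles to nothing when NDEBUG is defined.
  assert(HeaderHasOmitTable && "integration header has no omit table");
  (void)HeaderHasOmitTable; // avoid an unused-parameter warning under NDEBUG

  // ... the argument elimination itself would go here ...
  return true;
}
```

Calling `tryEliminateKernelArgs("", true)` skips the work and returns false, while a non-empty file name with a well-formed header returns true.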

llvm/test/Transforms/DeadArgElim/sycl-kernels-neg1.ll

llvm/test/Transforms/DeadArgElim/sycl-kernels-neg2.ll

llvm/test/Transforms/DeadArgElim/sycl-kernels.ll

bader · 2020-08-04T07:42:59Z

keryell · 2020-08-04T09:11:45Z

I am not sure I understand "Renamed SCYL kernels to SPIR kernels" (ab6f272).
While the tests are using SPIR, the code does not seem to be very specific to SPIR. Does this apply to the CUDA PTX back-end or to plain LLVM IR kernels, for example?

bader · 2020-08-04T09:37:10Z

While the tests are using SPIR, the code does not seem to be very specific to SPIR. Does this apply for CUDA PTX back-end or plain LLVM IR kernels for example?

This pass is applied to functions with the SPIR_KERNEL calling convention when the module triple contains sycldevice: https://github.com/intel/llvm/blob/sycl/llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp#L577

If you have better wording, we are open to changes.

keryell · 2020-08-04T10:21:45Z

Why would you not apply this optimization on any kind of SYCL kernels?
I am not familiar with the PTX compiler flow. Is it using SPIR internally and the triple is changed later by some LLVM passes after applying this DAE optimization?
Or is this a way to slow down non-SPIR devices compared to SPIR devices? :-)

bader · 2020-08-04T13:22:29Z

Why would you not apply this optimization on any kind of SYCL kernels?
I am not familiar with the PTX compiler flow. Is it using SPIR internally and the triple is changed later by some LLVM passes after applying this DAE optimization?

AFAIK, PTX compiler flow is also using "sycldevice" environment component, but I'm not sure about spir_kernel calling convention. Tagging @Naghasan. @keryell, what do you suggest using for "any kind of SYCL kernels" detection?

Or is this a way to slow down non-SPIR devices compared to SPIR devices? :-)

No. :-)

Naghasan · 2020-08-04T14:21:33Z

Why would you not apply this optimization on any kind of SYCL kernels?
I am not familiar with the PTX compiler flow. Is it using SPIR internally and the triple is changed later by some LLVM passes after applying this DAE optimization?

AFAIK, PTX compiler flow is also using "sycldevice" environment component

It does; it is used to trigger some SYCL-specific transformations (local memory support and global offset).

but I'm not sure about spir_kernel calling convention. Tagging @Naghasan. @keryell, what do you suggest using for "any kind of SYCL kernels" detection?

Hitting a nerve here :-) AFAIK, only SPIR uses a calling convention to identify a kernel entry point (alongside metadata). For NVPTX, entry points are identified by metadata https://llvm.org/docs/NVPTXUsage.html#kernel-metadata and I think they are required to have the default CC (but don't quote me on this). There is a ptx_kernel CC, but it is not used much any more.

So I think this:

 bool FuncIsSpirKernel =
      CheckSpirKernels &&
      StringRef(F.getParent()->getTargetTriple()).contains("sycldevice") &&
      F.getCallingConv() == CallingConv::SPIR_KERNEL;

Could be turned into something like:

 bool FuncIsSYCLKernel =
      CheckSpirKernels &&
      StringRef(F.getParent()->getTargetTriple()).contains("sycldevice") &&
      isKernelFunction(F);

where isKernelFunction implements the logic for each known backend. AFAIK there is no existing function that does this generically.
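
One possible shape for such a helper is sketched below. It uses simplified stand-in types rather than LLVM's own, so `Backend`, `CallConv`, `KernelCandidate`, and this `isKernelFunction` are all hypothetical; a real implementation would query `llvm::Function` and the module's `!nvvm.annotations` metadata instead. Accepting the ptx_kernel calling convention as well as the metadata is an assumption made for the sketch.

```cpp
#include <cassert>

// Simplified stand-ins for LLVM's types (assumptions for illustration).
enum class CallConv { C, SpirKernel, PtxKernel };
enum class Backend { Spir, Nvptx, Other };

struct KernelCandidate {
  Backend Target;
  CallConv CC;
  // Models a "kernel" entry for this function in !nvvm.annotations.
  bool HasNvvmKernelAnnotation;
};

// Hypothetical helper: decide per backend whether F is a kernel entry point.
bool isKernelFunction(const KernelCandidate &F) {
  switch (F.Target) {
  case Backend::Spir:
    // SPIR marks kernel entry points with a dedicated calling convention.
    return F.CC == CallConv::SpirKernel;
  case Backend::Nvptx:
    // NVPTX marks kernels with metadata; the older ptx_kernel calling
    // convention is accepted here too.
    return F.HasNvvmKernelAnnotation || F.CC == CallConv::PtxKernel;
  case Backend::Other:
    // Unknown backend: be conservative and leave the function untouched.
    return false;
  }
  return false;
}
```

Returning false for unknown backends keeps the pass conservative: a function that is not recognized as a kernel simply keeps all of its arguments.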

Or is this a way to slow down non-SPIR devices compared to SPIR devices? :-)

No. :-)

I'm sure it is :-)

[DAE][SYCL] Enable DAE in SYCL kernel functions

39e166d

DenisBakhvalov requested a review from bader as a code owner July 30, 2020 23:19

DenisBakhvalov requested review from erichkeane and kbobrovs July 30, 2020 23:20

clin111 reviewed Jul 31, 2020

erichkeane mentioned this pull request Jul 31, 2020

[SYCL] Enable parameter optimization for SYCL kernels #2236

Closed

bader reviewed Aug 3, 2020

DenisBakhvalov added 5 commits August 3, 2020 12:30

Skipped using temporary directory

18e7bd0

Fixed minor code review comment

5b0c754

Skip DAE if the integration header is not specified

e30b2bf

Renamed SCYL kernels to SPIR kernels

ab6f272

Fixed tests

bcbbbb1

DenisBakhvalov requested a review from bader August 3, 2020 21:57

bader approved these changes Aug 4, 2020

bader merged commit 0f33f7a into intel:sycl Aug 4, 2020

bader mentioned this pull request Aug 5, 2020

[DAE][SYCL] Emit MD instead of updating integration header #2258

Merged

bader mentioned this pull request Aug 22, 2020

[SYCL] Enable Dead Kernel Argument Elimination for non-SPIR target #2359

Closed
