@vinser52
Contributor

This PR is a prerequisite for the handler-less API.
Kernel-specific data and argument parsing logic are moved from the handler_impl to the new KernelData class that will be used in a handler-less path.

auto *DynParamImpl = static_cast<
    ext::oneapi::experimental::detail::dynamic_parameter_impl *>(Ptr);

MDynamicParameters.emplace_back(DynParamImpl, Index + IndexShift);
Contributor Author

There are two functional changes in this PR (this line and line 241). Before these changes, the handler::registerDynamicParameter method was used to add a dynamic parameter. That method has two checks that might throw exceptions. Since the processArg function has moved from handler to KernelData, we can no longer call handler::registerDynamicParameter. I also cannot move registerDynamicParameter to the KernelData class because it needs access to the Queue and Graph from the handler.
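
To illustrate the trade-off, here is a minimal sketch of what recording a dynamic parameter looks like once it lives in a class that has no queue or graph to validate against. This is not the code from this PR: `KernelDataSketch`, `DynamicParameterImplSketch`, and `addDynamicParameter` are hypothetical names, and the real checks in handler::registerDynamicParameter are deliberately left out.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for
// ext::oneapi::experimental::detail::dynamic_parameter_impl.
struct DynamicParameterImplSketch {};

// Hypothetical, simplified KernelData. It only remembers which argument index
// each dynamic parameter is bound to. It has no queue or graph, so it cannot
// run the validation that the handler-side method performs.
class KernelDataSketch {
public:
  using DynamicParametersVecT =
      std::vector<std::pair<DynamicParameterImplSketch *, int>>;

  void addDynamicParameter(void *Ptr, int Index, int IndexShift) {
    auto *DynParamImpl = static_cast<DynamicParameterImplSketch *>(Ptr);
    MDynamicParameters.emplace_back(DynParamImpl, Index + IndexShift);
  }

  const DynamicParametersVecT &dynamicParameters() const {
    return MDynamicParameters;
  }

private:
  DynamicParametersVecT MDynamicParameters;
};
```

In this sketch any validation would have to happen in the caller (the handler side) before or after the parameter is recorded.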

@vinser52 vinser52 changed the title from "[SYCL] Handler refactoring" to "[SYCL] Move Kernel specific data from handler_impl to a separate data structure" Aug 20, 2025
Contributor

@aelovikov-intel aelovikov-intel left a comment

I was thinking about something like this: https://godbolt.org/z/jjG5zsb5z

@vinser52 , @sergey-semenov , WDYT?

The link above introduces

struct KernelData { /* information about device kernel */ };
struct SubmissionInfo { /* information about what's being submitted, including KernelData */ };

I think there will be a need for at least one other data structure to pass from handler to scheduler that will be encapsulating SubmissionInfo along with some other information filled in handler/no-handler submission path before reaching either scheduler or direct UR enqueue APIs.
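
As a rough sketch of the layering described above (this is not the content of the godbolt link, and every field and name here is an assumption made for illustration), the third structure could simply wrap the submission info together with whatever the submission path fills in:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: information about the device kernel.
struct KernelDataSketch {
  std::string KernelName;
};

// Hypothetical: information about what is being submitted, including the
// kernel data.
struct SubmissionInfoSketch {
  KernelDataSketch Kernel;
  std::size_t GlobalSize = 1;
};

// Hypothetical third structure: the submission info plus extra state filled in
// by the handler / no-handler path before reaching the scheduler or a direct
// enqueue.
struct SubmissionPacketSketch {
  SubmissionInfoSketch Info;
  std::vector<int> DependsOn; // placeholder for dependency events
  bool UseScheduler = false;
};

inline SubmissionPacketSketch makePacket(SubmissionInfoSketch Info,
                                         std::vector<int> DependsOn) {
  SubmissionPacketSketch Packet;
  Packet.Info = std::move(Info);
  // Placeholder policy, for illustration only: go through the scheduler
  // whenever there are dependencies to track.
  Packet.UseScheduler = !DependsOn.empty();
  Packet.DependsOn = std::move(DependsOn);
  return Packet;
}
```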

@vinser52
Contributor Author

vinser52 commented Aug 21, 2025

> I was thinking about something like this: https://godbolt.org/z/jjG5zsb5z
>
> @vinser52 , @sergey-semenov , WDYT?
>
> The link above introduces
>
> struct KernelData { /* information about device kernel */ };
> struct SubmissionInfo { /* information about what's being submitted, including KernelData */ };
>
> I think there will be a need for at least one other data structure to pass from handler to scheduler that will be encapsulating SubmissionInfo along with some other information filled in handler/no-handler submission path before reaching either scheduler or direct UR enqueue APIs.

In general, I have something similar in mind. We have compile-time data that is captured in the headers and passed to libsycl.so, and runtime (RT) data that can be split into two parts:

  1. Data that does not change between kernel invocations/submissions. Today we store this data in the KernelNameBasedCache.
  2. Data that is specific to each particular invocation, like MArgs (we need to parse them on every submission).

My idea was that KernelData, which I introduced, would store all kernel-related data.
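
A minimal sketch of that split, with made-up type and field names (none of these are the real runtime types): compile-time data sits inside a per-kernel cache entry that is shared across submissions, while each submission carries its own argument list and only points at the cached part.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical: compile-time data captured in the headers.
struct CompileTimeDataSketch {
  std::string KernelName;
  unsigned NumParams = 0;
};

// Hypothetical: runtime data that does not change between submissions
// (the role the KernelNameBasedCache plays today).
struct PerKernelCacheSketch {
  CompileTimeDataSketch CompileTime;
  bool ProgramBuilt = false; // placeholder for cached runtime state
};

// Hypothetical: data specific to one invocation, such as the parsed arguments.
struct PerSubmissionDataSketch {
  PerKernelCacheSketch *Cache = nullptr;
  std::vector<const void *> Args;
};

inline PerSubmissionDataSketch
makeSubmission(PerKernelCacheSketch &Cache, std::vector<const void *> Args) {
  PerSubmissionDataSketch Submission;
  Submission.Cache = &Cache;           // shared, long-lived part
  Submission.Args = std::move(Args);   // rebuilt on every submission
  return Submission;
}
```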

@vinser52 vinser52 force-pushed the handler_refactoring branch from d31cf35 to b479d9b Compare August 21, 2025 16:02
@vinser52 vinser52 force-pushed the handler_refactoring branch from b479d9b to 4c3b8fa Compare August 22, 2025 10:40
@vinser52 vinser52 force-pushed the handler_refactoring branch from 4c3b8fa to a1c85b2 Compare August 28, 2025 15:34
    const CompileTimeKernelInfoTy &Info) {
  if (!isCompileTimeInfoSet())
    CompileTimeKernelInfoTy::operator=(Info);
  assert(isCompileTimeInfoSet());
Contributor Author

@sergey-semenov I removed this assert because when the app was compiled with the 6.3 compiler, the kernel size is not passed from the headers to the runtime.

Contributor Author

After discussion with @sergey-semenov, we decided to keep this assert in the preview mode.
A corresponding comment has been added.
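
For readers following along, the agreement could look roughly like the sketch below. It is not the code from this PR: the `Sketch` type names are hypothetical, and the rule that `isCompileTimeInfoSet()` means "kernel size is non-zero" is an assumption made only so the example has something to check.

```cpp
#include <cassert>

// Hypothetical stand-in for CompileTimeKernelInfoTy.
struct CompileTimeKernelInfoSketch {
  const char *Name = nullptr;
  long KernelSize = 0; // assumed to stay 0 when older headers do not pass it
};

class DeviceKernelInfoSketch : public CompileTimeKernelInfoSketch {
public:
  void setCompileTimeInfo(const CompileTimeKernelInfoSketch &Info) {
    if (!isCompileTimeInfoSet())
      CompileTimeKernelInfoSketch::operator=(Info);
#ifdef __INTEL_PREVIEW_BREAKING_CHANGES
    // In preview mode only new headers are supported, so the info must be
    // complete here. In non-preview builds older headers may omit the size.
    assert(isCompileTimeInfoSet());
#endif
  }

  // Assumption for this sketch: "set" means a non-zero kernel size was seen.
  bool isCompileTimeInfoSet() const { return KernelSize != 0; }
};
```

With the macro undefined, passing incomplete info is tolerated and a later call with complete info fills it in; once set, further calls do not overwrite it.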

Contributor

This deserves more explanation (probably in the PR description). New KernelData is under source/detail and doesn't cross ABI boundary. What are we changing here that has ABI effects and why can't that be done in a separate PR?

Contributor Author

But I added a comment in place. The issue happens only with the 6.3 or 6.2 headers. It has nothing to do with the ABI.

Contributor

@reble reble left a comment

Graph part LGTM

@github-actions
Contributor

@intel/llvm-gatekeepers please consider merging

@aelovikov-intel aelovikov-intel marked this pull request as draft September 15, 2025 15:22
@aelovikov-intel
Contributor

Converting to draft to prevent a merge. I'll take a look at this later today.

@vinser52
Contributor Author

Also, a review is needed from @sergey-semenov

DynamicParametersVecT MDynamicParameters;

/// The list of arguments for the kernel.
std::vector<detail::ArgDesc> MArgs;
Contributor

That's what we do currently, but I think the goal should look more like

  using HostKernelObjPtr = void *;
  using ExplicitlySetArgs = <whatever>;
  std::variant<HostKernelObjPtr, ExplicitlySetArgs> KernelParamsSource;
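
To spell the idea out a bit more, here is a small self-contained sketch (C++17). `ExplicitArgSketch` and `countExplicitArgs` are hypothetical placeholders for `<whatever>` and for a consumer of the variant; they are not existing runtime types.

```cpp
#include <cstddef>
#include <variant>
#include <vector>

using HostKernelObjPtr = void *;

// Hypothetical representation of one explicitly set argument.
struct ExplicitArgSketch {
  const void *Ptr;
  std::size_t Size;
};
using ExplicitlySetArgs = std::vector<ExplicitArgSketch>;

// Exactly one source of kernel parameters is active at a time.
using KernelParamsSource = std::variant<HostKernelObjPtr, ExplicitlySetArgs>;

// Returns the number of explicitly set arguments, or 0 when the arguments
// would instead be extracted from the host kernel object.
inline std::size_t countExplicitArgs(const KernelParamsSource &Src) {
  if (const auto *Args = std::get_if<ExplicitlySetArgs>(&Src))
    return Args->size();
  return 0;
}
```

The point of the variant is that a submission either carries a pointer to the host kernel object or an explicit argument list, never both, which the current MArgs vector cannot express.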

Contributor Author

We can do that in follow-up PRs. The goal of this PR is just to split handler_impl, move some of the current logic/data to KernelData, and use KernelData in the no-handler flow.

Comment on lines +227 to +228
// TODO: remove in the next ABI-breaking window
// Today they are used only in the handler::setKernelNameBasedCachePtr
Contributor

This makes little sense to me. Everything under source/detail/ is inside the libsycl.so and doesn't cross ABI boundary.

Contributor Author

These variables are used only when the app is compiled with the 6.3 compiler. In that case, the StoreLambda function calls setKernelInfo(), which accepts the compiler info from the integration header. After that, StoreLambda calls setDeviceKernelInfoPtr. The compatibility problem with 6.3 is that we need to store the data from the integration header somewhere, because DeviceKernelInfoPtr is not set yet.

You are right that these members do not cross ABI boundaries, but they are used only by the API that is under the #ifndef __INTEL_PREVIEW_BREAKING_CHANGES macro. So I put them under preview, so that we don’t forget to remove them together with the corresponding API in the next ABI-breaking window.
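
A simplified sketch of that ordering problem, using hypothetical names throughout (`setLegacyCompileTimeInfo`, `MStagedInfo`, and the copy into the device kernel info are assumptions, not the actual implementation): the legacy entry point parks the data in a guarded member until the pointer arrives.

```cpp
#include <string>

// Hypothetical stand-ins for the compile-time info and DeviceKernelInfo.
struct CompileTimeInfoSketch {
  std::string Name;
  bool IsESIMD = false;
};
struct DeviceKernelInfoSketch {
  CompileTimeInfoSketch Info;
};

class KernelDataSketch {
public:
#ifndef __INTEL_PREVIEW_BREAKING_CHANGES
  // Legacy entry point: may run before the device kernel info pointer is
  // known, so the data is parked in a temporary member.
  void setLegacyCompileTimeInfo(const CompileTimeInfoSketch &Info) {
    MStagedInfo = Info;
    MHasStagedInfo = true;
  }
#endif

  void setDeviceKernelInfoPtr(DeviceKernelInfoSketch *Ptr) {
    MDeviceKernelInfoPtr = Ptr;
#ifndef __INTEL_PREVIEW_BREAKING_CHANGES
    // Assumption for this sketch: staged data is handed over once the
    // pointer becomes available.
    if (MHasStagedInfo) {
      Ptr->Info = MStagedInfo;
      MHasStagedInfo = false;
    }
#endif
  }

  const DeviceKernelInfoSketch *deviceKernelInfo() const {
    return MDeviceKernelInfoPtr;
  }

private:
  DeviceKernelInfoSketch *MDeviceKernelInfoPtr = nullptr;
#ifndef __INTEL_PREVIEW_BREAKING_CHANGES
  // TODO: remove in the next ABI-breaking window
  CompileTimeInfoSketch MStagedInfo;
  bool MHasStagedInfo = false;
#endif
};
```

Keeping the staging members under the same guard as the legacy entry point means both disappear together when the macro is defined.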

Comment on lines +107 to +110
#ifndef __INTEL_PREVIEW_BREAKING_CHANGES
,
bool IsESIMD
#endif
Contributor

Can we update the caller to do

  KernelDataObject.IsESIMD = IsESIMD;
  KernelDataObject.processArg(/* no IsESIMD parameter */);

instead? Or, in other words, do what we've done with the DeviceKernelInfoTy and have extra setting for the ABI compat entry points immediately after crossing the ABI boundary.
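
A minimal sketch of that shape, assuming for the sake of the example that the flag can be stored directly on the object (all names below are hypothetical): the ABI-compat entry point sets the flag once, and the internal function no longer needs the extra parameter.

```cpp
#include <vector>

// Hypothetical record of a processed argument.
struct ArgSketch {
  int Index;
  bool TreatedAsESIMD;
};

// Hypothetical, flat version of KernelData that owns the IsESIMD flag.
class KernelDataSketch {
public:
  bool IsESIMD = false;

  // Internal path: no IsESIMD parameter, the flag is read from the object.
  void processArg(int Index) { MArgs.push_back({Index, IsESIMD}); }

  const std::vector<ArgSketch> &args() const { return MArgs; }

private:
  std::vector<ArgSketch> MArgs;
};

// Hypothetical ABI-compat entry point: does the extra setting immediately
// after crossing the ABI boundary, then forwards to the clean interface.
inline void legacyProcessArg(KernelDataSketch &KD, int Index, bool IsESIMD) {
  KD.IsESIMD = IsESIMD;
  KD.processArg(Index);
}
```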

Contributor Author

Unfortunately, we cannot, because KernelData is not a flat data structure. IsESIMD is stored in the DeviceKernelInfo, and KernelData stores a pointer to it. When the application is compiled with the 6.2 headers, it calls processArg before the DeviceKernelInfo is assigned, and we cannot create the device info at runtime because the kernel name is not assigned yet.

Contributor

I disagree. That should be hacked on the handler side (where the problem is) and not here.

Contributor Author

ok, let me check how it could be done on the handler side. I will try to do that.

Contributor

@sergey-semenov sergey-semenov left a comment

LGTM. As agreed, I'm going to merge this to unblock PRs waiting for this one, but, naturally, feel free to continue the review after the merge.

@sergey-semenov sergey-semenov marked this pull request as ready for review September 16, 2025 12:08
@sergey-semenov sergey-semenov merged commit c8ec91f into intel:sycl Sep 16, 2025
28 checks passed
YixingZhang007 pushed a commit to YixingZhang007/llvm that referenced this pull request Sep 22, 2025
…ta structure (intel#19843)

This PR is a prerequisite for the handler-less API.
Kernel-specific data and argument parsing logic are moved from the
`handler_impl` to the new `KernelData` class that will be used in a
handler-less path.
cperkinsintel added a commit to cperkinsintel/llvm that referenced this pull request Sep 24, 2025
… updated in PR intel#19843.  Doing so now. Without it there are CTS compat test issues

Signed-off-by: Chris Perkins <chris.perkins@intel.com>
againull pushed a commit that referenced this pull request Sep 24, 2025
one of the overloads of extractArgsAndReqsFromLambda was to have been
updated in PR #19843. Doing so now. Without it there are CTS compat test
issues

Signed-off-by: Chris Perkins <chris.perkins@intel.com>