-
Notifications
You must be signed in to change notification settings - Fork 810
[SYCL] Move Kernel specific data from handler_impl to a separate data structure
#19843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| auto *DynParamImpl = static_cast< | ||
| ext::oneapi::experimental::detail::dynamic_parameter_impl *>(Ptr); | ||
|
|
||
| MDynamicParameters.emplace_back(DynParamImpl, Index + IndexShift); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two functional changes in this PR (this line and line 241). Before these changes, the handler::registerDynamicParameter method was used to add a dynamic parameter. The handler::registerDynamicParameter method has two checks that might throw exceptions. Since the processArg function is moved from handler to the KernelData we cannot use handler::registerDynamicParameter. I also cannot move the registerDynamicParameter to the KernelData class because it requires to access Queue and Graph from the handler.
handler_impl to a separate data structure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about something like this: https://godbolt.org/z/jjG5zsb5z
@vinser52 , @sergey-semenov , WDYT?
The link above introduces
struct KernelData { /* information about device kernel */ };
struct SubmissionInfo { /* information about what's being submitted, including KernelData */ };
I think there will be a need for at least one other data structure to pass from handler to scheduler that will be encapsulating SubmissionInfo along with some other information filled in handler/no-handler submission path before reaching either scheduler or direct UR enqueue APIs.
In general, I have something similar in mind. We have compile time data that is captured in header and passed to the
My idea was that |
d31cf35 to
b479d9b
Compare
b479d9b to
4c3b8fa
Compare
4c3b8fa to
a1c85b2
Compare
| const CompileTimeKernelInfoTy &Info) { | ||
| if (!isCompileTimeInfoSet()) | ||
| CompileTimeKernelInfoTy::operator=(Info); | ||
| assert(isCompileTimeInfoSet()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sergey-semenov I removed this assert because in case of app was compiled with the 6.3 compiler, the kernel size is not passed from the headers to the runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion with @sergey-semenov, we decided to keep this assert in the preview mode.
Corresponding comment is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This deserves more explanation (probably in the PR description). New KernelData is under source/detail and doesn't cross ABI boundary. What are we changing here that has ABI effects and why can't that be done in a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I added comment in place. And the issue happens only in case of 6.3 or 6.2 headers. There is nothing about ABI here.
…in case of preview mode
reble
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Graph part LGTM
|
@intel/llvm-gatekeepers please consider merging |
|
Converting to draft to prevent a merge. I'll take a look at this later today. |
|
Also, a review is needed from @sergey-semenov |
| DynamicParametersVecT MDynamicParameters; | ||
|
|
||
| /// The list of arguments for the kernel. | ||
| std::vector<detail::ArgDesc> MArgs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what we do currently, but I think the goal should look more like
using HostKernelObjPtr = void *;
using ExplicitlySetArgs = <whatever>;
std::variant<HostKernelObjPtr, ExplicitlySetArgs> KernelParamsSource;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can do it as a follow-up PR's. The goal of this PR just to split handler_impl and move some current logic/data to the KernelData and use KernelData in the no-handler flow.
| // TODO: remove in the next ABI-breaking window | ||
| // Today they are used only in the handler::setKernelNameBasedCachePtr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes little sense to me. Everything under source/detail/ is inside the libsycl.so and doesn't cross ABI boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These vars are used only when app is compiled with 6.3 compiler. In case of 6.3 compiler StoreLambda function calls setKerneInfo() that accept the compiler info from the integration header. After that StoreLambda calls setDeviceKernelInfoPtr. The problem with compatibility with 6.3 is that we need to store the data from integration header somewhere because DeviceKernelInfoPtr is not set yet.
You are right that this members does not cross ABI boundaries, but they are used only by the API that is under #ifndef __INTEL_PREVIEW_BREAKING_CHANGES macro. So I put them under preview, so that we don’t forget to remove them together with corresponding API in the next ABI-breaking window.
| const CompileTimeKernelInfoTy &Info) { | ||
| if (!isCompileTimeInfoSet()) | ||
| CompileTimeKernelInfoTy::operator=(Info); | ||
| assert(isCompileTimeInfoSet()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This deserves more explanation (probably in the PR description). New KernelData is under source/detail and doesn't cross ABI boundary. What are we changing here that has ABI effects and why can't that be done in a separate PR?
| #ifndef __INTEL_PREVIEW_BREAKING_CHANGES | ||
| , | ||
| bool IsESIMD | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we update the caller to do
KerneDataObject.IsESIMD = IsESIMD;
KernelDataObject.proceesArg(/* no IsESIMD parameter*/);
instead? Or, in other words, do what we've done with the DeviceKernelInfoTy and have extra setting for the ABI compat entry points immediately after crossing the ABI boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we cannot because KernelData is not a flat data structure. The IsESIMD is stored in the DeviceKernelInfo and KernelData stores a pointer to it. In case of application is compiled with 6.2 headers the application calls processArg before the DeviceKernelInfo is assigned and we cannot create the device info at runtime because the kernel name is not assigned yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. That should be hacked on the handler side (where the problem is) and not here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let me check how it could be done on the handler side. I will try to do that.
sergey-semenov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. As agreed, I'm going to merge this to unblock PRs waiting for this one, but, naturally, feel free to continue the review after the merge.
…ta structure (intel#19843) This PR is a prerequisite for the handler-less API. Kernel-specific data and argument parsing logic are moved from the `handler_impl` to the new `KernelData` class that will be used in a handler-less path.
… updated in PR intel#19843. Doing so now. Without it there are CTS compat test issues Signed-off-by: Chris Perkins <chris.perkins@intel.com>
one of the overloads of extractArgsAndReqsFromLambda was to have been updated in PR #19843. Doing so now. Without it there are CTS compat test issues Signed-off-by: Chris Perkins <chris.perkins@intel.com>
This PR is a prerequisite for the handler-less API.
Kernel-specific data and argument parsing logic are moved from the
handler_implto the newKernelDataclass that will be used in a handler-less path.