Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Doc] Design doc: optional kernel features #3781

Merged
merged 22 commits into from
Jul 27, 2021

Conversation

gmlueck
Copy link
Contributor

@gmlueck gmlueck commented May 18, 2021

Design document to conform to the SYCL 2020 "optional kernel feature"
behavior. This mostly covers SYCL 2020 section 5.7 "Optional kernel
features", but it also covers the design for the [[sycl::requires()]]
attribute.

First public review of the design document to conform to the SYCL 2020
"optional kernel features" behavior.  This mostly covers SYCL 2020
section 5.7 "Optional kernel features", but it also covers the design
for the `[[sycl::requires()]]` attribute.

Since a SPIR-V extension is proposed as part of this design, this
commit also includes a proposed extension to SPIR-V for enabling
code conditionally based on specialization constants.  The SPIR-V
extension is more general than required for the "optional kernel
features" design because it also includes functionality that will be
required for the SYCL_EXT_ONEAPI_DEVICE_IF extension to DPC++.
@gmlueck gmlueck requested review from bader, pvchupin and a team as code owners May 18, 2021 21:04
Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed up until #### Representation in SPIR-V

gmlueck and others added 3 commits May 21, 2021 09:45
Co-authored-by: kbobrovs <Konstantin.S.Bobrovsky@intel.com>
Co-authored-by: Artem Gindinson <artem.gindinson@intel.com>
Clarify wording about when `errc::kernel_not_supported` is thrown.
gmlueck added 2 commits May 21, 2021 16:49
Expand the design to include the case when device functions are
exported from a shared library, which is a new feature proposed
in intel#3210.
Address a code review comment asking how we can decorate `atomic_ref`
with `[[sycl::requires()]]` only when the type is 8 bytes.  Add an
appendix showing how this can be done using partial specialization.
@bader bader requested review from AGindinson and kbobrovs May 24, 2021 09:07
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only looked at OptionalDeviceFeatures.md file, will review for the proposed SPIR-V extension later

Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial SPIR-V extension review, will do at least one more iteration later. So far a lot of questions to get better understanding of intended behavior in some corner cases.

Copy link
Contributor

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience with me as this fell off my radar until @elizabethandrews pinged me on it. I've added some thoughts, but I may have more as this percolates in my brain.

@bader bader added enhancement New feature or request spec extension All issues/PRs related to extensions specifications SPIR-V Issues related to SPIRV-LLVM-Translator labels Jun 1, 2021
gmlueck added 2 commits June 1, 2021 15:14
* Clarify that the initial short list of requirements will be explained
  in detail later.
* Clarify what "static call tree" means when there are function pointers.
* Post-link tool now diagnoses an error if a `SYCL_EXTERNAL` function is
  incorrect decorated with `[[sycl::requires()]]`.
* Use `OpSpecConstantFalse` instead of `OpSpecConstantTrue`.
* Clarify that the new "SYCL/requirements" property list has properties
  of type `BYTE_ARRAY`, which is one of the existing property types.
AlexeySachkov
AlexeySachkov previously approved these changes Jul 15, 2021
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from DPC++ Tools point of view

Alexey Sachkov proposed an optimization for comparing the types in each
IR instruction with the `!intel_types_that_use_aspects` set.  Capture
this in a note.
@gmlueck gmlueck dismissed stale reviews from AlexeySachkov and kbobrovs via 29a2e9f July 15, 2021 12:42
AaronBallman
AaronBallman previously approved these changes Jul 20, 2021
Copy link
Contributor

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read through the latest draft of the document and it LGTM. @erichkeane may have additional questions or feedback, however.

gmlueck added 2 commits July 23, 2021 08:07
Rename the property set from "SYCL/image-requirements" to
"SYCL/device-requirements" to avoid confusion with the SYCL types
`unsampled_image` and `sampled_image`.
We are planning to change `[[sycl::requires()]]` in the SYCL 2020 spec
to omit the `has()` clause, which means the parameter to that attribute
will just be a list of aspects.  Make the same change to
`[[sycl::uses_aspects()]]` in this design document.

Also add a note that we plan to change the name of
`[[sycl::requires()]]` to `[[sycl::device_has()]]`.  We can update this
design document again once that change is adopted into the SYCL 2020
spec.
@gmlueck
Copy link
Contributor Author

gmlueck commented Jul 23, 2021

@AlexeySachkov, @kbobrovs, @AaronBallman: Requesting re-approval since my recent commits dismissed your previous approvals.

@erichkeane: Should I expect review from you as well, or is @AaronBallman covering for CFE?

@erichkeane
Copy link
Contributor

This has fallen pretty deeply on my list of things to do unfortunately, but if @AaronBallman and @elizabethandrews are happy, then so am I.

Copy link
Contributor

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gmlueck gmlueck requested a review from elizabethandrews July 23, 2021 19:38
@gmlueck
Copy link
Contributor Author

gmlueck commented Jul 23, 2021

This has fallen pretty deeply on my list of things to do unfortunately, but if @AaronBallman and @elizabethandrews are happy, then so am I.

In that case, asking also for an approval from @elizabethandrews.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DPC++ FE requirements LGTM

@gmlueck gmlueck requested a review from romanovvlad July 27, 2021 11:44
Comment on lines +85 to +86
exception must be raised synchronously from the kernel invocation command
(e.g. `parallel_for()`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor

Suggested change
exception must be raised synchronously from the kernel invocation command
(e.g. `parallel_for()`).
exception must be raised synchronously from the kernel submission command
(e.g. `submit()`).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requirement is from the SYCL 2020 specification:

Of course, applications that make use of optional kernel features should ensure that a kernel using such a feature is submitted only to a device that supports the feature. If the application submits a command group using a secondary queue, then any kernel submitted from the command group should use only features that are supported by both the primary queue’s device and the secondary queue’s device. If an application fails to do this, the implementation must throw a synchronous exception with the errc::kernel_not_supported error code from the kernel invocation command (e.g. parallel_for()).

https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features

In general, our philosophy is to diagnose runtime errors by throwing an exception from the API that causes the error whenever this is reasonably possible. We think this makes it easier for applications to know the cause of a failure. For example, if the exception is thrown from the submit() call, it might not be clear if the error is from paralle_for() or from constructing an accessor, etc.

Is there a reason why you think the exception should be delayed and thrown from submit() rather than from parallel_for()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you think the exception should be delayed and thrown from submit() rather than from parallel_for()?

It's related to how things are currently implemented. Currently we record everything that happens in the CG lambda(like "accessor created", "kernel execution requested") doing only basic API related checks, then when CG is executed we send this request to the scheduler component which may or may not construct low-level handles and enqueue low-level operations immediately(it depends on the situation). Most of the errors coming from the last step.

Also there are complains that queue::submit call is taking too much time of the calling thread, so eventually we might want to move as much as we can to a separate(dispatcher) thread. If we did so we would not be able to throw synchronous exceptions for most of things that currently happen during queue::submit anymore.

For example, if the exception is thrown from the submit() call, it might not be clear if the error is from paralle_for() or from constructing an accessor, etc.

CG lambda can have only one "action"(kernel execution or memcpy or ...), the exception "errc::kernel_not_supported" is quite specific, so I think it would be quite hard to not understand that it happens because queue::submit asks to execute an "incompatible" kernel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CG lambda can have only one "action"(kernel execution or memcpy or ...), the exception "errc::kernel_not_supported" is quite specific, so I think it would be quite hard to not understand that it happens because queue::submit asks to execute an "incompatible" kernel.

I agree that it would be clear in this case. However, there are other exceptions that could be thrown in the "submit" function, especially related to accessors. The SYCL spec expects these to be thrown from the API that causes the error (e.g. from the accessor constructor if the parameters to the accessor are invalid), not delayed and thrown from the queue::submit() function. These exceptions would not be clear if they were delayed because the submit function could construct many accessors.

Does DPC++ currently delay these exceptions also until the kernel submission function returns?

Also there are complains that queue::submit call is taking too much time of the calling thread, so eventually we might want to move as much as we can to a separate(dispatcher) thread.

I'm curious about what we do now in the calling thread. Does the JIT compilation happen currently in the calling thread, or is this already scheduled on a separate thread? Note that we were careful to design the errc::kernel_not_supported exception such that it could be diagnosed in the calling thread even before attempting to JIT compile the kernel, thus allowing the JIT compilation to happen on a different thread.

Comment on lines +171 to +172
Note that the exception must be thrown synchronously, not delayed and thrown on
the queue's asynchronous handler.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify why this is a requirement?
In the current implementation it's easy to throw a synchronous exception, but it can be a
restriction for the implementation in future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my response to your other comment.

Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the question about the exception type can be discussed separately.

@pvchupin
Copy link
Contributor

@gmlueck I can merge it now. Would you like to revisit and update PR description or should it be used for squashed commit message?
There've been a lot of changes during the review. I'd like to make sure we capture a good summary in commit.

@gmlueck
Copy link
Contributor Author

gmlueck commented Jul 27, 2021

@gmlueck I can merge it now. Would you like to revisit and update PR description or should it be used for squashed commit message?
There've been a lot of changes during the review. I'd like to make sure we capture a good summary in commit.

I'd like to change the comment you use for the squashed commit. How should I do that? Should I change the description of this PR, or type a new commit message in a comment for you to use, or something else?

@gmlueck
Copy link
Contributor Author

gmlueck commented Jul 27, 2021

@pvchupin: I updated the PR description, so this is ready to merge now.

@pvchupin pvchupin merged commit 88cfe16 into intel:sycl Jul 27, 2021
@gmlueck gmlueck deleted the gmlueck/optional-feature-design branch January 18, 2022 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request spec extension All issues/PRs related to extensions specifications SPIR-V Issues related to SPIRV-LLVM-Translator
Projects
None yet
Development

Successfully merging this pull request may close these issues.