Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Doc] Design specification for "device global" feature #4686

Merged
merged 28 commits into from
Jan 11, 2022

Conversation

gmlueck
Copy link
Contributor

@gmlueck gmlueck commented Oct 1, 2021

Add a design specification for the DPC++ extension API SYCL_EXT_ONEAPI_DEVICE_GLOBAL. This includes proposed SPIR-V and OpenCL extensions which are required for the implementation of this DPC++ feature.

Address review comments from @GarveyJoe.
Update the design to align with the latest version of the extension
API spec.  Also add specifications for underlying changes to SPIR-V
and OpenCL.
Update names and wording to make it clear the the new SPIR-V
decorations apply to all consumers, even though they are mostly useful
for FPGA.
Add some final clarifications to the design doc before creating PR.
Rename properties to align with latest review comments against API
spec.
Update the unresolved issue about diagnosing invalid `device_global`
declarations to reference the PR against the API spec that describes
exactly what these restrictions are.
@artemrad
Copy link
Contributor

artemrad commented Oct 5, 2021

In Design Spec in an Open you talk about avoid creating unique SPIR-V properties for each of the 3 properties listed, and leveraging some common mechanism. Are the decorations listed in the SPIR-V spec just there until that open gets resolved?

@gmlueck
Copy link
Contributor Author

gmlueck commented Oct 5, 2021

In Design Spec in an Open you talk about avoid creating unique SPIR-V properties for each of the 3 properties listed, and leveraging some common mechanism. Are the decorations listed in the SPIR-V spec just there until that open gets resolved?

No, I was not thinking that the SPIR-V decorations would change. Rather, I was wondering if there is some common infrastructure for propagating the properties from source code, through LLVM IR, and then result in the SPIR-V decorations that are specified.

@gmlueck
Copy link
Contributor Author

gmlueck commented Oct 6, 2021

@AlexeySachkov, @kbobrovs: you may also be interested in this design review. (I know that @romanovvlad has already seen it.)

Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the approach is quite complicated, but provided the API the extension defines I cannot suggest anything better. The only thing is that we could try to get rid of need in low level runtimes extensions by generating special kernel which do the same what suggest extension APIs do, but it might be less performant solution.

sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
We decided that we need some alternate way for the host to reference a
global variable in the kernel other than using the exported SPIR-V
name.  Extend the `HostAccessINTEL` decoration to take a new `Name`
parameter.  This allows the host to access even a non-exported
`OpVariable`.

Changes are still needed to the "DeviceGlobal.md" spec to explain how
this new field will be set by the compiler.  Those changes will be made
in a separate commit.
Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I have some preliminary comments. Haven't taken a look at unresolved issues yet. Will do so shortly

sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
AlexeySachkov
AlexeySachkov previously approved these changes Nov 8, 2021
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl-post-link section LGTM

sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
sycl/doc/DeviceGlobal.md Outdated Show resolved Hide resolved
sycl/doc/DeviceGlobal.md Show resolved Hide resolved
Resolve the remaining open issues and address review comments:

* Describe how compile-time properties are handled, using the new
  compile-time property design document as a basis.

* Address remaining open issues, mostly with the front-end.

* Address review comments that have been made so far.
Remove some extraneous commentary about the `device_image_scope`
property.  Details of that property are describe more thoroughly in
the extension API specification.
Clarify the backend requirements for copying to / from a module scope
(global) variable.
@gmlueck
Copy link
Contributor Author

gmlueck commented Nov 15, 2021

My latest commits addressed all the open issues. I'm looking especially for a review from the front-end team (@elizabethandrews or @premanandrao).

@AlexeySachkov: there were a few updates to the sycl-post-link tool, so please review that section.

@romanovvlad: there are also a few updates to the runtime, so please review that also.

@bashbaug: can you (or someone you delegate) review the OpenCL and SPIR-V specs? I believe the changes here are all in line with what we discussed earlier.

Since PR intel#4697 was merged, there is no need to point to that PR.
Instead, just point to the extension specification for device global.
bashbaug
bashbaug previously approved these changes Dec 10, 2021
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPIR-V and OpenCL parts LGTM.

romanovvlad
romanovvlad previously approved these changes Dec 14, 2021
This addresses comments from the FPGA team.
GarveyJoe
GarveyJoe previously approved these changes Dec 17, 2021
@gmlueck
Copy link
Contributor Author

gmlueck commented Dec 17, 2021

@romanovvlad, @elizabethandrews, and @bashbaug: would you mind re-approving? I think the only changes since your last approval were a few tweaks to the SPIR-V spec (f0d9512 and 76b559c).

@AlexeySachkov: Could you also re-approve? There were a few changes to the sycl-post-link tool since your last approval.

AlexeySachkov
AlexeySachkov previously approved these changes Dec 23, 2021
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl-post-link part LGTM

bashbaug
bashbaug previously approved these changes Jan 4, 2022
Comment on lines +376 to +378
Assuming that no error diagnostic is issued, the `sycl-post-link` tool includes
the IR definition of each `device_global` variable in the modules that
reference that variable.
Copy link
Contributor

@AlexeySachkov AlexeySachkov Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmlueck, I've just realized that we might want to be more precise here. In particular, at the moment sycl-post-link clones all global variables into every produced device image:

// It's not easy to trace global variable's uses inside needed functions
// because global variable can be used inside a combination of operators, so
// mark all global variables as needed and remove dead ones after cloning.
for (const auto &G : M.globals()) {
GVs.insert(&G);
}

I assume that could be a problem, if we later produce device image properties with unused globals. At least we will waste some memory and probably compute time. What do you think?

P.S. I also don't think that it would be so complex to trace global variable usages if we don't need to differentiate types of the uses (i.e. reads vs writes). Traversing Use -> User chain until we encounter a function for which we already know the list of device images it will be cloned into should be relatively straightforward

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be correct to include all global variables in all modules. As you say, however, this will be wasteful. For example, consider some very large device global that is referenced by only one kernel. It would waste device memory to include this variable in all modules. This might be particularly problematic on FPGA, where device memory is more limited.

What about the current wording in the design doc should be made more precise? That wording already says that the IR for a device global should only be copied to a module if the module references the variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the current wording in the design doc should be made more precise? That wording already says that the IR for a device global should only be copied to a module if the module references the variable.

Maybe I'm misreading it, but I don't seem to find "should only be copied" part, i.e. I don't see the part which restricts sycl-post-link from doing extra copies.

It should be correct to include all global variables in all modules. As you say, however, this will be wasteful. For example, consider some very large device global that is referenced by only one kernel. It would waste device memory to include this variable in all modules. This might be particularly problematic on FPGA, where device memory is more limited.

Ok, thanks for confirming. I think this conversation can be marked as resolved

@gmlueck
Copy link
Contributor Author

gmlueck commented Jan 11, 2022

@bader: I think all stakeholders have approved. Can this be merged?

@bader
Copy link
Contributor

bader commented Jan 11, 2022

@bader: I think all stakeholders have approved. Can this be merged?

We recently added more checks to pre-commit validation and I'd like to make sure that this patch doesn't break documentation build. I suppose we need to add new sycl/doc/DeviceGlobal.md to sycl/doc/index.rst.
Could you merge sycl branch tip to gmlueck:gmlueck/device-global-design to trigger the latest pre-commit checks, please?

Current CI checks results are almost one month old and do not cover documentation build.

@gmlueck gmlueck requested a review from a team as a code owner January 11, 2022 16:23
@gmlueck
Copy link
Contributor Author

gmlueck commented Jan 11, 2022

We recently added more checks to pre-commit validation and I'd like to make sure that this patch doesn't break documentation build. I suppose we need to add new sycl/doc/DeviceGlobal.md to sycl/doc/index.rst. Could you merge sycl branch tip to gmlueck:gmlueck/device-global-design to trigger the latest pre-commit checks, please?

I did the merge and added "DeviceGlobal.md" to the "index.rst". There didn't seem to be any obvious order to the design documents in the index, so I put it last.

@bader bader merged commit 8c22ef1 into intel:sycl Jan 11, 2022
@gmlueck gmlueck deleted the gmlueck/device-global-design branch January 18, 2022 22:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.