Skip to content

[SYCL] Add new auto device code split mode #2827

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

AlexeySachkov
Copy link
Contributor

@AlexeySachkov AlexeySachkov commented Nov 27, 2020

This patch introduces new device code split mode auto, which is
intended to automatically select the best device code split mode and
apply it.

At the moment, auto is equivalent to per_source for most cases and
it is equivalent to off in case of precense of function pointers.

@AlexeySachkov
Copy link
Contributor Author

/summary:run

@AGindinson
Copy link
Contributor

AGindinson commented Nov 27, 2020

I would argue that the current title is somewhat misleading: before revising the actual changes, I've mistaken this for "always run the split unless -fsycl-device-code-split=off is specified, even if no option was passed at all". Maybe this variant would be a tad cleaner?

[SYCL] Enable a default mode for device code split

Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add tests.

@AlexeySachkov
Copy link
Contributor Author

/summary:run

AGindinson
AGindinson previously approved these changes Dec 9, 2020
Copy link
Contributor

@AGindinson AGindinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The driver part LGTM.

This patch introduces new device code split mode `auto`, which is
intended to automatically select the best device code split mode and
apply it.

At the moment, `auto` is equivalent to `per_source` for most cases and
it is equivalent to `off` in case of precense of function pointers.
Added LIT tests.
Update heuristic to disable device code split when indirect calls are
present in the input module.
@AlexeySachkov AlexeySachkov force-pushed the private/asachkov/auto-device-code-split branch from 4175932 to 73faebb Compare December 22, 2020 11:52
@AlexeySachkov AlexeySachkov changed the title [SYCL] Enable device code split by default [SYCL] Add new auto device code split mode Dec 22, 2020
@AlexeySachkov AlexeySachkov marked this pull request as ready for review December 22, 2020 11:54
@@ -289,6 +291,42 @@ enum KernelMapEntryScope {
Scope_Global // single entry in the map for all kernels
};

static KernelMapEntryScope selectDeviceCodeSplitScopeAutomatically(Module &M) {
// Here we can employ various heuristics to decide which way to split kernels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are more comments then the code here :) I think the idea is pretty straightforward and the description could be shrinked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've re-structured comments a bit in edfa44c

if (IROutputOnly && DoSplit) {
errs() << "error: -" << SplitMode.ArgStr << " can't be used with -"
<< IROutputOnly.ArgStr << "\n";
if (IROutputOnly && (DoSplit && SplitMode != SPLIT_AUTO)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why SPLIT_AUTO is allowed with IR output only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why SPLIT_AUTO is allowed with IR output only?

Technically this is me being a bit lazy to dig further into the driver and do some changes there. Basically, there are compilation flows (related to AOT), where sycl-post-link is involved with -ir-output-only option and according to my changes, -split= is almost always passed to sycl-post-link (unless split was explicitly disabled) and it triggered errors in some LIT tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so the auto heuristics in presence of -ir-output-only is fixed to "don't split" - ? please add a comment then, to user-visible description too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so the auto heuristics in presence of -ir-output-only is fixed to "don't split" - ? please add a comment then, to user-visible description too

That's correct, updated help message of the tool in edfa44c

Co-authored-by: Artem Gindinson <artem.gindinson@intel.com>
Copy link
Contributor

@AGindinson AGindinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The driver part LGTM.

@romanovvlad romanovvlad merged commit 184d258 into intel:sycl Dec 25, 2020
@AlexeySachkov AlexeySachkov deleted the private/asachkov/auto-device-code-split branch February 25, 2021 12:27
AGindinson pushed a commit to AGindinson/llvm that referenced this pull request Sep 7, 2021
Since `auto` value introduction for `-fsycl-device-code-split`
in commit 184d258 (intel#2827),
it stands as the default code split mechanism.

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
bader pushed a commit that referenced this pull request Sep 8, 2021
Since `auto` value introduction for `-fsycl-device-code-split`
in commit 184d258 (#2827),
it stands as the default code split mechanism.

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
DoyleLi pushed a commit to DoyleLi/intel_llvm that referenced this pull request Sep 13, 2021
Since `auto` value introduction for `-fsycl-device-code-split`
in commit 184d258 (intel#2827),
it stands as the default code split mechanism.

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
jsji pushed a commit that referenced this pull request Nov 22, 2024
This improves SPV_KHR_untyped_pointers extension.
Removing struct type from special handling (translate as typed pointer)
allowed to fix `spirv-val` error in `CXX/global-ctor.cl` test:

```
error: line 88: OpFunctionCall Argument <id> '25[%this1]'s type does not
match Function <id> '11[%_ptr_Generic_class_Something]'s parameter type.
  %30 = OpFunctionCall %void %_ZNU3AS49SomethingC2Ei %this1 %26
```

Other changes allow to translate structs in a new way without violating
validation or test checks.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@15fd1cc50e12465
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants