Skip to content

[SYCL] Add splitting module capabilities when compiling for NVPTX and AMDGCN #4107

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 29, 2021

Conversation

Naghasan
Copy link
Contributor

The patch allows modules splitting for NVPTX and AMDGCN targets.

As those path needs to rely on existing tool, the approach taken here is
to generate and intercept commands so they can be wrapped by a llvm-foreach tool call.

To do that, the patch adds ForEachWrappingAction which allows to encapsulate
a set of actions whose underlying commands will be wrapped under a llvm-foreach commands.

During the binding phases, the action will trigger the emission of jobs outside the set,
record the Jobs state then triggers the emission of the jobs inside the set.
The newly added commands are stolen from the Jobs list and
the commands are re-emitted with the llvm-foreach wrapping.

To allow standard clang tools to work properly,
SYCLPostLinkJobAction and FileTableTformJobAction reports
the underlying file type rather than TY_Tempfilelist or TY_tempfiletable
for those targets. Otherwise tools get confused as they don't know how to process the input.

To not make this patch too large, I maintained the SPIR-V path behaviour as it is, meaning the list of actions are not set up in the same way nor does it uses the ForEachWrappingAction. Otherwise there is a long list of test to update.

I'm open to do it (as part of this patch or a follow up), but I wanted to get feedback on this first.

@AlexeySachkov
Copy link
Contributor

Am I right that this will help to resolve #3833, because sycl-post-link will be launched in its "full" mode, producing device image properties and symbols list together with resulting device images?

@Naghasan
Copy link
Contributor Author

Am I right that this will help to resolve #3833,

Not tested or looked into it, but should as it makes use of the tools and frmat brought for SYCL.
But construction, this should also fix issues with application containing SPIR-V and PTX kernels.

Naghasan added 2 commits July 19, 2021 10:10
…ing binding.

The patch adds ForEachWrappingClass which allows to encapsulate
a set of actions whose underlying commands will be wrapped under a llvm-foreach commands.

During the binding phases, the action will trigger the emission of jobs outside the set,
record the Jobs state then triggers the emission of the jobs inside the set.
The newly added commands are stolen from the Jobs list and
the commands are re-emitted with the llvm-foreach wrapping.

Signed-off-by: Victor Lomuller <victor@codeplay.com>
…PTX or AMDGCN.

The patch allows modules splitting for NVPTX and AMDGCN targets.
The driver works by wrapping the actions with a ForEachWrappingAction.

To allow standard clang tools to work properly,
SYCLPostLinkJobAction and FileTableTformJobAction reports
the underlying file type rather than TY_Tempfilelist or TY_tempfiletable
for those targets.

Signed-off-by: Victor Lomuller <victor@codeplay.com>
@Naghasan Naghasan force-pushed the victor/nvptx-amdgcn-splitmodule branch from d792d55 to 84966b2 Compare July 19, 2021 10:30
Copy link
Contributor

@AGindinson AGindinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the binding phases, the action will trigger the emission of jobs outside the set,
record the Jobs state then triggers the emission of the jobs inside the set.
The newly added commands are stolen from the Jobs list and
the commands are re-emitted with the llvm-foreach wrapping.

Overall, this approach LGTM personally - rather slick for the existing infrastructure! I have yet to review the "main" working code inside Driver.cpp, meanwhile, welcoming others to chime in with their review of the design (@mdtoguchi, @kbobrovs?)

To not make this patch too large, I maintained the SPIR-V path behaviour as it is, meaning the list of actions are not set up in the same way nor does it uses the ForEachWrappingAction. Otherwise there is a long list of test to update.

Fair point; that deserves a separate PR for sure (a TODO or a new GH issue could seal this part for the current PR). To my mind, having SPIR target printouts show the actual file extensions is worth the change, yet again, opinions from the architects would be definitive.

@bader bader requested a review from AGindinson July 26, 2021 19:22
bader
bader previously approved these changes Jul 26, 2021
Signed-off-by: Victor Lomuller <victor@codeplay.com>
bader
bader previously approved these changes Jul 27, 2021
@bader
Copy link
Contributor

bader commented Jul 27, 2021

@AGindinson, @mdtoguchi, could you take a look at this PR, please?

Copy link
Contributor

@AGindinson AGindinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides the typos; I'd appreciate an approval from @mdtoguchi

Co-authored-by: Artem Gindinson <artem.gindinson@intel.com>
Copy link
Contributor

@AGindinson AGindinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, stumbled upon another typo. Also, is it possible to enable CUDA tests in https://github.com/intel/llvm-test-suite/blob/intel/SYCL/DeviceCodeSplit in a follow-up PR so that we have an additional E2E test happening?

@@ -737,7 +738,15 @@ class SYCLPostLinkJobAction : public JobAction {
void anchor() override;

public:
SYCLPostLinkJobAction(Action *Input, types::ID OutputType);
// The tempfiletable management relies on a shadowing the main file type by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The tempfiletable management relies on a shadowing the main file type by
// The tempfiletable management relies on shadowing the main file type by

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that using article in this case is incorrect.

Copy link
Contributor

@mdtoguchi mdtoguchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall - LGTM. Can you also update the device flow here: https://github.com/intel/llvm/blob/sycl/sycl/doc/images/DeviceLinkAndWrap.svg

@bader
Copy link
Contributor

bader commented Jul 29, 2021

I suggest we commit this fix and address documentation and additional testing comments in follow-up commit.

@bader bader merged commit c1324e6 into intel:sycl Jul 29, 2021
zahiraam pushed a commit to zahiraam/llvm-1 that referenced this pull request Aug 2, 2021
… AMDGCN (intel#4107)

The patch allows modules splitting for NVPTX and AMDGCN targets.

As those path needs to rely on existing tool, the approach taken here is
to generate and intercept commands so they  can be wrapped by a `llvm-foreach` tool call.

To do that, the patch adds `ForEachWrappingAction` which allows to encapsulate
a set of actions whose underlying commands will be wrapped under a llvm-foreach commands.

During the binding phases, the action will trigger the emission of jobs outside the set,
record the Jobs state then triggers the emission of the jobs inside the set.
The newly added commands are stolen from the Jobs list and
the commands are re-emitted with the llvm-foreach wrapping.

To allow standard clang tools to work properly,
SYCLPostLinkJobAction and FileTableTformJobAction reports
the underlying file type rather than TY_Tempfilelist or TY_tempfiletable
for those targets. Otherwise tools get confused as they don't know how to process the input.

To not make this patch too large, I maintained the SPIR-V path behaviour as it is, meaning the list of actions are not set up in the same way nor does it uses the `ForEachWrappingAction`. Otherwise there is a long list of test to update.

Signed-off-by: Victor Lomuller <victor@codeplay.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Program with device code in multiple translation units fails on CUDA
5 participants