[SYCL] Cast address spaces before replacing byval argument usages #1405


Merged

againull merged 1 commit into intel:sycl from cast_shadow on Mar 27, 2020

Conversation

againull
Contributor

For the NVPTX target, address space inference for kernel arguments and
allocas happens in the backend (in the NVPTXLowerArgs and
NVPTXLowerAlloca passes). After the frontend, these pointers are in the
LLVM default address space 0, which is the generic address space for the
NVPTX target. Perform an address space cast of the pointer to the shadow
global variable from the local to the generic address space before
replacing all usages of a byval argument.

Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
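
The transformation described above can be sketched in LLVM IR. This is an illustrative sketch only, assuming typed pointers (as LLVM used at the time) and hypothetical names (`@kernel.arg.shadow`, `%struct.Wrapper`, `@kernel`); it is not taken from the actual patch:

```llvm
; The shadow global variable lives in the local address space,
; which is addrspace(3) on NVPTX:
@kernel.arg.shadow = internal addrspace(3) global %struct.Wrapper undef

define void @kernel(%struct.Wrapper* byval(%struct.Wrapper) %arg) {
entry:
  ; Before replacing uses of the byval argument %arg, cast the shadow
  ; global's pointer from the local address space to the generic one
  ; (address space 0), so the replacement value has the pointer type the
  ; rest of the IR expects:
  %shadow.generic = addrspacecast %struct.Wrapper addrspace(3)* @kernel.arg.shadow
                      to %struct.Wrapper*
  ; ...all uses of %arg are then replaced with %shadow.generic, and the
  ; NVPTXLowerArgs/NVPTXLowerAlloca backend passes later infer the
  ; concrete address spaces...
  ret void
}
```

Without the `addrspacecast`, replacing `%arg` (an address space 0 pointer) with an `addrspace(3)` pointer would produce type-mismatched, invalid IR.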

@againull againull requested review from kbobrovs and bader March 27, 2020 06:33
@againull
Contributor Author

This is a fix for #1291

@bader
Contributor

bader commented Mar 27, 2020

+@Naghasan

@bader bader linked an issue Mar 27, 2020 that may be closed by this pull request
@bader bader added the cuda CUDA back-end label Mar 27, 2020
@bader bader merged commit c98559b into intel:sycl Mar 27, 2020
@againull againull deleted the cast_shadow branch December 3, 2022 00:02

Successfully merging this pull request may close these issues.

Missing function in hierarchical when targetting PTX