Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][NVPTX] Obey -fcuda-short-ptr when compiling SYCL for NVPTX #15642

Merged
merged 7 commits into from
Dec 11, 2024

Conversation

frasercrmck
Copy link
Contributor

@frasercrmck frasercrmck commented Oct 9, 2024

This flag turns pointers to CUDA's shared, const, and local address spaces into 32-bit pointers. This can potentially save on registers used for addressing calculations.

This option was being accepted by the frontend when compiling SYCL code, but was then reporting an error that the backend datalayout doesn't match the expected target description. This was because the option wasn't being caught by all parts of the toolchain, leading to inconsistencies.

This PR allows users to pass the option if they wish. They will see a warning that the compiler is linking against a libclc/libspirv that hasn't been compiled with this option, but this is likely harmless since libspirv doesn't manipulate pointers.

This makes pointers to CUDA shared, const, and local address spaces as
being 32-bit pointers. This should bring decent performance improvements
in certain programs.
@hdelan
Copy link
Contributor

hdelan commented Oct 9, 2024

Copy link
Contributor

@konradkusiak97 konradkusiak97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! And it looks good to me but maybe other people from @intel/llvm-reviewers-cuda want to also double-check.

@@ -450,10 +450,15 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
list(APPEND flags -D__unix__)
endif()

set(spirv_flags ${flags})
if( ARCH STREQUAL nvptx OR ARCH STREQUAL nvptx64 )
list(APPEND spirv_flags -Xclang -fcuda-short-ptr -mllvm -nvptx-short-ptr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok, but you could end up linking the libspirv (with short ptr) to a program compiled without. As builtins don't write pointers, I don't think this is an issue but would be good to test prior to merging and document this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note I've now reduced the scope of this PR and the option is no longer enabled by default. Thus libclc/libspirv is no longer compiled with this option. If a user passes -fcuda-short-ptr they'll see a warning while linking libclc/libspirv which I think is correct to do.

But yes in general we need to consider whether this is okay. I also think it's okay, and I've run several benchmarks with it and not seen a problem. Ideally we'd compile two versions of libspirv for NVPTX, but I don't know if it's going to be worth it for this relatively obscure option.

@frasercrmck frasercrmck changed the title [SYCL][NVPTX] Enable -fcuda-short-ptr by default [SYCL][NVPTX] Obey -fcuda-short-ptr when compiling SYCL for NVPTX Dec 3, 2024
@frasercrmck frasercrmck requested a review from a team as a code owner December 3, 2024 12:51
@frasercrmck
Copy link
Contributor Author

@intel/llvm-gatekeepers this is ready to merge, thank you

@ldrumm ldrumm merged commit 83fe1c1 into intel:sycl Dec 11, 2024
14 checks passed
@frasercrmck frasercrmck deleted the sycl-nvptx-short-ptrs branch December 11, 2024 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants