[SYCL][NVPTX] Obey -fcuda-short-ptr when compiling SYCL for NVPTX #15642
This makes pointers to the CUDA shared, const, and local address spaces 32-bit. This should bring decent performance improvements in certain programs.
This is great! And it looks good to me but maybe other people from @intel/llvm-reviewers-cuda want to also double-check.
libclc/CMakeLists.txt
Outdated
@@ -450,10 +450,15 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
list(APPEND flags -D__unix__)
endif()

set(spirv_flags ${flags})
if( ARCH STREQUAL nvptx OR ARCH STREQUAL nvptx64 )
list(APPEND spirv_flags -Xclang -fcuda-short-ptr -mllvm -nvptx-short-ptr)
I think this is ok, but you could end up linking libspirv (built with short ptr) into a program compiled without it. As builtins don't write pointers, I don't think this is an issue, but it would be good to test this prior to merging and to document it.
Note I've now reduced the scope of this PR and the option is no longer enabled by default. Thus libclc/libspirv is no longer compiled with this option. If a user passes -fcuda-short-ptr they'll see a warning while linking libclc/libspirv, which I think is correct to do.
But yes in general we need to consider whether this is okay. I also think it's okay, and I've run several benchmarks with it and not seen a problem. Ideally we'd compile two versions of libspirv for NVPTX, but I don't know if it's going to be worth it for this relatively obscure option.
@intel/llvm-gatekeepers this is ready to merge, thank you
This flag turns pointers to CUDA's shared, const, and local address spaces into 32-bit pointers. This can potentially save on registers used for addressing calculations.
This option was being accepted by the frontend when compiling SYCL code, but was then reporting an error that the backend datalayout doesn't match the expected target description. This was because the option wasn't being caught by all parts of the toolchain, leading to inconsistencies.
This PR allows users to pass the option if they wish. They will see a warning that the compiler is linking against a libclc/libspirv that hasn't been compiled with this option, but this is likely harmless since libspirv doesn't manipulate pointers.
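To make the datalayout mismatch described above concrete, here is a minimal Python sketch that reads pointer widths out of an LLVM datalayout string. The two layout strings are an assumption based on what the 64-bit NVPTX target is understood to use with and without short pointers, and the helper `pointer_widths` is hypothetical and not part of the toolchain. Address spaces 3, 4, and 5 are the NVPTX numbers for shared, const, and local.

```python
# NVPTX address-space numbers for the spaces named in the description.
NVPTX_ADDRESS_SPACES = {"shared": 3, "const": 4, "local": 5}

# Assumed 64-bit NVPTX datalayout strings (illustrative only; check your toolchain).
DEFAULT_LAYOUT = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
SHORT_PTR_LAYOUT = (
    "e-p3:32:32-p4:32:32-p5:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
)


def pointer_widths(layout: str, default_bits: int = 64) -> dict:
    """Return the pointer width in bits for shared, const, and local.

    Hypothetical helper: it only understands the p[n]:<size>:... pointer
    specs and falls back to LLVM's default 64-bit pointer otherwise.
    """
    overrides = {}
    for spec in layout.split("-"):
        if spec.startswith("p"):
            head, size, *_ = spec.split(":")
            # A bare "p" refers to address space 0.
            addrspace = int(head[1:] or 0)
            overrides[addrspace] = int(size)
    return {
        name: overrides.get(number, default_bits)
        for name, number in NVPTX_ADDRESS_SPACES.items()
    }


if __name__ == "__main__":
    print("default  :", pointer_widths(DEFAULT_LAYOUT))
    print("short ptr:", pointer_widths(SHORT_PTR_LAYOUT))
    # If one part of the toolchain assumes one string and another part
    # assumes the other, the layouts disagree, which is the kind of
    # inconsistency reported as a datalayout mismatch.
    print("layouts match:", DEFAULT_LAYOUT == SHORT_PTR_LAYOUT)
```

Under these assumptions the two layouts differ only in the pointer specs for address spaces 3, 4, and 5, so every stage of the toolchain has to agree on whether the option is set.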