@t4c1
Contributor

@t4c1 t4c1 commented Nov 24, 2021

Fixes #5008. Fixes the libclc implementation of CUDA atomics by using __nvvm_reflect() to avoid emitting intrinsics that the target SM version does not support. If an atomic is called with a memory order or scope that is not supported on that target, __builtin_trap() is called instead.
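
The dispatch described above can be sketched on the host in plain C++. This is only an illustration of the pattern, not the code in this PR: the enum names, the function `select_atomic_path`, and the sm_60 / sm_70 thresholds are all assumptions, and `cuda_arch` stands in for the value a device-side `__nvvm_reflect` query is assumed to return (SM version times ten, e.g. 600 for sm_60).

```cpp
#include <cassert>

// Hypothetical names throughout; the thresholds are assumptions for illustration.
enum class Order { Relaxed, Acquire, Release, AcqRel, SeqCst };
enum class Scope { Device, WorkGroup, System };
enum class Path { Legacy, Scoped, Ordered, Trap };

// Chooses which kind of atomic intrinsic could be emitted for a target.
// cuda_arch models the result of a device-side reflect query
// (assumed to be SM version * 10, so 600 means sm_60).
inline Path select_atomic_path(int cuda_arch, Order order, Scope scope) {
  switch (order) {
  case Order::Relaxed:
    // Unscoped relaxed atomics are assumed to be available everywhere.
    if (scope == Scope::Device)
      return Path::Legacy;
    // Scoped variants are assumed to need sm_60 or newer.
    return cuda_arch >= 600 ? Path::Scoped : Path::Trap;
  case Order::Acquire:
  case Order::Release:
  case Order::AcqRel:
    // Ordered variants are assumed to need sm_70 or newer.
    return cuda_arch >= 700 ? Path::Ordered : Path::Trap;
  case Order::SeqCst:
    // Treated as unsupported in this sketch.
    return Path::Trap;
  }
  return Path::Trap;
}
```

In device code, the `Path::Trap` case would correspond to calling `__builtin_trap()`. Because the reflect query is expected to fold to a constant at compile time, the branches for unsupported targets can be removed before instruction selection ever sees the unsupported intrinsics.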

@t4c1 t4c1 requested a review from bader as a code owner November 24, 2021 16:06
@bader
Contributor

bader commented Nov 24, 2021

/verify with intel/llvm-test-suite#581

@bader
Contributor

bader commented Nov 25, 2021

Basic/scalar_vec_access.cpp and Basic/stream/stream.cpp tests from https://github.com/intel/llvm-test-suite/ are failing with this change:

[2021-11-24T18:52:01.806Z] ******************** TEST 'LLVM :: Basic/scalar_vec_access.cpp' FAILED ********************
[2021-11-24T18:52:01.806Z] Script:
[2021-11-24T18:52:01.806Z] --
[2021-11-24T18:52:01.806Z] : 'RUN: at line 1';    /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/bin/clang++      -fLLVM -fLLVM-targets=nvptx64-nvidia-cuda /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/scalar_vec_access.cpp -o /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/Output/scalar_vec_access.cpp.tmp.out
[2021-11-24T18:52:01.806Z] : 'RUN: at line 2';   true /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/Output/scalar_vec_access.cpp.tmp.out
[2021-11-24T18:52:01.806Z] : 'RUN: at line 3';   true /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/Output/scalar_vec_access.cpp.tmp.out
[2021-11-24T18:52:01.806Z] : 'RUN: at line 4';    env LLVM_DEVICE_FILTER=cuda:gpu,host  /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/Output/scalar_vec_access.cpp.tmp.out | FileCheck /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/scalar_vec_access.cpp
[2021-11-24T18:52:01.806Z] : 'RUN: at line 5';   true /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/Output/scalar_vec_access.cpp.tmp.out
[2021-11-24T18:52:01.806Z] --
[2021-11-24T18:52:01.806Z] Exit Code: 255
[2021-11-24T18:52:01.806Z] 
[2021-11-24T18:52:01.806Z] Command Output (stdout):
[2021-11-24T18:52:01.806Z] --
[2021-11-24T18:52:01.806Z] $ ":" "RUN: at line 1"
[2021-11-24T18:52:01.806Z] note: command had no output on stdout or stderr
[2021-11-24T18:52:01.806Z] $ "/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/bin/clang++" "-fLLVM" "-fLLVM-targets=nvptx64-nvidia-cuda" "/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/scalar_vec_access.cpp" "-o" "/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/Output/scalar_vec_access.cpp.tmp.out"
[2021-11-24T18:52:01.806Z] # command stderr:
[2021-11-24T18:52:01.806Z] warning: linking module '/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas '/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/scalar_vec_access.cpp' is 'nvptx64-nvidia-cuda'
[2021-11-24T18:52:01.806Z]  [-Wlinker-warnings]
[2021-11-24T18:52:01.806Z] 1 warning generated.
[2021-11-24T18:52:01.806Z] ptxas fatal   : Unresolved extern function '__nvvm_reflect'
[2021-11-24T18:52:01.806Z] llvm-foreach: 
[2021-11-24T18:52:01.806Z] clang-14: error: ptxas command failed with exit code 255 (use -v to see invocation)
[2021-11-24T18:52:01.806Z] Target: x86_64-unknown-linux-gnu
[2021-11-24T18:52:01.806Z] Thread model: posix
[2021-11-24T18:52:01.806Z] InstalledDir: /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/bin
[2021-11-24T18:52:01.806Z] clang-14: note: diagnostic msg: Error generating preprocessed source(s).
[2021-11-24T18:52:01.806Z] 
[2021-11-24T18:52:01.806Z] error: command failed with exit status: 255
[2021-11-24T18:52:01.806Z] 
[2021-11-24T18:52:01.806Z] --
[2021-11-24T18:52:01.806Z] 
[2021-11-24T18:52:01.806Z] ********************
[2021-11-24T18:52:12.821Z] ******************** TEST 'LLVM :: Basic/stream/stream.cpp' FAILED ********************
[2021-11-24T18:52:12.821Z] Script:
[2021-11-24T18:52:12.821Z] --
[2021-11-24T18:52:12.821Z] : 'RUN: at line 1';    /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/bin/clang++      -fLLVM -fLLVM-targets=nvptx64-nvidia-cuda /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/stream/stream.cpp -o /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/stream/Output/stream.cpp.tmp.out
[2021-11-24T18:52:12.821Z] : 'RUN: at line 4';   true /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/stream/Output/stream.cpp.tmp.out
[2021-11-24T18:52:12.821Z] : 'RUN: at line 5';   env LLVM_DEVICE_FILTER=cuda:gpu,host  /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/stream/Output/stream.cpp.tmp.out | FileCheck /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/stream/stream.cpp
[2021-11-24T18:52:12.821Z] : 'RUN: at line 6';   true /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/stream/Output/stream.cpp.tmp.out
[2021-11-24T18:52:12.821Z] --
[2021-11-24T18:52:12.821Z] Exit Code: 255
[2021-11-24T18:52:12.821Z] 
[2021-11-24T18:52:12.821Z] Command Output (stdout):
[2021-11-24T18:52:12.821Z] --
[2021-11-24T18:52:12.821Z] $ ":" "RUN: at line 1"
[2021-11-24T18:52:12.821Z] note: command had no output on stdout or stderr
[2021-11-24T18:52:12.821Z] $ "/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/bin/clang++" "-fLLVM" "-fLLVM-targets=nvptx64-nvidia-cuda" "/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/stream/stream.cpp" "-o" "/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/build/LLVM/Basic/stream/Output/stream.cpp.tmp.out"
[2021-11-24T18:52:12.821Z] # command stderr:
[2021-11-24T18:52:12.821Z] warning: linking module '/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas '/localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm-test-suite/LLVM/Basic/stream/stream.cpp' is 'nvptx64-nvidia-cuda'
[2021-11-24T18:52:12.821Z]  [-Wlinker-warnings]
[2021-11-24T18:52:12.821Z] 1 warning generated.
[2021-11-24T18:52:12.821Z] ptxas fatal   : Unresolved extern function '__nvvm_reflect'
[2021-11-24T18:52:12.821Z] llvm-foreach: 
[2021-11-24T18:52:12.821Z] clang-14: error: ptxas command failed with exit code 255 (use -v to see invocation)
[2021-11-24T18:52:12.821Z] Target: x86_64-unknown-linux-gnu
[2021-11-24T18:52:12.821Z] Thread model: posix
[2021-11-24T18:52:12.821Z] InstalledDir: /localdisk2/iusers/nstester/Codeplay/workspace/LLVM_CI/intel/Lin/LLVM_Test_Suite_CUDA/llvm.obj/bin
[2021-11-24T18:52:12.821Z] clang-14: note: diagnostic msg: Error generating preprocessed source(s).
[2021-11-24T18:52:12.821Z] 
[2021-11-24T18:52:12.821Z] error: command failed with exit status: 255
[2021-11-24T18:52:12.821Z] 
[2021-11-24T18:52:12.821Z] --
[2021-11-24T18:52:12.821Z] 
[2021-11-24T18:52:12.821Z] ********************

It looks like we need to run the NVVMReflect pass to resolve that issue. See https://llvm.org/docs/NVPTXUsage.html#common-issues.

Contributor

@bader bader left a comment

Please fix the "ptxas fatal : Unresolved extern function '__nvvm_reflect'" issue.

@t4c1
Contributor Author

t4c1 commented Nov 30, 2021

It looks like calling __nvvm_reflect from OpenCL is somewhat broken. __nvvm_reflect requires an argument of type char*, whereas in OpenCL string literals live in the __constant address space. We were therefore declaring __nvvm_reflect(__constant char*), which works for the uses in libclc but breaks the uses of reflect in libdevice. I fixed this by declaring the string for reflect in IR instead.

@bader
Copy link
Contributor

bader commented Nov 30, 2021

/verify with intel/llvm-test-suite#590

@bader bader self-requested a review November 30, 2021 12:41
@t4c1 t4c1 changed the title from "[SYCL][CUDA][libclc] Fixes atomics for bellow sm_60" to "[SYCL][CUDA][libclc] Fixes atomics for below sm_60" Nov 30, 2021
@bader bader merged commit 00f43b3 into intel:sycl Dec 2, 2021
@t4c1 t4c1 deleted the fix_ptx_atomics_sm_50 branch March 15, 2022 08:51
Development

Successfully merging this pull request may close these issues.

[SYCL][CUDA] Fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.shared.i.cta
