Problem compiling example in DPC++ book #3346

Closed
jomoga opened this issue Mar 11, 2021 · 13 comments
Labels
bug Something isn't working cuda CUDA back-end

Comments

@jomoga

jomoga commented Mar 11, 2021

I am attempting to compile the example in Fig 2.7 from the DPC++ book by Reinders et al. Here is the code:

#include <CL/sycl.hpp>
#include <iostream>
using namespace sycl;

int main() {
   // Create queue on whatever default device that the implementation
   // chooses. Implicit use of the default_selector.
   queue Q;

   std::cout << "Selected device: " <<
   Q.get_device().get_info<info::device::name>() << "\n";

   return 0;
}

I compile it with:

clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -fsycl-unnamed-lambda fig-2.7.cpp -o fig-2.7

which gives the following output:

clang-13: /home/joel/sycl_workspace/llvm/llvm/lib/Target/NVPTX/SYCL/LocalAccessorToSharedMemory.cpp:51: virtual bool (anonymous namespace)::LocalAccessorToSharedMemory::runOnModule(llvm::Module &): Assertion `NvvmMetadata && "IR compiled to PTX must have nvvm.annotations"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /home/joel/sycl_workspace/llvm/build/bin/clang-13 -cc1 -triple nvptx64-nvidia-cuda-sycldevice -fsycl -fsycl-is-device -fdeclare-spirv-builtins -aux-triple x86_64-unknown-linux-gnu -Wno-sycl-strict -sycl-std=2020 -S -disable-free -main-file-name fig-2.7.cpp -mrelocation-model static -mframe-pointer=all -fno-rounding-math -fno-verbose-asm -no-integrated-as -aux-target-cpu x86-64 -internal-isystem /home/joel/sycl_workspace/llvm/build/bin/../include/sycl -mlink-builtin-bitcode /home/joel/sycl_workspace/llvm/build/lib/clang/13.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc -mlink-builtin-bitcode /usr/local/cuda-11.1/nvvm/libdevice/libdevice.10.bc -target-feature +ptx71 -target-sdk-version=11.1 -target-cpu sm_50 -fno-split-dwarf-inlining -debugger-tuning=gdb -resource-dir /home/joel/sycl_workspace/llvm/build/lib/clang/13.0.0 -fno-dwarf-directory-asm -fdebug-compilation-dir /home/joel/sycl_test -ferror-limit 19 -fgnuc-version=4.2.1 -fcolor-diagnostics -fsycl-unnamed-lambda -o /tmp/fig-2-825a7c.s -x ir /tmp/fig-2-359f3e.bc

1. Code generation
2. Running pass 'localaccessortosharedmemory' on module '/tmp/fig-2-359f3e.bc'.
 #0 0x0000000001cdac03 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x1cdac03)
 #1 0x0000000001cd8a0e llvm::sys::RunSignalHandlers() (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x1cd8a0e)
 #2 0x0000000001cdb0cf SignalHandler(int) Signals.cpp:0:0
 #3 0x00007fc0f6e1e690 __restore_rt (/lib64/libpthread.so.0+0x13690)
 #4 0x00007fc0f66c6c7b raise (/lib64/libc.so.6+0x41c7b)
 #5 0x00007fc0f66a7548 abort (/lib64/libc.so.6+0x22548)
 #6 0x00007fc0f66a742f _nl_load_domain.cold (/lib64/libc.so.6+0x2242f)
 #7 0x00007fc0f66b7fc2 (/lib64/libc.so.6+0x32fc2)
 #8 0x0000000000d176d8 (anonymous namespace)::LocalAccessorToSharedMemory::runOnModule(llvm::Module&) LocalAccessorToSharedMemory.cpp:0:0
 #9 0x00000000015289f1 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x15289f1)
#10 0x0000000001f6a5a6 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x1f6a5a6)
#11 0x0000000002bcad76 clang::CodeGenAction::ExecuteAction() (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x2bcad76)
#12 0x00000000025ab020 clang::FrontendAction::Execute() (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x25ab020)
#13 0x000000000251c29a clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x251c29a)
#14 0x0000000002663414 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0x2663414)
#15 0x0000000000a0fb6f cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0xa0fb6f)
#16 0x0000000000a0d572 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#17 0x0000000000a0d285 main (/home/joel/sycl_workspace/llvm/build/bin/clang-13+0xa0d285)
#18 0x00007fc0f66a8e6b __libc_start_main (/lib64/libc.so.6+0x23e6b)
#19 0x0000000000a0a21a _start /tmp/glibc-2.30/csu/../sysdeps/x86_64/start.S:122:0
clang-13: error: unable to execute command: Aborted
clang-13: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 13.0.0 (https://github.com/intel/llvm a323be4)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/joel/sycl_workspace/llvm/build/bin
clang-13: note: diagnostic msg: Error generating preprocessed source(s).

Here is the dump from the inxi -Fxxxz command:

System: Kernel: 5.9.12 x86_64 bits: 64 compiler: gcc v: 2.33.1-slack15) Desktop: Xfce 4.12.5 tk: Gtk 2.24.32
info: xfce4-panel wm: xfwm4 dm: startx Distro: Slackware 14.2
Machine: Type: Laptop System: ASUSTeK product: ROG Zephyrus G15 GA502IV_GA502IV v: 1.0 serial:
Mobo: ASUSTeK model: GA502IV v: 1.0 serial: UEFI: American Megatrends v: GA502IV.205 date: 05/25/2020

CPU: Info: 8-Core model: AMD Ryzen 7 4800HS with Radeon Graphics bits: 64 type: MT MCP arch: Zen rev: 1

Graphics: Device-1: NVIDIA TU106M [GeForce RTX 2060 Max-Q] vendor: ASUSTeK driver: nvidia v: 455.45.01 bus ID: 01:00.0
chip ID: 10de:1f12
Device-2: Advanced Micro Devices [AMD/ATI] Renoir vendor: ASUSTeK driver: amdgpu v: kernel bus ID: 05:00.0
chip ID: 1002:1636
Display: server: X.Org 1.20.9 driver: amdgpu resolution: 1920x1080~240Hz s-dpi: 96
OpenGL: renderer: AMD RENOIR (DRM 3.39.0 5.9.12 LLVM 11.0.0) v: 4.6 Mesa 20.2.1 direct render: Yes

This example compiles and runs correctly using the oneAPI dpcpp compiler from Intel.

@jomoga jomoga added the bug Something isn't working label Mar 11, 2021
@bader bader added the cuda CUDA back-end label Mar 12, 2021
@rNoz

rNoz commented Mar 12, 2021

I get the same error with other (custom) code that works correctly with dpcpp from Intel oneAPI. At first I suspected USM, since the error seems related to LocalAccessorToSharedMemory, but the error persists after I removed everything involving USM (using only plain SYCL buffers).

In my case I am using Ubuntu 18 and CUDA 11.1, and simple-sycl-app.cpp works correctly on the NVIDIA device (compute capability 5.0). I built this repo (intel/llvm) just a few days ago.

It fails when linking the library (libapp.so) that uses SYCL. The program is a SAXPY operation (saxpy.cpp) that uses libapp.so. When I compile everything as a single unit (without libraries), it builds and links. Why this behavior?

export DPCPP_HOME=~/sycl_workspace
export PATH=$DPCPP_HOME/llvm/build/bin:$PATH
export LD_LIBRARY_PATH=$DPCPP_HOME/llvm/build/lib:$LD_LIBRARY_PATH
cd build;
cmake ..
...
[ 55%] Building CXX object src/CMakeFiles/app.dir/scheduler.cpp.o
cd /home/ubuntu/application/app/build/src && clang++ -Dapp_EXPORTS  -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -fsycl-unnamed-lambda -fPIC -o CMakeFiles/app.dir/r.cpp.o -c /home/ubuntu/application/app/src/pl.cpp
clang-13: warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11010. Assuming the latest supported version 10.1 [-Wunknown-cuda-version]
[ 66%] Linking CXX shared library libapp.so
cd /home/ubuntu/application/app/build/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/app.dir/link.txt --verbose=1
clang++ -fPIC -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -fsycl-unnamed-lambda -shared -Wl,-soname,libapp.so -o libapp.so CMakeFiles/app.dir/utils.cpp.o CMakeFiles/app.dir/director.cpp.o CMakeFiles/app.dir/unit.cpp.o CMakeFiles/app.dir/r.cpp.o CMakeFiles/app.dir/pl.cpp.o
clang-13: warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11010. Assuming the latest supported version 10.1 [-Wunknown-cuda-version]
clang-13: /home/ubuntu/sycl_workspace/llvm/llvm/lib/Target/NVPTX/SYCL/LocalAccessorToSharedMemory.cpp:51: virtual bool {anonymous}::LocalAccessorToSharedMemory::runOnModule(llvm::Module&): Assertion `NvvmMetadata && "IR compiled to PTX must have nvvm.annotations"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /home/ubuntu/sycl_workspace/llvm/build/bin/clang-13 -cc1 -triple nvptx64-nvidia-cuda-sycldevice -fsycl -fsycl-is-device -fdeclare-spirv-builtins -aux-triple x86_64-unknown-linux-gnu -Wno-sycl-strict -sycl-std=2020 -S -disable-free -main-file-name utils.cpp.o -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=all -fno-rounding-math -fno-verbose-asm -no-integrated-as -aux-target-cpu x86-64 -internal-isystem /home/ubuntu/sycl_workspace/llvm/build/bin/../include/sycl -mlink-builtin-bitcode /home/ubuntu/sycl_workspace/llvm/build/lib/clang/13.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc -mlink-builtin-bitcode /usr/local/cuda/nvvm/libdevice/libdevice.10.bc -target-feature +ptx71 -target-sdk-version=11.1 -target-cpu sm_50 -fno-split-dwarf-inlining -debugger-tuning=gdb -resource-dir /home/ubuntu/sycl_workspace/llvm/build/lib/clang/13.0.0 -fno-dwarf-directory-asm -fdebug-compilation-dir /home/ubuntu/application/app/build/src -ferror-limit 19 -fgnuc-version=4.2.1 -fcolor-diagnostics -fsycl-unnamed-lambda -o /tmp/utils-f85496.s -x ir /tmp/utils-601fc4.bc
1.      Code generation
2.      Running pass 'localaccessortosharedmemory' on module '/tmp/utils-601fc4.bc'.
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEi+0x2c)[0x55b5ec0f45ac]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN4llvm3sys17RunSignalHandlersEv+0x34)[0x55b5ec0f2274]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(+0x20f63e3)[0x55b5ec0f23e3]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7f6864af1980]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f68639ccfb7]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f68639ce921]
/lib/x86_64-linux-gnu/libc.so.6(+0x3048a)[0x7f68639be48a]
/lib/x86_64-linux-gnu/libc.so.6(+0x30502)[0x7f68639be502]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(+0x107636d)[0x55b5eb07236d]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE+0x376)[0x55b5eb904a16]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(+0x23d62ef)[0x55b5ec3d22ef]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN5clang17EmitBackendOutputERNS_17DiagnosticsEngineERKNS_19HeaderSearchOptionsERKNS_14CodeGenOptionsERKNS_13TargetOptionsERKNS_11LangOptionsERKN4llvm10DataLayoutEPNSE_6ModuleENS_13BackendActionESt10unique_ptrINSE_17raw_pwrite_streamESt14default_deleteISM_EE+0x5e5)[0x55b5ec3d3f65]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN5clang13CodeGenAction13ExecuteActionEv+0xac8)[0x55b5ed0395e8]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN5clang14FrontendAction7ExecuteEv+0xe1)[0x55b5ec9fa021]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN5clang16CompilerInstance13ExecuteActionERNS_14FrontendActionE+0x301)[0x55b5ec98e011]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_ZN5clang25ExecuteCompilerInvocationEPNS_16CompilerInstanceE+0xb0a)[0x55b5ecac5eda]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_Z8cc1_mainN4llvm8ArrayRefIPKcEES2_Pv+0x123c)[0x55b5ead285cc]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(+0xd27a89)[0x55b5ead23a89]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(main+0x96c)[0x55b5eac931ec]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f68639afbf7]
/home/ubuntu/sycl_workspace/llvm/build/bin/clang-13(_start+0x2a)[0x55b5ead235ea]
clang-13: error: unable to execute command: Aborted (core dumped)
clang-13: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 13.0.0 (https://github.com/intel/llvm 34b5f42daa0b4dbe96354003f8d8ec0fc3221f4e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/ubuntu/sycl_workspace/llvm/build/bin
clang-13: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
src/CMakeFiles/app.dir/build.make:165: recipe for target 'src/libapp.so' failed
make[3]: *** [src/libapp.so] Error 254
make[3]: Leaving directory '/home/ubuntu/application/app/build'
CMakeFiles/Makefile2:144: recipe for target 'src/CMakeFiles/app.dir/all' failed
make[2]: *** [src/CMakeFiles/app.dir/all] Error 2
make[2]: Leaving directory '/home/ubuntu/application/app/build'
Makefile:105: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory '/home/ubuntu/application/app/build'
Makefile:4: recipe for target 'build' failed
make: *** [build] Error 2
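
A side note on the `-Wunknown-cuda-version` warning in the log above: it reports `CUDA_VERSION=11010`. In cuda.h this macro encodes the toolkit version as major * 1000 + minor * 10, so the value corresponds to CUDA 11.1, which is newer than the 10.1 that this clang build falls back to. A minimal sketch of that decoding follows; `decodeCudaVersion` and `formatCudaVersion` are hypothetical helper names used only for illustration and are not part of clang or the CUDA toolkit.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper types and functions (assumptions for illustration).
// CUDA_VERSION in cuda.h is encoded as major * 1000 + minor * 10.
struct CudaVersion {
  int major_version;
  int minor_version;
};

// Splits a raw CUDA_VERSION value into its major and minor components.
constexpr CudaVersion decodeCudaVersion(int raw) {
  return CudaVersion{raw / 1000, (raw % 1000) / 10};
}

// Formats the decoded version as "major.minor".
inline std::string formatCudaVersion(int raw) {
  const CudaVersion v = decodeCudaVersion(raw);
  return std::to_string(v.major_version) + "." +
         std::to_string(v.minor_version);
}
```

For example, `formatCudaVersion(11010)` yields `"11.1"`, matching the CUDA 11.1 install mentioned in this comment.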

@rNoz

rNoz commented Mar 12, 2021

I am attempting to compile the example in Fig 2.7 from the DPC++ book by Reinders et al. Here is the code:

#include <CL/sycl.hpp>
#include
std::cout << "Selected device: " <<
Q.get_device().get_infoinfo::device::name() << "\n";

If I do std::cout << "Selected device: " << Q.get_device().get_info<sycl::info::device::name>() << "\n"; it works.

@jomoga
Author

jomoga commented Mar 12, 2021

If I add a submit statement to the code it compiles and runs on the intel llvm branch. Here is the modified code:

#include <CL/sycl.hpp>
#include <iostream>
using namespace sycl;

int main() {
   // Create queue on whatever default device that the implementation
   // chooses. Implicit use of the default_selector.
   queue Q;

   std::cout << "Selected device: " <<
     Q.get_device().get_info<sycl::info::device::name>() << "\n";

   // Add this statement to get it to compile/run
   Q.submit([&](handler& h) { h.parallel_for(1, [=](auto& idx) {}); });

   return 0;
}

@zjin-lcf
Contributor

Could you please post the correct code in the first place? I couldn't reproduce your error with the following code.

The output is

Selected device: SYCL host device
 -> Device vendor:

#include <CL/sycl.hpp>
#include <iostream>
using namespace sycl;

int main() {
   // Create queue to use the host device explicitly
   queue Q{ host_selector{} };

   std::cout << "Selected device: " <<
     Q.get_device().get_info<info::device::name>() << "\n";
   std::cout << " -> Device vendor: " <<
     Q.get_device().get_info<info::device::vendor>() << "\n";

   return 0;
}

@jomoga
Author

jomoga commented Mar 18, 2021

If the error can't be reproduced, then it's probably due to some subtle difference in the build environment. I don't have the skill to debug the clang++ compiler directly, but fortunately I have a workaround, as noted above. In addition, almost all SYCL routines have a submit statement, so this isn't really a practical problem.

@zjin-lcf
Contributor

@bader
A user may expect that Device vendor is non-empty. Is that right?

Selected device: SYCL host device
 -> Device vendor:

@bader
Contributor

bader commented Mar 19, 2021

I think the SYCL specification leaves it to the implementation to decide.
@zjin-lcf, what do you think should be returned for the host device?

@zjin-lcf
Contributor

Is 'Device vendor' not applicable or not available for a host selector?
Is 'Device vendor' the same as the one reported when choosing a cpu selector?

@jomoga
Author

jomoga commented Mar 20, 2021

The problem is not that it doesn't return the device vendor. The problem is that it doesn't compile.

@steffenlarsen
Contributor

I have seen this problem before on some systems and not others. It seems to me that the CUDA version has something to do with it, but I haven't had time to investigate any further.

The problem appears to be that some modules are missing the "nvvm.annotations" metadata. The workaround I've used has been to change https://github.com/intel/llvm/blob/sycl/llvm/lib/Target/NVPTX/SYCL/LocalAccessorToSharedMemory.cpp#L51 as follows:

-     assert(NvvmMetadata && "IR compiled to PTX must have nvvm.annotations");
+     if (!NvvmMetadata) return false;

I am reluctant to push this as a solution until further investigation, but I have not encountered any problems while using it as a workaround.
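
To illustrate the shape of that change outside of LLVM, here is a minimal sketch in plain C++. It assumes a toy stand-in for a module (`ToyModule`) and a hypothetical pass function (`runOnToyModule`); neither is the LLVM API, and the "transformation" is only a placeholder. The point it shows is that a module pass returns true only when it modified the module, so returning false when "nvvm.annotations" is absent means "nothing to do" rather than aborting the compiler.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a module (assumption, NOT llvm::Module): named metadata
// is modeled as a map from metadata name to an entry count.
struct ToyModule {
  std::map<std::string, int> namedMetadata;
};

// Hypothetical pass entry point mirroring the workaround's control flow.
// Returns true only if the module was modified.
bool runOnToyModule(ToyModule &m) {
  auto it = m.namedMetadata.find("nvvm.annotations");
  // Before the workaround, a missing entry tripped an assertion here.
  // With the workaround, the pass reports "no change" and returns.
  if (it == m.namedMetadata.end())
    return false;

  // Placeholder for the real rewrite: touch the metadata so that the
  // module counts as modified.
  ++it->second;
  return true;
}
```

A module without the metadata, such as one built from a source file that defines no kernels, passes through untouched in this sketch, while a module that has it is still processed.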

@jomoga
Author

jomoga commented Apr 6, 2021

Thanks for looking into this. Workarounds look like the best approach for now.

@steffenlarsen
Contributor

Looks like #3535 fixed this.

@jomoga
Author

jomoga commented Apr 20, 2021

Yes, this fixes the problem. The Fig 2.7 code in the original post now compiles and runs correctly. Thanks for your help.
