Clang crashes since last pulldowns when building shared objects #4294

Closed
Michoumichmich opened this issue Aug 9, 2021 · 9 comments · Fixed by #4409
Labels: bug (Something isn't working), compiler (Compiler related issue), cuda (CUDA back-end)

Comments

@Michoumichmich
Contributor

Michoumichmich commented Aug 9, 2021

Describe the bug
Since last week I have been experiencing crashes when building SYCL projects. All the crashes happen when linking the SYCL back-end of the library (ZFP) into a shared object.

When targeting the CUDA back-end and building with -Og, I get the following crash:

While deleting: %"class._ZTSN2cl4sycl5rangeILi3EEE.cl::sycl::range" ()* %
Use still stuck around after Def is destroyed:i8* bitcast (%"class._ZTSN2cl4sycl5rangeILi3EEE.cl::sycl::range" ()* <badref> to i8*)
Use still stuck around after Def is destroyed:  %call = call %"class._ZTSN2cl4sycl5rangeILi3EEE.cl::sycl::range" <badref>() #25, !dbg !58
clang-14: /home/michel/sycl_workspace/llvm/llvm/lib/IR/Value.cpp:103: llvm::Value::~Value(): Assertion `materialized_use_empty() && "Uses remain when a value is destroyed!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/michel/sycl_workspace/deploy/bin/clang-14 -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -sycl-std=2020 -fsycl-std-layout-kernel-params -S -disable-free -main-file-name zfp.c.o -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=all -ffp-contract=on -fno-rounding-math -fno-verbose-asm -no-integrated-as -aux-target-cpu x86-64 -internal-isystem /home/michel/sycl_workspace/deploy/bin/../include/sycl -internal-isystem /home/michel/sycl_workspace/deploy/bin/../include -mlink-builtin-bitcode /home/michel/sycl_workspace/deploy/lib/clang/14.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc -mlink-builtin-bitcode /usr/local/cuda-11.4/nvvm/libdevice/libdevice.10.bc -target-feature +ptx72 -target-sdk-version=11.2 -target-cpu sm_75 -mllvm -treat-scalable-fixed-error-as-warning -debug-info-kind=line-directives-only -dwarf-version=2 -debugger-tuning=gdb -fno-dwarf-directory-asm -v -resource-dir /home/michel/sycl_workspace/deploy/lib/clang/14.0.0 -Og -Wall -Wextra -Wshadow -Wdouble-promotion -Winit-self -Wuninitialized -Wmissing-declarations -Woverloaded-virtual -Wall -Wextra -Wcast-align -Wcast-qual -Wctor-dtor-privacy -Wdisabled-optimization -Wformat=2 -Winit-self -Wmissing-declarations -Wmissing-include-dirs -Woverloaded-virtual -Wredundant-decls -Wshadow -Wsign-conversion -Wsign-promo -Wstrict-overflow=5 -Wno-c++20-extensions -Wno-undef -Wno-unused -Wno-unused-parameter -Wno-unknown-cuda-version -pedantic -std=c++20 -fdebug-compilation-dir=/tmp/tmp.QgqHvAUwXa/cmake-build-debug/src -ferror-limit 19 -fgnuc-version=4.2.1 -fno-implicit-modules -fcolor-diagnostics -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/zfp-f738b8-d50538.s -x ir /tmp/zfp-350792.bc
1.	Code generation
2.	Running pass 'Add implicit SYCL global offset' on module '/tmp/zfp-350792.bc'.

When building for release with -O2, the previous error disappears and I get this one instead:

clang-14: /home/michel/sycl_workspace/llvm/llvm/lib/Target/NVPTX/SYCL/LocalAccessorToSharedMemory.cpp:206: llvm::Function *(anonymous namespace)::LocalAccessorToSharedMemory::ProcessFunction(llvm::Module &, llvm::Function *): Assertion `F->use_empty()' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/michel/sycl_workspace/deploy/bin/clang-14 -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -sycl-std=2020 -fsycl-std-layout-kernel-params -S -disable-free -main-file-name zfp.c.o -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=all -ffp-contract=on -fno-rounding-math -fno-verbose-asm -no-integrated-as -aux-target-cpu x86-64 -internal-isystem /home/michel/sycl_workspace/deploy/bin/../include/sycl -internal-isystem /home/michel/sycl_workspace/deploy/bin/../include -mlink-builtin-bitcode /home/michel/sycl_workspace/deploy/lib/clang/14.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc -mlink-builtin-bitcode /usr/local/cuda-11.4/nvvm/libdevice/libdevice.10.bc -target-feature +ptx72 -target-sdk-version=11.2 -target-cpu sm_75 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fno-dwarf-directory-asm -resource-dir /home/michel/sycl_workspace/deploy/lib/clang/14.0.0 -O3 -Wno-unknown-cuda-version -std=c++20 -fdebug-compilation-dir=/tmp/tmp.QgqHvAUwXa/cmake-build-release/src -ferror-limit 19 -fgnuc-version=4.2.1 -fno-implicit-modules -fcolor-diagnostics -vectorize-loops -vectorize-slp -o /tmp/zfp-f9854a-16efb3.s -x ir /tmp/zfp-4c499c.bc
1.	Code generation
2.	Running pass 'localaccessortosharedmemory' on module '/tmp/zfp-4c499c.bc'.

On the SPIR-V back-end (spir64_x86_64), the use of a right funnel shifter:

inline uint32_t funnelshift_r(uint32_t lo, uint32_t hi, uint32_t shift) {
        if (shift == 0) return lo; // To avoid shifting by 32
        return (lo >> shift % 31) | (hi << (32 - (shift % 31)));
}

now results in the following error:

InvalidFunctionCall: Unexpected llvm intrinsic:
 llvm.fshr.i32

Replacing the shifter implementation with return 0 allows the code to compile on the SPIR-V back-end without any of the previously mentioned errors, but the behaviour is then obviously wrong.
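
For reference, the semantics of a 32-bit right funnel shift (as the LLVM language reference describes them for llvm.fshr.i32) can be sketched by widening to 64 bits, which never shifts a 32-bit value by 32. This is only an illustrative sketch: the name funnelshift_r_ref is hypothetical, and whether this form avoids the llvm.fshr.i32 error on the SPIR-V back-end has not been checked.

```cpp
#include <cstdint>

// Hypothetical reference helper, not taken from the ZFP sources.
// Concatenates hi:lo into one 64-bit value, shifts right by (shift mod 32),
// and keeps the low 32 bits.
inline uint32_t funnelshift_r_ref(uint32_t lo, uint32_t hi, uint32_t shift) {
    const uint64_t wide = (static_cast<uint64_t>(hi) << 32) | lo;
    return static_cast<uint32_t>(wide >> (shift & 31u));
}
```

With this formulation a shift of 0 or 32 returns lo unchanged, so no special case is needed.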

Everything compiles fine with older versions such as 8b56cbb on all the targets.

** Reproducer **
I wasn't able to write a reproducer. I'm compiling with -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=sm_75. I can link the source code with build instructions if needed.

Environment:

  • OS: Linux RHEL 8.4
  • Target device and vendor: Intel CPU and Nvidia CUDA
  • DPC++ version: 69e78f8
Michoumichmich added the bug label on Aug 9, 2021
bader added the cuda label on Aug 10, 2021
AerialMantis added the compiler label on Aug 23, 2021
@AidanBeltonS
Contributor

I have been looking at the issue:

To Replicate:

Set up DPC++

zfp build and run instructions:

Commands to build and run:
$ mkdir build
$ cd build
$ CXX=clang++ cmake .. -DZFP_WITH_SYCL=ON -DBUILD_TESTING=OFF
$ cmake --build . --config Release

Error:
PR #3735 appears to be introducing the error.
It occurs when linking libraries; the same code compiles fine when built directly.

The kernel throwing the error is in src/sycl_zfp/variable.hpp:

auto kernel = [=]<size_t tile_size, size_t num_tiles>() mutable {

            using kernel_name = chunk_process_launch_kernel<tile_size, num_tiles>;

            uint max_blocks = 0;
            size_t shmem_count = (2 * num_tiles * maxpad32 + 2);
            cudaOccupancyMaxActiveBlocksPerMultiprocessor(&max_blocks, concat_bitstreams_chunk<tile_size, num_tiles>, tile_size * num_tiles, shmem_count);
            max_blocks *= num_sm;
            max_blocks = std::min(nstream_chunk, max_blocks);
            sycl::range<2> threads(num_tiles, tile_size);
            sycl::range<2> grid_dim(1, max_blocks);
            sycl::nd_range<2> kernel_parameters(threads * grid_dim, threads);

            auto barrier = nd_range_barrier<2>::make_barrier(q, kernel_parameters);

            q.submit([&](sycl::handler &cgh) {
                sycl::accessor<uint, 1, sycl::access::mode::read_write, sycl::target::local> sm_in(shmem_count, cgh);
                sycl::accessor<uint, 1, sycl::access::mode::read_write, sycl::target::local> sm_length(num_tiles, cgh);
                cgh.parallel_for<kernel_name>(kernel_parameters, [=](sycl::nd_item<2> it) {
                    auto sm_in_ptr = sm_in.get_pointer();
                    auto sm_length_ptr = sm_length.get_pointer();
                    concat_bitstreams_chunk<tile_size, num_tiles>(it, barrier, streams, chunk_offsets, first, nstream_chunk, last_chunk, nbitsmax, maxpad32, sm_in_ptr, sm_length_ptr);
                });
            }).wait();
        };

Failing build command (Simplified for readability)

/home/aidanbelton/llvm/build/bin/clang++ -fPIC -O3 -DNDEBUG -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Wno-unknown-cuda-version -Xsycl-target-backend=nvptx64-nvidia-cuda --cuda-gpu-arch=sm_75 -DSYCL_USE_NATIVE_FP_ATOMICS -sycl-std=2020 -std=c++20 -fsycl-unnamed-lambda -shared -Wl,-soname,libzfp.so.0 -o ../lib/libzfp.so.0.5.5 CMakeFiles/zfp.dir/zfp.c.o CMakeFiles/zfp.dir/bitstream.c.o ../lib/libsyclZFP_static_lib.a -lm /usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so -lpthread

Using -### narrowed the problem down to broken LLVM IR produced after the sycl-post-link step.

Broken passes:

"/home/aidanbelton/llvm/build/bin/sycl-post-link" "-split=auto" "-emit-program-metadata" "-symbols" "-split-esimd" "-lower-esimd" "-O3" "-spec-const=default" "-o" "/tmp/zfp-596861.table" "/tmp/zfp-4bd185.bc"
 
"/home/aidanbelton/llvm/build/bin/file-table-tform" "-copy_single_file=Code,0" "-o" "/tmp/zfp-0bd324.bc" "/tmp/zfp-596861.table"  // Generates .bc file this is

 "/home/aidanbelton/llvm/build/bin/clang-13" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-fsycl-is-device" "-fdeclare-spirv-builtins" "-sycl-std=2020" "-fsycl-std-layout-kernel-params" "-S" "-disable-free" "-main-file-name" "zfp.c.o" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-fno-verbose-asm" "-no-integrated-as" "-aux-target-cpu" "x86-64" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include/sycl" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include" "-mlink-builtin-bitcode" "/home/aidanbelton/llvm/build/lib/clang/13.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc" "-mlink-builtin-bitcode" "/usr/local/cuda/nvvm/libdevice/libdevice.10.bc" "-target-feature" "+ptx65" "-target-sdk-version=10.2" "-target-cpu" "sm_75" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/aidanbelton/llvm/build/lib/clang/13.0.0" "-O3" "-Wno-unknown-cuda-version" "-std=c++20" "-fno-dwarf-directory-asm" "-fdebug-compilation-dir=/home/aidanbelton/zfp/build/src" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-vectorize-loops" "-vectorize-slp" "-o" "/tmp/zfp-83dc4c.s" "-x" "ir" "/tmp/zfp-0bd324.bc"

Working passes:

"/home/aidanbelton/llvm/build/bin/sycl-post-link" "-split=auto" "-ir-output-only" "-O3" "-spec-const=default" "-o" "/tmp/zfp-d44ed4.bc" "/tmp/zfp-2f7c8b.bc"

 "/home/aidanbelton/llvm/build/bin/clang-13" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-fsycl-is-device" "-fdeclare-spirv-builtins" "-sycl-std=2020" "-fsycl-std-layout-kernel-params" "-S" "-disable-free" "-main-file-name" "zfp.c.o" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-fno-verbose-asm" "-no-integrated-as" "-aux-target-cpu" "x86-64" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include/sycl" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include" "-mlink-builtin-bitcode" "/home/aidanbelton/llvm/build/lib/clang/13.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc" "-mlink-builtin-bitcode" "/usr/local/cuda/nvvm/libdevice/libdevice.10.bc" "-target-feature" "+ptx65" "-target-sdk-version=10.2" "-target-cpu" "sm_75" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/aidanbelton/llvm/build/lib/clang/13.0.0" "-O3" "-Wno-unknown-cuda-version" "-std=c++20" "-fno-dwarf-directory-asm" "-fdebug-compilation-dir=/home/aidanbelton/zfp/build/src" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-o" "/tmp/zfp-38a39b.s" "-x" "ir" "/tmp/zfp-d44ed4.bc"

From looking at the .bc files, it appears that llvm.used is not removed from the code passed downstream when sycl-post-link runs without -ir-output-only.
To transport program metadata, sycl-post-link needs to output a table linking to the code file, so -ir-output-only is not appropriate here.
Although sycl-post-link does remove llvm.used, the table links to the original .bc input file. This can be solved by updating the code file so that it reflects the removal of llvm.used.
A fix will be proposed.
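
The mismatch described above can be illustrated with a toy model. This is a sketch under loud assumptions: ToyModule, ToyFiles, post_link_stale and post_link_updated are all hypothetical names, the "file system" is just a map, and none of this is the actual sycl-post-link code or the content of the proposed fix.

```cpp
#include <map>
#include <string>

// Toy stand-in for a bitcode file: only tracks whether llvm.used is present.
struct ToyModule {
    bool has_llvm_used;
};

// Hypothetical in-memory "file system": path -> module contents.
using ToyFiles = std::map<std::string, ToyModule>;

// Behaviour as described in the analysis: llvm.used is stripped from the
// in-memory copy, but the table's Code entry still names the input file.
inline std::string post_link_stale(ToyFiles &files, const std::string &input) {
    ToyModule processed = files.at(input);
    processed.has_llvm_used = false;  // cleanup never reaches a file on disk
    (void)processed;
    return input;                     // table row: Code = original input
}

// Direction of the fix: write the processed module and reference that file.
inline std::string post_link_updated(ToyFiles &files, const std::string &input) {
    ToyModule processed = files.at(input);
    processed.has_llvm_used = false;
    const std::string output = input + ".processed";  // hypothetical naming
    files[output] = processed;
    return output;                    // table row: Code = processed file
}
```

In the first variant, whatever consumes the table's Code entry still sees llvm.used; in the second it does not.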

@Michoumichmich
Contributor Author

That's awesome @AidanBeltonS, thanks a lot!

@AidanBeltonS
Contributor

Proposed fix: #4409

@bader
Contributor

bader commented Aug 26, 2021

@AidanBeltonS, thanks a lot for the detailed analysis.

> PR #3735 appears to be introducing the error.

@steffenlarsen, FYI.

@steffenlarsen
Contributor

@AidanBeltonS Good find! I'm not fully convinced that the fix should be in sycl-post-link rather than the driver, but one thing gave me pause; how old is the DPC++ build you're using? The need for -copy_single_file=Code,0 was superseded by #4107. It should not be used by the driver anymore.

Also, would you mind adding the full output of -###, just so I have the full context?

@steffenlarsen
Contributor

Also ping @Naghasan just in case.

@Michoumichmich
Contributor Author

Michoumichmich commented Aug 26, 2021

I tested the fix: ZFP builds fine and works now! Thanks again. I'm not sure about the funnel shifter, though.

@AidanBeltonS
Contributor

AidanBeltonS commented Aug 26, 2021

> @AidanBeltonS Good find! I'm not fully convinced that the fix should be in sycl-post-link rather than the driver, but one thing gave me pause; how old is the DPC++ build you're using? The need for -copy_single_file=Code,0 was superseded by #4107. It should not be used by the driver anymore.
>
> Also, would you mind adding the full output of -###, just so I have the full context?

My rationale for the fix being in sycl-post-link is that, depending on the ir-output-only flag, it outputs two different versions of the code: one processed and one original.
That being said, I have no objections to the fix being in the driver.

The output of -### in the above text is with PR #3735 and without PR #4107; this was to simplify the debugging process.
I have tested the fix with the CUDA module splitting PR and it works, so it has been tested on an up-to-date build.

I have added the three -### outputs: with PR #4107, with PR #3735, and before PR #3735.

With 4107
clang version 14.0.0 (https://github.com/AidanBeltonS/llvm 510cac8d27da5f2748a9092e5b12af14b5504fa3)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/aidanbelton/llvm/build/bin
"clang-offload-bundler" "-type=ao" "-targets=host-x86_64-unknown-linux-gnu" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=o" "-targets=sycl-spir64-unknown-unknown-sycldevice" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-check-section" 
"clang-offload-bundler" "-type=o" "-targets=sycl-spir64-unknown-unknown-sycldevice" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-spir64-unknown-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr_emu-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=host-x86_64-unknown-linux-gnu" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-outputs=/tmp/zfp-75f872.o,/tmp/zfp-2773d3.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-outputs=/tmp/bitstream-12494d.o,/tmp/bitstream-f50e69.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-outputs=/tmp/zfp-f66894.o,/tmp/zfp-10a081.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-outputs=/tmp/bitstream-2a9ae6.o,/tmp/bitstream-df818d.o" "-unbundle" "-allow-missing-bundles"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-shared" "-o" "/tmp/-f72f58.out" "/usr/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o" "-L/usr/lib/gcc/x86_64-linux-gnu/7.5.0" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/home/aidanbelton/llvm/build/bin/../lib" "-L/lib" "-L/usr/lib" "-soname" "libzfp.so.0" "/tmp/zfp-f66894.o" "/tmp/bitstream-2a9ae6.o" "../lib/libsyclZFP_static_lib.a" "-lm" "/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so" "-lpthread" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lsycl" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o" "/usr/lib/x86_64-linux-gnu/crtn.o"
"/home/aidanbelton/llvm/build/bin/clang-offload-deps" "-targets=sycl-nvptx64-nvidia-cuda" "-outputs=/tmp/-84ffd8.bc" "/tmp/-f72f58.out"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=a" "-targets=sycl-nvptx64-nvidia-cuda" "-inputs=../lib/libsyclZFP_static_lib.a" "-outputs=/tmp/libsyclZFP_static_lib-3e8de1.a" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/llvm-link" "/tmp/zfp-2773d3.o" "/tmp/bitstream-f50e69.o" "/tmp/-84ffd8.bc" "-o" "/tmp/nvptx64-link-e88790.bc" "--suppress-warnings"
"/home/aidanbelton/llvm/build/bin/llvm-link" "--only-needed" "/tmp/nvptx64-link-e88790.bc" "/tmp/libsyclZFP_static_lib-3e8de1.a" "-o" "/tmp/zfp-f1b8b9.bc" "--suppress-warnings"
"/home/aidanbelton/llvm/build/bin/sycl-post-link" "-split=auto" "-emit-program-metadata" "-symbols" "-split-esimd" "-lower-esimd" "-O3" "-spec-const=default" "-o" "/tmp/zfp-48867d.bc" "/tmp/zfp-f1b8b9.bc"
"/home/aidanbelton/llvm/build/bin/file-table-tform" "-extract=Code" "-drop_titles" "-o" "/tmp/zfp-f02eff.bc" "/tmp/zfp-48867d.bc"
"/home/aidanbelton/llvm/build/bin/llvm-foreach" "--out-ext=s" "--in-file-list=/tmp/zfp-f02eff.bc" "--in-replace=/tmp/zfp-f02eff.bc" "--out-file-list=/tmp/zfp-62de0b.s" "--out-replace=/tmp/zfp-62de0b.s" "--" "/home/aidanbelton/llvm/build/bin/clang-14" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-fsycl-is-device" "-fdeclare-spirv-builtins" "-sycl-std=2020" "-fsycl-std-layout-kernel-params" "-S" "-disable-free" "-main-file-name" "zfp.c.o" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-fno-verbose-asm" "-no-integrated-as" "-aux-target-cpu" "x86-64" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include/sycl" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include" "-mlink-builtin-bitcode" "/home/aidanbelton/llvm/build/lib/clang/14.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc" "-mlink-builtin-bitcode" "/usr/local/cuda/nvvm/libdevice/libdevice.10.bc" "-target-feature" "+ptx65" "-target-sdk-version=10.2" "-target-cpu" "sm_75" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-fno-dwarf-directory-asm" "-resource-dir" "/home/aidanbelton/llvm/build/lib/clang/14.0.0" "-O3" "-Wno-unknown-cuda-version" "-std=c++20" "-fdebug-compilation-dir=/home/aidanbelton/zfp/build/src" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-o" "/tmp/zfp-62de0b.s" "-x" "ir" "/tmp/zfp-f02eff.bc"
"/home/aidanbelton/llvm/build/bin/llvm-foreach" "--out-ext=o" "--in-file-list=/tmp/zfp-62de0b.s" "--in-replace=/tmp/zfp-62de0b.s" "--out-file-list=/tmp/zfp-ffcb77.o" "--out-replace=/tmp/zfp-ffcb77.o" "--" "/usr/local/cuda/bin/ptxas" "-m64" "-O3" "--gpu-name" "sm_75" "--output-file" "/tmp/zfp-ffcb77.o" "/tmp/zfp-62de0b.s"
"/home/aidanbelton/llvm/build/bin/llvm-foreach" "--out-ext=fatbin" "--in-file-list=/tmp/zfp-62de0b.s" "--in-replace=/tmp/zfp-62de0b.s" "--in-file-list=/tmp/zfp-ffcb77.o" "--in-replace=/tmp/zfp-ffcb77.o" "--out-file-list=/tmp/zfp-45fdc9.fatbin" "--out-replace=/tmp/zfp-45fdc9.fatbin" "--" "/usr/local/cuda/bin/fatbinary" "-64" "--create" "/tmp/zfp-45fdc9.fatbin" "--image=profile=compute_75,file=/tmp/zfp-62de0b.s" "--image=profile=sm_75,file=/tmp/zfp-ffcb77.o"
"/home/aidanbelton/llvm/build/bin/file-table-tform" "-replace=Code,Code" "-o" "/tmp/zfp-d01579.table" "/tmp/zfp-48867d.bc" "/tmp/zfp-45fdc9.fatbin"
"/home/aidanbelton/llvm/build/bin/clang-offload-wrapper" "-o=/tmp/wrapper-4aa18a.bc" "-host=x86_64-unknown-linux-gnu" "-compile-opts=--cuda-gpu-arch=sm_75" "-target=nvptx64" "-kind=sycl" "-batch" "/tmp/zfp-d01579.table"
"/home/aidanbelton/llvm/build/bin/llc" "-filetype=obj" "-o" "/tmp/libzfp-369548.o" "/tmp/wrapper-4aa18a.bc" "-relocation-model=pic"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-shared" "-o" "../lib/libzfp.so.0.5.5" "/usr/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o" "-L/usr/lib/gcc/x86_64-linux-gnu/7.5.0" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/home/aidanbelton/llvm/build/bin/../lib" "-L/lib" "-L/usr/lib" "-soname" "libzfp.so.0" "/tmp/zfp-f66894.o" "/tmp/bitstream-2a9ae6.o" "../lib/libsyclZFP_static_lib.a" "-lm" "/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so" "-lpthread" "/tmp/libzfp-369548.o" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lsycl" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o" "/usr/lib/x86_64-linux-gnu/crtn.o"
With 3735
clang version 13.0.0 (https://github.com/AidanBeltonS/llvm a8fe4a5cea75fdb5efc5ef1faa347976e1e22ed8)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/aidanbelton/llvm/build/bin
"clang-offload-bundler" "-type=ao" "-targets=host-x86_64-unknown-linux-gnu" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx_emu-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr_emu-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=host-x86_64-unknown-linux-gnu" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-outputs=/tmp/zfp-88b22f.o,/tmp/zfp-39662b.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-outputs=/tmp/bitstream-89a704.o,/tmp/bitstream-d89d36.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-outputs=/tmp/zfp-5a7759.o,/tmp/zfp-ad16f4.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-outputs=/tmp/bitstream-53dc9f.o,/tmp/bitstream-718288.o" "-unbundle" "-allow-missing-bundles"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-shared" "-o" "/tmp/-660f33.out" "/usr/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o" "-L/usr/lib/gcc/x86_64-linux-gnu/7.5.0" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/home/aidanbelton/llvm/build/bin/../lib" "-L/lib" "-L/usr/lib" "-soname" "libzfp.so.0" "/tmp/zfp-5a7759.o" "/tmp/bitstream-53dc9f.o" "../lib/libsyclZFP_static_lib.a" "-lm" "/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so" "-lpthread" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lsycl" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o" "/usr/lib/x86_64-linux-gnu/crtn.o"
"/home/aidanbelton/llvm/build/bin/clang-offload-deps" "-targets=sycl-nvptx64-nvidia-cuda" "-outputs=/tmp/-58edf3.bc" "/tmp/-660f33.out"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=a" "-targets=sycl-nvptx64-nvidia-cuda" "-inputs=../lib/libsyclZFP_static_lib.a" "-outputs=/tmp/libsyclZFP_static_lib-6918f0.a" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/llvm-link" "/tmp/zfp-39662b.o" "/tmp/bitstream-d89d36.o" "/tmp/-58edf3.bc" "-o" "/tmp/nvptx64-link-88e0a6.bc" "--suppress-warnings"
"/home/aidanbelton/llvm/build/bin/llvm-link" "--only-needed" "/tmp/nvptx64-link-88e0a6.bc" "/tmp/libsyclZFP_static_lib-6918f0.a" "-o" "/tmp/zfp-c5391c.bc" "--suppress-warnings"
"/home/aidanbelton/llvm/build/bin/sycl-post-link" "-split=auto" "-emit-program-metadata" "-symbols" "-split-esimd" "-lower-esimd" "-O3" "-spec-const=default" "-o" "/tmp/zfp-80c8ff.table" "/tmp/zfp-c5391c.bc"
"/home/aidanbelton/llvm/build/bin/file-table-tform" "-copy_single_file=Code,0" "-o" "/tmp/zfp-c57a26.bc" "/tmp/zfp-80c8ff.table"
"/home/aidanbelton/llvm/build/bin/clang-14" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-fsycl-is-device" "-fdeclare-spirv-builtins" "-sycl-std=2020" "-fsycl-std-layout-kernel-params" "-S" "-disable-free" "-main-file-name" "zfp.c.o" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-fno-verbose-asm" "-no-integrated-as" "-aux-target-cpu" "x86-64" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include/sycl" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include" "-mlink-builtin-bitcode" "/home/aidanbelton/llvm/build/lib/clang/13.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc" "-mlink-builtin-bitcode" "/usr/local/cuda/nvvm/libdevice/libdevice.10.bc" "-target-feature" "+ptx65" "-target-sdk-version=10.2" "-target-cpu" "sm_75" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/aidanbelton/llvm/build/lib/clang/13.0.0" "-O3" "-Wno-unknown-cuda-version" "-std=c++20" "-fno-dwarf-directory-asm" "-fdebug-compilation-dir=/home/aidanbelton/zfp/build/src" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-o" "/tmp/zfp-d80e0f.s" "-x" "ir" "/tmp/zfp-c57a26.bc"
"/usr/local/cuda/bin/ptxas" "-m64" "-O3" "--gpu-name" "sm_75" "--output-file" "/tmp/zfp-fc4255.o" "/tmp/zfp-d80e0f.s"
"/usr/local/cuda/bin/fatbinary" "-64" "--create" "/tmp/zfp-fdedd8.fatbin" "--image=profile=compute_75,file=/tmp/zfp-d80e0f.s" "--image=profile=sm_75,file=/tmp/zfp-fc4255.o"
"/home/aidanbelton/llvm/build/bin/file-table-tform" "-replace_cell=Code,0" "-o" "/tmp/zfp-3590e8.table" "/tmp/zfp-80c8ff.table" "/tmp/zfp-fdedd8.fatbin"
"/home/aidanbelton/llvm/build/bin/clang-offload-wrapper" "-o=/tmp/wrapper-d18d08.bc" "-host=x86_64-unknown-linux-gnu" "-compile-opts=--cuda-gpu-arch=sm_75" "-target=nvptx64" "-kind=sycl" "-batch" "/tmp/zfp-3590e8.table"
"/home/aidanbelton/llvm/build/bin/llc" "-filetype=obj" "-o" "/tmp/libzfp-fad74f.o" "/tmp/wrapper-d18d08.bc" "-relocation-model=pic"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-shared" "-o" "../lib/libzfp.so.0.5.5" "/usr/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o" "-L/usr/lib/gcc/x86_64-linux-gnu/7.5.0" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/home/aidanbelton/llvm/build/bin/../lib" "-L/lib" "-L/usr/lib" "-soname" "libzfp.so.0" "/tmp/zfp-5a7759.o" "/tmp/bitstream-53dc9f.o" "../lib/libsyclZFP_static_lib.a" "-lm" "/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so" "-lpthread" "/tmp/libzfp-fad74f.o" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lsycl" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o" "/usr/lib/x86_64-linux-gnu/crtn.o"
Without 3735
clang version 13.0.0 (https://github.com/AidanBeltonS/llvm f7aa2bf3677e61c883e69fe0912eefeea0e1fc45)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/aidanbelton/llvm/build/bin
"clang-offload-bundler" "-type=ao" "-targets=host-x86_64-unknown-linux-gnu" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocx_emu-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=sycl-fpga_aocr_emu-intel-unknown-sycldevice" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"clang-offload-bundler" "-type=ao" "-targets=host-x86_64-unknown-linux-gnu" "-inputs=../lib/libsyclZFP_static_lib.a" "-check-section" 
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-outputs=/tmp/zfp-bf4d3b.o,/tmp/zfp-bbfa80.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-outputs=/tmp/bitstream-d3a413.o,/tmp/bitstream-f482b7.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/zfp.c.o" "-outputs=/tmp/zfp-e59399.o,/tmp/zfp-958d2b.o" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda" "-inputs=CMakeFiles/zfp.dir/bitstream.c.o" "-outputs=/tmp/bitstream-9e788f.o,/tmp/bitstream-8533a4.o" "-unbundle" "-allow-missing-bundles"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-shared" "-o" "/tmp/-698eb4.out" "/usr/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o" "-L/usr/lib/gcc/x86_64-linux-gnu/7.5.0" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/home/aidanbelton/llvm/build/bin/../lib" "-L/lib" "-L/usr/lib" "-soname" "libzfp.so.0" "/tmp/zfp-e59399.o" "/tmp/bitstream-9e788f.o" "../lib/libsyclZFP_static_lib.a" "-lm" "/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so" "-lpthread" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lsycl" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o" "/usr/lib/x86_64-linux-gnu/crtn.o"
"/home/aidanbelton/llvm/build/bin/clang-offload-deps" "-targets=sycl-nvptx64-nvidia-cuda" "-outputs=/tmp/-599fd3.bc" "/tmp/-698eb4.out"
"/home/aidanbelton/llvm/build/bin/clang-offload-bundler" "-type=a" "-targets=sycl-nvptx64-nvidia-cuda" "-inputs=../lib/libsyclZFP_static_lib.a" "-outputs=/tmp/libsyclZFP_static_lib-86803a.a" "-unbundle" "-allow-missing-bundles"
"/home/aidanbelton/llvm/build/bin/llvm-link" "/tmp/zfp-bbfa80.o" "/tmp/bitstream-f482b7.o" "/tmp/-599fd3.bc" "-o" "/tmp/nvptx64-link-2d6500.bc" "--suppress-warnings"
"/home/aidanbelton/llvm/build/bin/llvm-link" "--only-needed" "/tmp/nvptx64-link-2d6500.bc" "/tmp/libsyclZFP_static_lib-86803a.a" "-o" "/tmp/zfp-cc8350.bc" "--suppress-warnings"
"/home/aidanbelton/llvm/build/bin/sycl-post-link" "-split=auto" "-ir-output-only" "-O3" "-spec-const=default" "-o" "/tmp/zfp-a69244.bc" "/tmp/zfp-cc8350.bc"
"/home/aidanbelton/llvm/build/bin/clang-14" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-fsycl-is-device" "-fdeclare-spirv-builtins" "-sycl-std=2020" "-fsycl-std-layout-kernel-params" "-S" "-disable-free" "-main-file-name" "zfp.c.o" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-fno-verbose-asm" "-no-integrated-as" "-aux-target-cpu" "x86-64" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include/sycl" "-internal-isystem" "/home/aidanbelton/llvm/build/bin/../include" "-mlink-builtin-bitcode" "/home/aidanbelton/llvm/build/lib/clang/13.0.0/../../clc/libspirv-nvptx64--nvidiacl.bc" "-mlink-builtin-bitcode" "/usr/local/cuda/nvvm/libdevice/libdevice.10.bc" "-target-feature" "+ptx65" "-target-sdk-version=10.2" "-target-cpu" "sm_75" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "/home/aidanbelton/llvm/build/lib/clang/13.0.0" "-O3" "-Wno-unknown-cuda-version" "-std=c++20" "-fno-dwarf-directory-asm" "-fdebug-compilation-dir=/home/aidanbelton/zfp/build/src" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-o" "/tmp/zfp-6ad09a.s" "-x" "ir" "/tmp/zfp-a69244.bc"
"/usr/local/cuda/bin/ptxas" "-m64" "-O3" "--gpu-name" "sm_75" "--output-file" "/tmp/zfp-310b14.o" "/tmp/zfp-6ad09a.s"
"/usr/local/cuda/bin/fatbinary" "-64" "--create" "/tmp/zfp-5d0e05.fatbin" "--image=profile=compute_75,file=/tmp/zfp-6ad09a.s" "--image=profile=sm_75,file=/tmp/zfp-310b14.o"
"/home/aidanbelton/llvm/build/bin/clang-offload-wrapper" "-o=/tmp/wrapper-42ad5e.bc" "-host=x86_64-unknown-linux-gnu" "-compile-opts=--cuda-gpu-arch=sm_75" "-target=nvptx64" "-kind=sycl" "/tmp/zfp-5d0e05.fatbin"
"/home/aidanbelton/llvm/build/bin/llc" "-filetype=obj" "-o" "/tmp/libzfp-e33483.o" "/tmp/wrapper-42ad5e.bc" "-relocation-model=pic"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-shared" "-o" "../lib/libzfp.so.0.5.5" "/usr/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o" "-L/usr/lib/gcc/x86_64-linux-gnu/7.5.0" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/home/aidanbelton/llvm/build/bin/../lib" "-L/lib" "-L/usr/lib" "-soname" "libzfp.so.0" "/tmp/zfp-e59399.o" "/tmp/bitstream-9e788f.o" "../lib/libsyclZFP_static_lib.a" "-lm" "/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so" "-lpthread" "/tmp/libzfp-e33483.o" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lsycl" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o" "/usr/lib/x86_64-linux-gnu/crtn.o"

@steffenlarsen
Contributor

> My rationale for putting the fix in sycl-post-link is that, depending on the `-ir-output-only` flag, the tool outputs one of two different versions of the code: one processed, and one the original.

Okay, I think I understand. Thank you for clarifying. I'll leave it up to those who know the sycl-post-link tool better than I do.

bader pushed a commit that referenced this issue Sep 2, 2021
When using `sycl-post-link`, `llvm.used` should be removed; otherwise it can cause a crash later in the pipeline.
This patch checks whether `llvm.used` was removed and, if it was, prevents the old code from being reused in the table.

This fixes #4294. 
See issue for more details on problem.
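
The decision described in the commit message can be illustrated with a toy model. This is only a sketch in plain Python, not the actual `sycl-post-link` source: `ToyModule`, `remove_llvm_used`, `pick_table_entry`, and the file names are hypothetical, and the `other_changes` flag is an assumed stand-in for whatever other processing the tool may have done. The idea shown is that removing `llvm.used` counts as a modification, so the original input must not be reused in the table.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModule:
    """Hypothetical stand-in for an IR module: it only tracks global names."""
    global_names: set = field(default_factory=set)


def remove_llvm_used(module: ToyModule) -> bool:
    """Drop llvm.used if present and report whether the module changed."""
    if "llvm.used" in module.global_names:
        module.global_names.remove("llvm.used")
        return True
    return False


def pick_table_entry(input_path: str, processed_path: str,
                     module: ToyModule, other_changes: bool = False) -> str:
    """Return the file that should be recorded in the output table.

    The original input is reused only when nothing modified the module.
    Removing llvm.used counts as a modification (assumed model of the fix).
    """
    llvm_used_removed = remove_llvm_used(module)
    if other_changes or llvm_used_removed:
        return processed_path
    return input_path
```

In this model, a module that contained `llvm.used` always yields the processed file, while an untouched module may still reuse the original input.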