Linking to a static library fails when multiple SYCL targets are used #5330

Closed
al42and opened this issue Jan 17, 2022 · 6 comments
Labels
bug Something isn't working

al42and commented Jan 17, 2022

Describe the bug

Linking to a static library containing SYCL kernels fails with a cryptic error when -fsycl-targets contains several architectures, e.g. -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64.

To Reproduce

Reproducer code archive

It builds a static library libfoo.a containing some SYCL code, and a simple main.cpp file which calls a function from this library. Assumes clang++ and llvm-ar point to IntelLLVM, built with CUDA support.

# If we specify one target when linking, everything works great:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda main.o libfoo.a -o main-cuda
$ make main-cuda && ./main-cuda && echo ok
ok

# Now, let's try to include two targets:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64 main.o libfoo.a -o main-cuda-spir
$ make main-cuda-spir
spirv-to-ir-wrapper: Input file '!<arch>' not found
llvm-foreach: 
spirv-to-ir-wrapper: Input file '/               0           0     0     0       220       `' not found
llvm-foreach: 
make: *** [Makefile:11: main-cuda-spir] Error 1

# Now let's use the same two targets in different order:
# clang++ -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda main.o libfoo.a -o main-spir-cuda
$ make main-spir-cuda
/home/aland/intel-sycl/llvm/build/install/bin/llvm-link: /tmp/libfoo-a6eb86.a:1:1: error: expected top-level entity
/tmp/libfoo-ff1c0c.o
^
/home/aland/intel-sycl/llvm/build/install/bin/llvm-link: error:  loading file '/tmp/libfoo-a6eb86.a'
clang-14: error: sycl-link command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile:14: main-spir-cuda] Error 1

# Not specifying any target leads to no kernels being bundled (despite the static library having them):
# clang++ -fsycl main.o libfoo.a -o main-none
$ make main-none && ./main-none && echo ok
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted (core dumped)
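
A note on the two "Input file ... not found" strings in the main-cuda-spir output: they look like the first two lines of a Unix ar archive, namely the global magic `!<arch>` followed by a 60-byte member header. One reading (not confirmed anywhere in this thread) is that the tool received the archive itself where it expected a text file with one input path per line. The minimal Python sketch below decodes the second string under the common ar header layout. `parse_ar_header` and `ArMemberHeader` are hypothetical names used only for illustration, and the string is rebuilt field by field on the assumption that the log preserved the whitespace exactly.

```python
from dataclasses import dataclass

AR_MAGIC = b"!<arch>\n"   # global header at the start of an ar archive
AR_HEADER_SIZE = 60       # size in bytes of each member header


@dataclass
class ArMemberHeader:
    name: str
    mtime: int
    uid: int
    gid: int
    mode: int   # stored as octal text in the header
    size: int   # size of the member payload in bytes


def parse_ar_header(raw: bytes) -> ArMemberHeader:
    """Decode one 60-byte ar member header (common System V / GNU layout)."""
    if len(raw) != AR_HEADER_SIZE or raw[58:60] != b"`\n":
        raise ValueError("not an ar member header")
    text = raw.decode("ascii")
    return ArMemberHeader(
        name=text[0:16].rstrip(),
        mtime=int(text[16:28]),
        uid=int(text[28:34]),
        gid=int(text[34:40]),
        mode=int(text[40:48], 8),
        size=int(text[48:58]),
    )


# The second "file name" from the error message, rebuilt field by field
# so the column widths are visible (assumption: whitespace was preserved).
logged = (
    "/".ljust(16)      # member name; "/" is the symbol table in GNU ar
    + "0".ljust(12)    # modification time
    + "0".ljust(6)     # owner id
    + "0".ljust(6)     # group id
    + "0".ljust(8)     # file mode
    + "220".ljust(10)  # payload size
    + "`"              # first byte of the two-byte end marker
)

if __name__ == "__main__":
    # splitlines-style reading would have dropped the trailing newline
    print(parse_ar_header((logged + "\n").encode("ascii")))
```

If the decoding succeeds, the "missing file" is really the header of the archive's first member, which supports the reading that the archive bytes were consumed as a list of file names.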

Environment (please complete the following information):

  • OS: Ubuntu Linux 20.04
  • Target device and vendor: NVIDIA GTX1060SUPER (no GPU is actually needed for the problem to be observed, but that's what I have).
  • DPC++ version: clang version 14.0.0 (https://github.com/intel/llvm c878063)
  • Dependencies version: CUDA 11.5

Additional context

Problem appears to be introduced in c878063 (#5251).

al42and added the bug label on Jan 17, 2022

bader commented Jan 18, 2022

> Problem appears to be introduced in c878063 (#5251).

Assigning to @mdtoguchi (the author of #5251).

al42and commented Jan 21, 2022

Behavior changed in clang version 14.0.0 (https://github.com/intel/llvm 02b7301). The bug is not fully resolved, but it seems it migrated to a different place:

# If we specify one target when linking, everything still works great:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda main.o libfoo.a -o main-cuda
$ make main-cuda && ./main-cuda && echo ok
ok

# If we do -fsycl-targets=nvptx64-nvidia-cuda,spir64, the code compiles but crashes at the run time:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64 main.o libfoo.a -o main-cuda-spir
$ make main-cuda-spir && ./main-cuda-spir && echo ok
main-cuda-spir: /home/aland/intel-sycl/llvm/sycl/source/detail/program_manager/program_manager.cpp:1069: void cl::sycl::detail::ProgramManager::addImages(pi_device_binaries): Assertion `KSIdMap[EntriesIt->name] == KSIdIt->second && "Kernel sets are not disjoint"' failed.
Aborted (core dumped)

# But if we change the target order, things work (at least on NVIDIA GPU)
# clang++ -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda main.o libfoo.a -o main-spir-cuda
$ make main-spir-cuda
ok

EDIT: The tested version, 02b7301, includes #5334 mentioned below.

EDIT2: Still the case with version 708a5bf

mdtoguchi commented

Behavior should improve with #5334. After #5251 was merged, static libraries were unbundled inconsistently when multiple targets were used with -fsycl. #5334 changes the device link step that consumes the unbundled archive so that it expects, and links in, the list of unbundled objects instead of receiving them as an archive.
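
To make the archive-versus-object-list distinction concrete, here is a rough Python sketch that sniffs the leading bytes of a file. It is only an illustration and not the driver's actual logic: `classify` is a hypothetical helper, and the ".o suffix" test for an object list is an assumption made for this example. The llvm-link error earlier in the thread fits this picture, since it reports a path (`/tmp/libfoo-ff1c0c.o`) at line 1, column 1 of a file named `.a`.

```python
AR_MAGIC = b"!<arch>\n"               # Unix ar archive
BITCODE_MAGIC = b"BC\xc0\xde"         # raw LLVM bitcode (wrapper format not handled)
SPIRV_MAGIC_LE = b"\x03\x02\x23\x07"  # SPIR-V magic 0x07230203, little-endian


def classify(data: bytes) -> str:
    """Guess what kind of input a device-link step was handed."""
    if data.startswith(AR_MAGIC):
        return "ar-archive"
    if data.startswith(BITCODE_MAGIC):
        return "llvm-bitcode"
    if data.startswith(SPIRV_MAGIC_LE):
        return "spirv"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    # Assumption for this sketch: a list file has one object path per line.
    lines = [line for line in text.splitlines() if line.strip()]
    if lines and all(line.strip().endswith(".o") for line in lines):
        return "object-list"
    return "unknown"


if __name__ == "__main__":
    # Content suggested by the llvm-link diagnostic: a path on the first line.
    print(classify(b"/tmp/libfoo-ff1c0c.o\n"))
    # A real static library starts with the ar magic instead.
    print(classify(AR_MAGIC + b"rest of archive"))
```

A consumer that expects one of these forms and receives the other fails in the ways reported above: an archive read as a list yields nonsense file names, and a list read as IR fails to parse.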

AlexeySachkov commented

# If we do -fsycl-targets=nvptx64-nvidia-cuda,spir64, the code compiles but crashes at the run time:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64 main.o libfoo.a -o main-cuda-spir
$ make main-cuda-spir && ./main-cuda-spir && echo ok
main-cuda-spir: /home/aland/intel-sycl/llvm/sycl/source/detail/program_manager/program_manager.cpp:1069: void cl::sycl::detail::ProgramManager::addImages(pi_device_binaries): Assertion `KSIdMap[EntriesIt->name] == KSIdIt->second && "Kernel sets are not disjoint"' failed.
Aborted (core dumped)

This assertion indicates that you have hit a known limitation of our SYCL runtime. Tagging @sergey-semenov to provide more details about this limitation, whether any workarounds are possible, and whether there are plans to lift it.
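
For readers unfamiliar with the assertion, the following Python sketch models one plausible reading of its text: the runtime keeps a map from kernel name to a kernel-set id, and when an incoming device image starts with an already known kernel, every other kernel in that image must map to the same id. This is a simplified model and not the actual ProgramManager code; `add_image`, the kernel names, and the way images are grouped are all hypothetical, and the thread does not establish which grouping the compiler actually produced.

```python
from typing import Dict, List


def add_image(ks_id_map: Dict[str, int], kernel_names: List[str],
              new_set_id: int) -> int:
    """Register one device image and return the kernel-set id it belongs to."""
    first = kernel_names[0]
    if first in ks_id_map:
        known = ks_id_map[first]
        # Every remaining kernel must already belong to the same set.
        for name in kernel_names[1:]:
            if ks_id_map.get(name) != known:
                raise AssertionError("Kernel sets are not disjoint")
        return known
    # First time these kernels are seen: they form a new set.
    for name in kernel_names:
        ks_id_map[name] = new_set_id
    return new_set_id


if __name__ == "__main__":
    ks_id_map: Dict[str, int] = {}
    # Hypothetical grouping: two images with one kernel each...
    add_image(ks_id_map, ["foo_kernel"], 1)
    add_image(ks_id_map, ["main_kernel"], 2)
    # ...followed by one image that contains both kernels.
    try:
        add_image(ks_id_map, ["foo_kernel", "main_kernel"], 3)
    except AssertionError as err:
        print("assertion:", err)
```

Under this model the check passes when every target groups kernels into images the same way, and fails when two images split the same kernels differently.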

al42and commented Feb 11, 2022

As of version e211d73, the issue appears to be resolved.

al42and commented Mar 23, 2022

Issue is still gone with ca9fea6, closing.

al42and closed this as completed on Mar 23, 2022