OpenCL (CLC++2021) compilation to SPIR-V fails #2193

Open
davidrohr opened this issue Oct 23, 2023 · 14 comments

@davidrohr

I opened this issue as an LLVM issue first, but LLVM experts indicated the problem is with the SPIRV-LLVM-Translator:
llvm/llvm-project#68305
Thus I am opening it also here:

Compilation with clang 17.0.2 (on Gentoo Linux) fails with the error message below. I was using clang 15 before, which didn't fail. I have now also tried clang 16, which fails with the same error; I didn't try other versions.

I am attaching a tarball with my .cl file, and with the 2 files in /tmp that clang asked me to attach to the bug report:
bugreport.tar.gz

Command to reproduce:

clang-17 -O0 --target=spirv64 -ferror-limit=1000 -Dcl_clang_storage_class_specifiers -Wno-invalid-constexpr -Wno-unused-command-line-argument -cl-std=CLC++2021 -Xclang -fdenormal-fp-math-f32=ieee -cl-mad-enable -cl-no-signed-zeros -c foo.cl -o foo.spirv

Error message:

llvm-spirv: /usr/lib/llvm/17/include/llvm/ADT/SmallVector.h:294: T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](size_type) [with T = llvm::Type*; <template-parameter-1-2> = void; reference = llvm::Type*&; size_type = long unsigned int]: Assertion `idx < size()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /usr/lib/llvm/17/bin/llvm-spirv /tmp/foo-21cdb2.bc -o /home/qon/foo.spirv
 #0 0x00007f161a086fae llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm/17/lib64/libLLVM-17.so+0xc86fae)
 #1 0x00007f161a084c44 llvm::sys::RunSignalHandlers() (/usr/lib/llvm/17/lib64/libLLVM-17.so+0xc84c44)
 #2 0x00007f161a084db6 (/usr/lib/llvm/17/lib64/libLLVM-17.so+0xc84db6)
 #3 0x00007f1618e641b0 (/lib64/libc.so.6+0x391b0)
 #4 0x00007f1618eb208c (/lib64/libc.so.6+0x8708c)
 #5 0x00007f1618e64112 gsignal (/lib64/libc.so.6+0x39112)
 #6 0x00007f1618e4d4f2 abort (/lib64/libc.so.6+0x224f2)
 #7 0x00007f1618e4d415 (/lib64/libc.so.6+0x22415)
 #8 0x00007f1618e5cd32 (/lib64/libc.so.6+0x31d32)
 #9 0x00007f16213da2bf SPIRV::BuiltinCallMutator::doConversion() (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1da2bf)
#10 0x00007f16213a03ef (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a03ef)
#11 0x00007f16213a90fb SPIRV::OCLToSPIRVBase::transBuiltin(llvm::CallInst*, OCLUtil::OCLBuiltinTransInfo&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a90fb)
#12 0x00007f16213aa280 SPIRV::OCLToSPIRVBase::visitCallBuiltinSimple(llvm::CallInst*, llvm::StringRef, llvm::StringRef) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1aa280)
#13 0x00007f16213b189c SPIRV::OCLToSPIRVBase::visitCallInst(llvm::CallInst&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1b189c)
#14 0x00007f16213a0841 SPIRV::OCLToSPIRVBase::runOCLToSPIRV(llvm::Module&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a0841)
#15 0x00007f16213a0b6e SPIRV::OCLToSPIRVPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a0b6e)
#16 0x00007f16214b021d (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x2b021d)
#17 0x00007f16214f4f27 (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x2f4f27)
#18 0x000055b379753efd (/usr/lib/llvm/17/bin/llvm-spirv+0x13efd)
#19 0x000055b37974d0c2 (/usr/lib/llvm/17/bin/llvm-spirv+0xd0c2)
#20 0x00007f1618e4eb8a (/lib64/libc.so.6+0x23b8a)
#21 0x00007f1618e4ec45 __libc_start_main (/lib64/libc.so.6+0x23c45)
#22 0x000055b37974d471 (/usr/lib/llvm/17/bin/llvm-spirv+0xd471)
clang-17: error: unable to execute command: Aborted
clang-17: error: llvm-spirv command failed due to signal (use -v to see invocation)
clang version 17.0.2
Target: spirv64
Thread model: posix
InstalledDir: /usr/lib/llvm/17/bin
clang-17: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-17: note: diagnostic msg: /tmp/foo-f70c1d.cl
clang-17: note: diagnostic msg: /tmp/foo-f70c1d.sh
clang-17: note: diagnostic msg: 

********************
@davidrohr
Author

For reference, I tried the same with clang 18.1 and SPIRV-LLVM-Translator 18.1, and I am still getting the same error.

@MrSidims
Contributor

@davidrohr thanks for the report and apologies for the late response. I have a feeling that this issue is related to a (quite funny) bug in the handling of "convert" functions that was recently fixed in #2443. May I ask you to check whether your example works on the main branch? If so, @vmaksimo please backport your patch(es) to the release branches.

@davidrohr
Author

Dear @MrSidims: I have just tried with the llvm_release_180 branch (commit 1745c78) with #2443 cherry-picked, and it still fails.

The testcase is attached. The error message is:

 #0 0x00007f6865fd17fe llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xdd17fe)
 #1 0x00007f6865fcef34 llvm::sys::RunSignalHandlers() (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xdcef34)
 #2 0x00007f6865fcf326 (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xdcf326)
 #3 0x00007f6864c641b0 (/lib64/libc.so.6+0x391b0)
 #4 0x00007f6864cb208c (/lib64/libc.so.6+0x8708c)
 #5 0x00007f6864c64112 gsignal (/lib64/libc.so.6+0x39112)
 #6 0x00007f6864c4d4f2 abort (/lib64/libc.so.6+0x224f2)
 #7 0x00007f6864c4d415 (/lib64/libc.so.6+0x22415)
 #8 0x00007f6864c5cd32 (/lib64/libc.so.6+0x31d32)
 #9 0x00007f686d3f039f SPIRV::BuiltinCallMutator::doConversion() (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1f039f)
#10 0x00007f686d3a31ef (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a31ef)
#11 0x00007f686d3a487d SPIRV::OCLToSPIRVBase::transBuiltin(llvm::CallInst*, OCLUtil::OCLBuiltinTransInfo&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a487d)
#12 0x00007f686d3a537e SPIRV::OCLToSPIRVBase::visitCallBuiltinSimple(llvm::CallInst*, llvm::StringRef, llvm::StringRef) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a537e)
#13 0x00007f686d3c8d0a SPIRV::OCLToSPIRVBase::visitCallInst(llvm::CallInst&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1c8d0a)
#14 0x00007f686d3a3e78 SPIRV::OCLToSPIRVBase::runOCLToSPIRV(llvm::Module&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a3e78)
#15 0x00007f686d3a40ee SPIRV::OCLToSPIRVPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a40ee)
#16 0x00007f686d4d2a0d (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x2d2a0d)
#17 0x00007f686617495f llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xf7495f)
#18 0x00007f686d51415a (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x31415a)
#19 0x0000558f4efae602 (/usr/lib/llvm/18/bin/llvm-spirv+0x13602)
#20 0x0000558f4efa8c2c (/usr/lib/llvm/18/bin/llvm-spirv+0xdc2c)
#21 0x00007f6864c4eb8a (/lib64/libc.so.6+0x23b8a)
#22 0x00007f6864c4ec45 __libc_start_main (/lib64/libc.so.6+0x23c45)
#23 0x0000558f4efa93b1 (/usr/lib/llvm/18/bin/llvm-spirv+0xe3b1)
clang: error: unable to execute command: Aborted
clang: error: llvm-spirv command failed due to signal (use -v to see invocation)

testcase.tar.gz

@vmaksimo
Contributor

vmaksimo commented Apr 4, 2024

@davidrohr could you please also try this fix together with the one mentioned above? #2464
If it doesn't help, please attach the *.bc input file for the translator so that we can take a closer look at the bug. Thanks!

@davidrohr
Author

@vmaksimo: I just tried with #2464 in addition. It still fails in the same way as before.

I am attaching the clang testcase with the sources again:
testcase.tar.gz

And here is the temporary .bc file I obtain from clang using -emit-llvm --target=spir64-unknown-unknown instead of --target=spirv64.
bctestcase.tar.gz

@davidrohr
Author

@vmaksimo: Any progress on this? Sorry for bugging you, but I am wondering what the timescale for a fix might be.

@vmaksimo
Contributor

Hi @davidrohr! I was able to reproduce the issue and found that the problem is in the translation of the printf call. Unfortunately, there is no further progress yet. Any chance you could avoid printf calls in your app as a hotfix?
The timescale is ~3-4 weeks unless someone else takes a look earlier (I'll be on vacation for the next ~2 weeks).

@davidrohr
Author

Thanks a lot, that was very helpful. Indeed, that was a bug on our side: the printf should not have been there in the GPU version. (Although I understand that it should be supported in principle, so it would be good to fix it :)).

I fixed it on our side, but now I am running into a different problem, which I have reported in #2531 :(.
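
The workaround described above, keeping printf out of the device build, can be sketched with a preprocessor guard. This is a minimal sketch and not the code from the attached test case: `DEBUG_PRINTF`, `DEVICE_BUILD`, and `clampIndex` are hypothetical names, and the guard assumes the compiler defines `__OPENCL_C_VERSION__` (OpenCL C) or `__OPENCL_CPP_VERSION__` (C++ for OpenCL) only when compiling device code.

```cpp
// Assumption: one of these macros is defined only when compiling device code
// (OpenCL C or C++ for OpenCL); neither is defined in an ordinary host build.
#if defined(__OPENCL_C_VERSION__) || defined(__OPENCL_CPP_VERSION__)
#define DEVICE_BUILD 1
// Device build: expand to nothing, so no printf call reaches the translator.
#define DEBUG_PRINTF(...) ((void)0)
#else
#include <cstdio>
#define DEVICE_BUILD 0
// Host build: forward to the standard printf.
#define DEBUG_PRINTF(...) std::printf(__VA_ARGS__)
#endif

// Hypothetical helper shared between host and device code.
// Clamps idx into [0, size) and reports out-of-range values on the host only.
inline int clampIndex(int idx, int size) {
  if (idx < 0 || idx >= size) {
    DEBUG_PRINTF("index %d out of range [0, %d)\n", idx, size);
    return idx < 0 ? 0 : size - 1;
  }
  return idx;
}
```

On the host, `clampIndex(5, 3)` prints a diagnostic and returns 2; in a device build the same call returns 2 without emitting any printf call. This only sidesteps the translator crash and does not fix the printf translation itself.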

@davidrohr
Author

For reference, with the fix mentioned in #2531 and with the printf avoided, the code now compiles to SPIR-V. I am leaving this open until the printf problem is fixed as well.

@davidrohr
Author

@vmaksimo: What is the status of this? I am not using printf any more, so it is not failing for me. But has it been fixed in the meantime?

@MrSidims
Contributor

MrSidims commented Nov 7, 2024

@davidrohr I've just checked the original reproducer with a clang built from the top of LLVM trunk, and apparently it never reaches llvm-spirv; it crashes earlier:
LLVM/llvm-project/clang/include/clang/Sema/Initialization.h:347: static clang::InitializedEntity clang::InitializedEntity::InitializeTemporary(clang::ASTContext&, clang::TypeSourceInfo*): Assertion `!Type.hasAddressSpace() && "Temporary already has address space!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.

I'm compiling foo.cl with
clang -O0 --target=spirv64 -ferror-limit=1000 -Dcl_clang_storage_class_specifiers -Wno-invalid-constexpr -Wno-unused-command-line-argument -cl-std=CLC++2021 -Xclang -fdenormal-fp-math-f32=ieee -cl-mad-enable -cl-no-signed-zeros -c foo.cl -o foo.spirv

The same error occurs with the -emit-llvm --target=spir64-unknown-unknown options.

@davidrohr
Author

Well, the current version of our code compiles with LLVM 17, 18 and 19, and I don't intend to follow up on possible new issues with our old code version.
If this old version no longer compiles with current LLVM, and the issue in the translator can therefore no longer be reproduced, I would just close this issue.

@MrSidims
Contributor

MrSidims commented Nov 7, 2024

I will try LLVM 19 tomorrow.

@davidrohr
Author

> Hi @davidrohr! I was able to reproduce the issue and found out that the problem is in the translation of printf call.

I think that, in principle, one does not even need to try the original reproducer. The remaining issue was the translation of the printf call; the other problems were already fixed. A printf call alone would be a much simpler reproducer, in case my original one no longer works with current LLVM.
