Regression since 0efe111365 when building opencv with clang-cl 17.0.6 on Windows: clang-cl just hangs #69428

Closed
emmenlau opened this issue Oct 18, 2023 · 31 comments · Fixed by #79694
Labels
hang Compiler hang (infinite loop) llvm:ir

Comments

@emmenlau

I have reported a build issue of opencv in opencv/opencv#24390. If needed I can clone the information here. Sadly, I am unable to provide a minimal reproducer, and building opencv is slightly more involved.

It would be great if somebody could still look at this, as opencv is quite a relevant library...

@DimitryAndric
Collaborator

One of the things you can try is identifying one or more instances of clang-cl that hang, and attempt to make it crash, so it produces test case files (preprocessed .cpp and .sh). On Linux you would simply send a SIGABRT to the process, and this would cause the clang-cl driver to create test case files, but on Windows I am unsure.
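
For readers who want to see the mechanism behind the SIGABRT suggestion, here is a minimal POSIX-only sketch. It does not involve clang at all: a child process that blocks forever stands in for the hung compiler, and `abort_hung_child` is a hypothetical helper name. From a shell, the equivalent step is `kill -ABRT <pid>`. The sketch only shows signal delivery; the creation of test case files is something the real compiler does in its own signal handler, as described above.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that blocks forever (a stand-in for a hung compiler),
// sends it SIGABRT, and returns the signal that terminated it, or -1 on error.
int abort_hung_child() {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        // Child: make sure SIGABRT has its default (terminating) action,
        // then wait for signals forever.
        std::signal(SIGABRT, SIG_DFL);
        for (;;) {
            pause();
        }
    }
    // Parent: deliver SIGABRT, as `kill -ABRT <pid>` would from a shell.
    if (kill(pid, SIGABRT) != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    // Report which signal ended the child, if it was ended by a signal.
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling `abort_hung_child()` and comparing the result with `SIGABRT` confirms that the child was terminated by that signal rather than exiting on its own.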

@emmenlau
Author

Thanks a lot @DimitryAndric for this suggestion! If you or someone knows how I can create the test case files (on Windows), I would be happy to do so!

@EugeneZelenko EugeneZelenko added clang-cl `clang-cl` driver. Don't use for other compiler parts and removed new issue labels Oct 18, 2023
@Neumann-A

I observe the same issue with 17.0.4. Trying to install opencv4 via vcpkg using my clang-cl toolchain just hangs forever.

@emmenlau
Author

Did anyone try with 17.0.5 yet?

@Neumann-A

Did anyone try with 17.0.5 yet?

Same issue.

@Neumann-A

@emmenlau Did you check your build logs carefully? While bisecting, I found an ICE near the start of the logs that I had overlooked at first. I don't know why ninja did not abort the build in this case and left it hanging.

@emmenlau
Author

emmenlau commented Nov 29, 2023

Hi @Neumann-A ,
I checked my logs back then quite carefully: I compared them in a diff view against a working build using clang 16.x. As far as I could see, there was no ICE in my case with clang 17.0.3. But it's interesting that you could move one step ahead!
One thing I found (but cannot try myself): there is an option where clang would print the full diagnostics even without a crash! Maybe this can help developers isolate the problem?

Here is the link: https://clang.llvm.org/docs/UsersManual.html#options-to-control-clang-crash-diagnostics

From that page, I quote:

Clang is also capable of generating preprocessed source file(s) and associated run script(s) even without a crash. This is specially useful when trying to generate a reproducer for warnings or errors while using modules.

I guess with such a reproducer, the developers could help resolve the issue...

@emmenlau emmenlau changed the title Regression building opencv 4.8.1 with clang-cl 17.0.3 on Windows over clang-cl 16.0.6, clang-cl just hangs Regression building opencv 4.8.1 with clang-cl 17.0.5 on Windows over clang-cl 16.0.6: Newer clang-cl just hangs Jan 8, 2024
@emmenlau
Author

emmenlau commented Jan 8, 2024

Did anyone try clang-cl 17.0.6 yet?

@Neumann-A

Did anyone try clang-cl 17.0.6 yet?

Same issue.
Master from the end of November has the same issue. I don't think it will be fixed for 18.

https://github.com/backengineering/llvm-msvc doesn't seem to have the issue

@emmenlau
Author

emmenlau commented Jan 8, 2024

Thanks @Neumann-A ! I'll try to run the build today with -gen-reproducer in the hope that devs will consider fixing this issue.

@Neumann-A

I mean I even bisected the issue. The ICE/hang happens since 0efe111. Reverting it fixes it.

@emmenlau
Author

emmenlau commented Jan 8, 2024

Oh my, that is very relevant! Thanks for sharing!!!

@emmenlau emmenlau changed the title Regression building opencv 4.8.1 with clang-cl 17.0.5 on Windows over clang-cl 16.0.6: Newer clang-cl just hangs Regression since 0efe111365 when building opencv with clang-cl 17.0.6 on Windows: clang-cl just hangs Jan 8, 2024
@EugeneZelenko EugeneZelenko added the hang Compiler hang (infinite loop) label Jan 8, 2024
@emmenlau
Author

Dear LLVM devs, could someone kindly consider this compiler hang? It is quite relevant to build openCV, which is a rather relevant library in the image analysis community...

@DimitryAndric
Collaborator

Somebody still needs to provide a .sh and .cpp file from one of those hanging builds. This is essential for reproducing the problem, and attempting to fix it.

@DimitryAndric
Collaborator

Also, ping @phoebewang @efriedma-quic @tentzen who originated https://reviews.llvm.org/D102817 for commit 0efe111.

@Neumann-A

Somebody still needs to provide a .sh and .cpp file from one of those hanging builds. This is essential for reproducing the problem, and attempting to fix it.

#73538 and #73536 have reproducers and are related

@DimitryAndric
Collaborator

I managed to configure and build opencv on Windows against Visual Studio and the official LLVM 17.0.6 package, and could intermittently reproduce the hangs: that is, some clang-cl instances hung, but not consistently when the exact same command line was repeated.

After retrieving the exact command line for a hanging instance, I could generate a preprocessed test case, where the hang occurs when compiling the intermediate .bc to .asm:

"C:\Program Files\LLVM\bin\clang-cl.exe" -cc1 -triple x86_64-pc-windows-msvc19.38.33134 -S -save-temps=cwd -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name execution_engine.cpp -mrelocation-model pic -pic-level 2 -mframe-pointer=none -relaxed-aliasing -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -target-feature +sse -target-feature +sse2 -target-feature +sse3 -mllvm -x86-asm-syntax=intel -tune-cpu generic -D_MT -D_DLL --dependent-lib=msvcrt --dependent-lib=oldnames --show-includes -sys-header-deps -stack-protector 2 -fexceptions -fasync-exceptions -fms-volatile -fdiagnostics-format msvc -v -ffunction-sections "-fcoverage-compilation-dir=C:\Users\Dim\Source\opencv\build" -resource-dir "C:\PROGRA~1\LLVM\lib\clang\17" -O2 -WCL4 -W -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winconsistent-missing-override -Wno-delete-non-virtual-dtor -Wno-unnamed-type-template-args -Wno-comment -Wno-deprecated-enum-enum-conversion -Wno-deprecated-anon-enum-enum-conversion -Wno-long-long "-fdebug-compilation-dir=C:\Users\Dim\Source\opencv\build" -ferror-limit 19 -fmessage-length=178 -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.38.33134 -fdelayed-template-parsing -finline-functions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -o execution_engine.asm -x ir execution_engine.bc

I transported this test case to Linux where I have more tools to do reduction, and I ended up with the following reduced test case:

// clang-cl -cc1 -triple x86_64-pc-windows-msvc19.38.33134 -S -disable-llvm-verifier -fexceptions -fasync-exceptions -O2 execution_engine-min.cpp
template <bool, class _Ty1, class> using conditional_t = _Ty1;
template <class _Ty1, class _Ty2>
constexpr bool is_same_v = __is_same(_Ty1, _Ty2);
struct _Alloc_construct_ptr {
  ~_Alloc_construct_ptr();
};
template <class _Alnode> struct _List_node_emplace_op2 : _Alloc_construct_ptr {
  _List_node_emplace_op2(_Alnode);
  ~_List_node_emplace_op2() { ; }
};
int _List;
struct {
  template <class... _Valtys>
  conditional_t<is_same_v<int, int>, int, int> emplace(_Valtys... _Vals) {
    _List_node_emplace_op2(_List, _Vals...);
  }
} m_executableDependencies;
void ExecutionEngineaddExecutableDependency() {
  m_executableDependencies.emplace();
}

This reliably produces an assertion in WinEHPrepare.cpp (if the llvm in question is compiled with assertions, which the release builds are not), after an initial "A single unwind edge may only enter one EH pad" error:

A single unwind edge may only enter one EH pad
  invoke void @llvm.seh.scope.end()
          to label %"??1?$_List_node_emplace_op2@H@@QEAA@XZ.exit.i" unwind label %ehcleanup.i.i
Assertion failed: (!verifyFunction(F, &dbgs())), function prepareExplicitEH, file /share/dim/src/llvm/llvm-project/llvm/lib/CodeGen/WinEHPrepare.cpp, line 1210.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl -cc1 -triple x86_64-pc-windows-msvc19.38.33134 -S -disable-llvm-verifier -fexceptions -fasync-exceptions -O2 execution_engine-min.cpp
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'execution_engine-min.cpp'.
4.      Running pass 'Windows exception handling preparation' on function '@"?ExecutionEngineaddExecutableDependency@@YAXXZ"'
 #0 0x00000000042d05c8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x42d05c8)
 #1 0x00000000042ce129 llvm::sys::RunSignalHandlers() (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x42ce129)
 #2 0x00000000042d0dc8 SignalHandler(int) Signals.cpp:0:0
 #3 0x00000008297c6490 handle_signal /share/dim/src/freebsd/llvm-18-update/lib/libthr/thread/thr_sig.c:0:3
 #4 0x00000008297c5a4b thr_sighandler /share/dim/src/freebsd/llvm-18-update/lib/libthr/thread/thr_sig.c:245:1
 #5 0x00000008290772d3 ([vdso]+0x2d3)
 #6 0x000000082e398e1a _thr_kill /usr/obj/share/dim/src/freebsd/llvm-18-update/amd64.amd64/lib/libc/thr_kill.S:4:0
 #7 0x000000082e312a94 __raise /share/dim/src/freebsd/llvm-18-update/lib/libc/gen/raise.c:0:10
 #8 0x000000082e3c5799 abort /share/dim/src/freebsd/llvm-18-update/lib/libc/stdlib/abort.c:67:17
 #9 0x000000082e2f5d81 (/lib/libc.so.7+0x99d81)
#10 0x0000000003c3cc1a (anonymous namespace)::WinEHPrepareImpl::prepareExplicitEH(llvm::Function&) WinEHPrepare.cpp:0:0
#11 0x0000000003c389b1 (anonymous namespace)::WinEHPrepare::runOnFunction(llvm::Function&) WinEHPrepare.cpp:0:0
#12 0x0000000003defcb1 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x3defcb1)
#13 0x0000000003df82a4 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x3df82a4)
#14 0x0000000003df086e llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x3df086e)
#15 0x0000000004a43178 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4a43178)
#16 0x0000000004a62d19 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4a62d19)
#17 0x000000000671a8c6 clang::ParseAST(clang::Sema&, bool, bool) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x671a8c6)
#18 0x0000000004e61883 clang::FrontendAction::Execute() (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4e61883)
#19 0x0000000004dd63cd clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4dd63cd)
#20 0x0000000004f39305 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4f39305)
#21 0x0000000002722edc cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x2722edc)
#22 0x000000000271fcd2 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#23 0x000000000271eb3d clang_main(int, char**, llvm::ToolContext const&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x271eb3d)
#24 0x000000000272ea14 main (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x272ea14)
#25 0x000000082e2e734a __libc_start1 /share/dim/src/freebsd/llvm-18-update/lib/libc/csu/libc_start1.c:157:2

This is the same error and assertion reported in #73536 and #73538.

@phoebewang
Contributor

cc @robertcox-github

phoebewang added a commit that referenced this issue Apr 4, 2024
Intrinsics like @llvm.seh.scope.begin and @llvm.seh.scope.end which do
not throw do not need funclets in catchpads or cleanuppads.

Fixes #69428

Co-authored-by: Robert Cox <robert.cox@intel.com>

@EugeneZelenko EugeneZelenko added llvm:ir and removed clang-cl `clang-cl` driver. Don't use for other compiler parts labels Apr 4, 2024
@emmenlau
Author

Is a1f4ac7 in ClangCl 18.1.4? Because I still can not build OpenCV, it still hangs :-( :-(

@phoebewang
Contributor

Is a1f4ac7 in ClangCl 18.1.4? Because I still can not build OpenCV, it still hangs :-( :-(

No, it's not getting cherry-picked to the 18 release.

@emmenlau
Author

Thanks a lot for the feedback @phoebewang ! And there are probably strong reasons against picking it for the 18.x series, yes? It would be really great to have OpenCV build working again... :-(

@Nick-Kooij

The recently released Microsoft Visual Studio 2022 17.9.10 now ships with headers that only support LLVM 17 and up, compounding the issue.

Developers impacted by this issue, using clang-cl on Windows with the latest Microsoft IDE/headers, will soon be stuck:

  • LLVM 16 can't be used due to lack of header support
  • LLVM 17 and up crash

@EugeneZelenko
Contributor

Thanks a lot for the feedback @phoebewang ! And there are probably strong reasons against picking it for the 18.x series, yes? It would be really great to have OpenCV build working again... :-(

Backports to 18.1.x were stopped recently, so it'll be necessary to wait for 19.

@efriedma-quic
Collaborator

This specifically only impacts code built with /EHa; does OpenCV really need to be built with that flag?

@phoebewang
Contributor

This specifically only impacts code built with /EHa; does OpenCV really need to be built with that flag?

Right. The /EHa was a dud in LLVM before 17. Removing it should have no side effect.
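
For readers unfamiliar with the distinction discussed here: /EHsc covers synchronous C++ exceptions (those raised by `throw`), while /EHa additionally asks the compiler to unwind correctly for asynchronous (structured) exceptions. The sketch below is a minimal, portable illustration of the synchronous case only. `ScopeGuard`, `throwing_work` and `run_demo` are hypothetical names, not code from OpenCV or LLVM. As far as I understand it, /EHa is meant to extend the same destructor guarantee to structured exceptions, which is where the `llvm.seh.scope.begin`/`llvm.seh.scope.end` intrinsics mentioned in this thread come in.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical RAII type: records its construction and destruction in a log.
struct ScopeGuard {
    std::string& log;
    explicit ScopeGuard(std::string& l) : log(l) { log += "ctor;"; }
    ~ScopeGuard() { log += "dtor;"; }
};

// Throws a synchronous C++ exception while a ScopeGuard is alive.
void throwing_work(std::string& log) {
    ScopeGuard guard(log);
    throw std::runtime_error("synchronous failure");
}

// Returns the sequence of events observed while the exception propagates.
std::string run_demo() {
    std::string log;
    try {
        throwing_work(log);
    } catch (const std::exception&) {
        // The guard's destructor has already run during stack unwinding.
        log += "caught;";
    }
    return log;
}
```

`run_demo()` returns `"ctor;dtor;caught;"`: the destructor runs during unwinding, before the handler body executes. This is the behaviour any exception model with unwinding has to preserve.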

@emmenlau
Author

emmenlau commented May 23, 2024

@efriedma-quic , thanks a billion for this insight! No, I do not require /EHa and can happily build without. This reduces the severity of this issue a whole lot!

This does not work for me. I've replaced /EHa with /EHsc and the compiler (clang-cl 18.1.6 from 5 days ago) still hangs (currently 15 minutes on a single source file, and counting).

Did you mean to disable all exception handling? Or are there exception handling models that are expected to work?

@phoebewang
Contributor

/EHsc would be an independent issue. How about just removing /EHa?

@emmenlau
Author

After removing /EHa the build complains that exceptions are not enabled (unless I'm just unable to configure OpenCV correctly and made a mistake, haha). Is it possible that removing /EHa altogether disables exceptions?

@phoebewang
Contributor

Not sure. What I know is that /EHa didn't really enable asynchronous exceptions before LLVM 17. Maybe it enabled them partially, or maybe the build script just checks for the /EH strings?

@emmenlau
Author

I've spent some time reading up on this, and I'm under the impression that one of the /EHx options needs to be added, otherwise exceptions are turned off by the compiler. See for example here and here.

This explains why removing /EHa disabled exceptions in opencv altogether, thereby breaking the build.

/EHsc would be an independent issue.

Can you elaborate about an independent issue? So does the current fix in clang 19 not address the issue of a compiler hang with /EHsc? Should I report it as a new issue?

@phoebewang
Contributor

This explains why removing /EHa disabled exceptions in opencv altogether, thereby breaking the build.

I'm not an expert on the Clang driver. I just took a quick look; maybe you can try -Xclang -fcxx-exceptions -Xclang -fexceptions to enable exceptions without a /EH* option?

Can you elaborate about an independent issue? So does the current fix in clang 19 not address the issue of a compiler hang with /EHsc? Should I report it as a new issue?

I didn't see it elsewhere. My justification is: 1) /EHsc would not generate such llvm.seh.* intrinsics; 2) the fixed issue is a crash that occurs if the compiler is built with assertions on. A hang may or may not be related to it. Did you check whether the /EHsc option works with trunk code?
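
One way to check whether a given flag combination (for example the -Xclang options suggested above, or no /EH* option at all) actually leaves exceptions enabled is to compile a small probe with exactly those flags. The sketch below relies on the standard feature-test macro `__cpp_exceptions` and on the MSVC-style macro `_CPPUNWIND`. That clang-cl defines `_CPPUNWIND` when exceptions are on is my understanding, not something confirmed in this thread. `kExceptionsEnabled` and `exception_mode` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// __cpp_exceptions is the standard feature-test macro for exception support;
// _CPPUNWIND is the MSVC-style macro set when an /EH option enables unwinding.
#if defined(__cpp_exceptions) || defined(_CPPUNWIND)
constexpr bool kExceptionsEnabled = true;
#else
constexpr bool kExceptionsEnabled = false;
#endif

// Reports, as text, whether this translation unit was compiled with
// C++ exceptions enabled.
std::string exception_mode() {
    return kExceptionsEnabled ? "enabled" : "disabled";
}
```

Printing `exception_mode()` from a test program built with the same compiler flags as the failing target shows which mode the compiler actually selected, independently of what the build script infers from the /EH strings.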
