clang crash when build IPEX code. #75428

Open
xuhancn opened this issue Dec 14, 2023 · 8 comments
Labels: crash, llvm:optimizations

Comments

xuhancn commented Dec 14, 2023

Build command:

cd /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu && /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations 
-Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format -Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp

The error message:

Stack dump:
0.      Program arguments: /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
1.      <eof> parser at end of file
2.      Per-function optimization
3.      Running pass 'Early CSE' on function '@.omp_outlined.'
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x1f)[0x7fa30e5db4ff]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x50)[0x7fa30e5d97b0]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15CleanupOnSignalEm+0xdd)[0x7fa30e5dac4d]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x8d6e60)[0x7fa30e530e60]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fa314daa420]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x173ee23)[0x7fa30f398e23]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm19SimplifyInstructionEPNS_11InstructionERKNS_13SimplifyQueryEPNS_25OptimizationRemarkEmitterE+0x819)[0x7fa30f3a3d09]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x13700b2)[0x7fa30efca0b2]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x13759e4)[0x7fa30efcf9e4]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE+0x466)[0x7fa30e6e0d76]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm6legacy23FunctionPassManagerImpl3runERNS_8FunctionE+0x4e)[0x7fa30e6e049e]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm6legacy19FunctionPassManager3runERNS_8FunctionE+0x156)[0x7fa30e6e0436]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang17EmitBackendOutputERNS_17DiagnosticsEngineERKNS_19HeaderSearchOptionsERKNS_14CodeGenOptionsERKNS_13TargetOptionsERKNS_11LangOptionsERKN4llvm10DataLayoutEPNSE_6ModuleENS_13BackendActionESt10unique_ptrINSE_17raw_pwrite_streamESt14default_deleteISM_EE+0x305b)[0x7fa3136d631b]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(+0x1667e1c)[0x7fa313955e1c]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang8ParseASTERNS_4SemaEbb+0x283)[0x7fa312b43c13]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang14FrontendAction7ExecuteEv+0x48)[0x7fa313fb9e58]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang16CompilerInstance13ExecuteActionERNS_14FrontendActionE+0x621)[0x7fa313f728a1]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang25ExecuteCompilerInvocationEPNS_16CompilerInstanceE+0x66f)[0x7fa31401ddaf]
/usr/bin/clang++(_Z8cc1_mainN4llvm8ArrayRefIPKcEES2_Pv+0x98d)[0x41229d]
/usr/bin/clang++[0x4105b1]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(+0x19d58f2)[0x7fa313cc38f2]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm20CrashRecoveryContext9RunSafelyENS_12function_refIFvvEEE+0xd7)[0x7fa30e530c67]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver10CC1Command7ExecuteEN4llvm8ArrayRefINS2_8OptionalINS2_9StringRefEEEEEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPb+0x13f)[0x7fa313cc2e2f]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver11Compilation14ExecuteCommandERKNS0_7CommandERPS3_+0x2df)[0x7fa313c9b52f]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver11Compilation11ExecuteJobsERKNS0_7JobListERN4llvm15SmallVectorImplISt4pairIiPKNS0_7CommandEEEE+0x7a)[0x7fa313c9b6da]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang6driver6Driver18ExecuteCompilationERNS0_11CompilationERN4llvm15SmallVectorImplISt4pairIiPKNS0_7CommandEEEE+0xdc)[0x7fa313cae93c]
/usr/bin/clang++(main+0x259f)[0x41002f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fa30d73e083]
/usr/bin/clang++(_start+0x2e)[0x40d7ce]
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/fused_bert-089380.cpp
clang: note: diagnostic msg: /tmp/fused_bert-089380.sh
clang: note: diagnostic msg:

********************

Original code: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp (it builds successfully with GCC).
The diagnostic files mentioned in the message above are attached:
fused_bert-089380.zip
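
For readers unfamiliar with the `@.omp_outlined.` name in the stack dump: with `-fopenmp`, Clang moves the body of an OpenMP parallel region into a separate compiler-generated function, and the Clang 10 dump above shows that name for it. The crash therefore happens while Early CSE optimizes code that came from inside such a region. The following is only a minimal sketch of the general shape of such code. The function `scale_in_place` is hypothetical, is not taken from fused_bert.cpp, and is not a reproducer for this crash.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example; not taken from fused_bert.cpp and not a reproducer.
// When built with -fopenmp, the loop body below is moved by Clang into a
// separate compiler-generated ("outlined") function that the OpenMP runtime
// calls from each worker thread. Without -fopenmp the pragma is ignored and
// the loop runs serially with the same result.
inline void scale_in_place(float* data, long n, float factor) {
  assert(n >= 0);
  assert(data != nullptr || n == 0);
#pragma omp parallel for
  for (long i = 0; i < n; ++i) {
    data[static_cast<std::size_t>(i)] *= factor;
  }
}
```

A reduced test case for this report would most likely need to keep at least one parallel region, since the failing pass is running on an outlined function.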

@github-actions bot added the clang label Dec 14, 2023
@EugeneZelenko added the llvm:optimizations and crash labels and removed the clang label Dec 14, 2023
asl (Collaborator) commented Dec 14, 2023

LLVM 10 is ancient. Does the problem reproduce with latest LLVM?

xuhancn (Author) commented Dec 14, 2023

> LLVM 10 is ancient. Does the problem reproduce with latest LLVM?

Sure, please wait a while.

xuhancn (Author) commented Dec 14, 2023

Crash 1 on Clang 18, file: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/optim.cpp

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /usr/local/bin/clang-18 -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_FP16_CPU_DEFINITION -DHAVE_AMX_CPU_DEFINITION -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_VNNI_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/optim.cpp
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x0000556ce2f9752f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-18+0x372552f)
 #1 0x0000556ce2f9557c llvm::sys::CleanupOnSignal(unsigned long) (/usr/local/bin/clang-18+0x372357c)
 #2 0x0000556ce2edc2c8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007f3eff376420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x0000556ce20693f1 simplifyMulInst(llvm::Value*, llvm::Value*, bool, bool, llvm::SimplifyQuery const&, unsigned int) (.constprop.0) InstructionSimplify.cpp:0:0
 #5 0x0000556ce2061741 simplifyInstructionWithOperands(llvm::Instruction*, llvm::ArrayRef<llvm::Value*>, llvm::SimplifyQuery const&, unsigned int) InstructionSimplify.cpp:0:0
 #6 0x0000556ce206b757 llvm::simplifyInstruction(llvm::Instruction*, llvm::SimplifyQuery const&) (/usr/local/bin/clang-18+0x27f9757)
 #7 0x0000556ce2d7e08c (anonymous namespace)::EarlyCSE::processNode(llvm::DomTreeNodeBase<llvm::BasicBlock>*) EarlyCSE.cpp:0:0
 #8 0x0000556ce2d805ed (anonymous namespace)::EarlyCSE::run() EarlyCSE.cpp:0:0
 #9 0x0000556ce2d821f6 llvm::EarlyCSEPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x35101f6)
#10 0x0000556ce31f1446 llvm::detail::PassModel<llvm::Function, llvm::EarlyCSEPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x397f446)
#11 0x0000556ce0aa0c04 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x122ec04)
#12 0x0000556ce299a27e llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x312827e)
#13 0x0000556ce0a94ee6 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x1222ee6)
#14 0x0000556ce2996d20 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x3124d20)
#15 0x0000556ce3201bb7 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&, clang::BackendConsumer*) BackendUtil.cpp:0:0
#16 0x0000556ce3204ec4 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/usr/local/bin/clang-18+0x3992ec4)
#17 0x0000556ce37d4d5e clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-18+0x3f62d5e)
#18 0x0000556ce52becf9 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-18+0x5a4ccf9)
#19 0x0000556ce37d4135 clang::CodeGenAction::ExecuteAction() (/usr/local/bin/clang-18+0x3f62135)
#20 0x0000556ce3a644c1 clang::FrontendAction::Execute() (/usr/local/bin/clang-18+0x41f24c1)
#21 0x0000556ce39deaeb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-18+0x416caeb)
#22 0x0000556ce3b42b5b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-18+0x42d0b5b)
#23 0x0000556ce073c79d cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-18+0xeca79d)
#24 0x0000556ce07350ad ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#25 0x0000556ce381cd3d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#26 0x0000556ce2edc747 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang-18+0x366a747)
#27 0x0000556ce381d1dc clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#28 0x0000556ce37e3d8e clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang-18+0x3f71d8e)
#29 0x0000556ce37e475d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang-18+0x3f7275d)
#30 0x0000556ce37eebdc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang-18+0x3f7cbdc)
#31 0x0000556ce0739ac1 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang-18+0xec7ac1)
#32 0x0000556ce06411b5 main (/usr/local/bin/clang-18+0xdcf1b5)
#33 0x00007f3efed7b083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#34 0x0000556ce073486e _start (/usr/local/bin/clang-18+0xec286e)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 18.0.0git (https://github.com/llvm/llvm-project.git a2691e363232c011fdaace9fcc094f3cd210f78b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
5 warnings generated.
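
A note on the `-Wvla-cxx-extension` warnings in this log: they flag declarations such as `float tmp[cols];`, where the array size is only known at run time. Variable length arrays are not part of standard C++, and Clang and GCC accept them as an extension. Whether they have anything to do with the crash is not established in this thread. As a sketch only, a standard-conforming way to get the same scratch buffer is shown below. The function `sum_with_bias` is hypothetical and is not the code from xsmm_functors.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a functor body that needs a scratch buffer whose
// size is only known at run time (compare `float tmp[cols];` in the log).
// The flagged form would be:
//     float tmp[cols];   // VLA: compiler extension, not standard C++
// The version below uses std::vector for the scratch space instead.
inline float sum_with_bias(const float* in, const float* bias, int cols) {
  assert(cols >= 0);
  // Heap-backed buffer, zero-initialized, sized at run time.
  std::vector<float> tmp(static_cast<std::size_t>(cols));
  for (int i = 0; i < cols; ++i) {
    tmp[static_cast<std::size_t>(i)] = in[i] + bias[i];
  }
  float total = 0.0f;
  for (float v : tmp) {
    total += v;
  }
  return total;
}
```

The trade-off is a heap allocation per call instead of stack space. In a hot loop, a buffer allocated once and reused would avoid that cost.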
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^~~~~~~~~~~~~~~~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/tensor_helper.h:57:18: note: in instantiation of member function 'torch_ipex::tpp::XformExtTPP<c10::BFloat16>::operator()' requested here
   57 |     trans_n2v_tpp(in[n], out[n]);
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^~~~~~~~~~~~~~~~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/tensor_helper.h:203:16: note: in instantiation of member function 'torch_ipex::tpp::XformExtTPP<float>::operator()' requested here
  203 |       trans_tpp(in[n], out[n]);
      |                ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   63 |       Tout tmp_C[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:187:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<float, float>::operator()' requested here
  187 |             qkv_gemm_tpp(HS[s1][bn], Wq_V[nk][bn], QL[s1][nk], BN, true);
      |                         ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
   63 |       Tout tmp_C[M * N];
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   72 |         Tout tmp[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  834 |       float tmp[cols];
      |                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:318:29: note: in instantiation of member function 'torch_ipex::tpp::AddBiasTPP<float>::operator()' requested here
  318 |                 add_mask_tpp(AM[s21], AS[ls21][0]);
      |                             ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
  834 |       float tmp[cols];
      |                 ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:320:28: note: in instantiation of member function 'torch_ipex::tpp::VarSoftMaxFwdTPP<float, float>::operator()' requested here
  320 |             softmax_fwd_tpp(len, AS[0][0], AP[n][ss1]);
      |                            ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: note: function parameter 'S1' with unknown value cannot be used in a constant expression
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3189:23: note: declared here
 3189 |   void operator()(int S1, Tin* in, Tout* out) {
      |                       ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3201 |         LIBXSMM_ALIGNED(float tmp2[S3], 64);
      |                                    ^~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   63 |       Tout tmp_C[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:187:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<c10::BFloat16, c10::BFloat16>::operator()' requested here
  187 |             qkv_gemm_tpp(HS[s1][bn], Wq_V[nk][bn], QL[s1][nk], BN, true);
      |                         ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
   63 |       Tout tmp_C[M * N];
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   72 |         Tout tmp[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   63 |       Tout tmp_C[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:315:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<c10::BFloat16, float>::operator()' requested here
  315 |               a_gemm_tpp(QL[s11][n], KL_TV[s21][n], AS[ls21][0], 1);
      |                         ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
   63 |       Tout tmp_C[M * N];
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   72 |         Tout tmp[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  834 |       float tmp[cols];
      |                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:318:29: note: in instantiation of member function 'torch_ipex::tpp::AddBiasTPP<c10::BFloat16>::operator()' requested here
  318 |                 add_mask_tpp(AM[s21], AS[ls21][0]);
      |                             ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
  834 |       float tmp[cols];
      |                 ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:320:28: note: in instantiation of member function 'torch_ipex::tpp::VarSoftMaxFwdTPP<float, c10::BFloat16>::operator()' requested here
  320 |             softmax_fwd_tpp(len, AS[0][0], AP[n][ss1]);
      |                            ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: note: function parameter 'S1' with unknown value cannot be used in a constant expression
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3189:23: note: declared here
 3189 |   void operator()(int S1, Tin* in, Tout* out) {
      |                       ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3201 |         LIBXSMM_ALIGNED(float tmp2[S3], 64);
      |                                    ^~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:916:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  916 |     float tmp[cols];
      |               ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:543:26: note: in instantiation of member function 'torch_ipex::tpp::GradBiasTPP<float>::operator()' requested here
  543 |             grad_bias_tpp(dQL[s1][n], prv_grad_bias[n]);
      |                          ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:916:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
  916 |     float tmp[cols];
      |               ^
clang-18: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/optim-0c9d5d.cpp
clang-18: note: diagnostic msg: /tmp/optim-0c9d5d.sh
clang-18: note: diagnostic msg:
********************

optim-0c9d5d.zip
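
A side note on the `-Wvla-cxx-extension` warnings in the log above: they flag stack arrays such as `float tmp[cols];` whose length is only known at run time. The following is a minimal sketch of a standard-C++ replacement, assuming a heap-allocated `std::vector` is acceptable for the scratch buffer. The function `sum_rows` and its parameters are hypothetical and are not taken from the IPEX sources.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not IPEX code: sums `rows` rows of `in`
// (row-major, `cols` floats per row) into `out` via a scratch buffer.
// The scratch buffer is a std::vector sized at run time, which is
// standard C++, instead of the VLA `float tmp[cols];`.
void sum_rows(const float* in, float* out, std::size_t rows, std::size_t cols) {
  std::vector<float> tmp(cols, 0.0f);  // replaces: float tmp[cols];
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      tmp[c] += in[r * cols + c];
    }
  }
  for (std::size_t c = 0; c < cols; ++c) {
    out[c] = tmp[c];
  }
}
```

This trades a stack allocation for a heap allocation and does not reproduce the 64-byte alignment requested by `LIBXSMM_ALIGNED`, so it may not be a drop-in replacement in hot loops.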

Author

xuhancn commented Dec 14, 2023

Crash 2 on Clang-18, File: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /usr/local/bin/clang-18 -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_FP16_CPU_DEFINITION -DHAVE_AMX_CPU_DEFINITION -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_VNNI_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:86:32: Generating code for declaration 'torch_ipex::tpp::fused_self_attention_bwd_unpad'
4.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:90:40: LLVM IR generation of compound statement ('{}')
5.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:88:1: LLVM IR generation of compound statement ('{}')
6.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:528:3: LLVM IR generation of compound statement ('{}')
7.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:532:5: LLVM IR generation of compound statement ('{}')
8.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:535:7: LLVM IR generation of compound statement ('{}')
 #0 0x000055baaa86d52f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-18+0x372552f)
 #1 0x000055baaa86b57c llvm::sys::CleanupOnSignal(unsigned long) (/usr/local/bin/clang-18+0x372357c)
 #2 0x000055baaa7b22c8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007fdc92b01420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x000055baa857e8f7 llvm::IRBuilderBase::CreateMul(llvm::Value*, llvm::Value*, llvm::Twine const&, bool, bool) (/usr/local/bin/clang-18+0x14368f7)
 #5 0x000055baaaf4a155 clang::CodeGen::CodeGenFunction::EmitArraySubscriptExpr(clang::ArraySubscriptExpr const*, bool) (/usr/local/bin/clang-18+0x3e02155)
 #6 0x000055baaaf42b07 clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) (/usr/local/bin/clang-18+0x3dfab07)
 #7 0x000055baaaf44bf8 clang::CodeGen::CodeGenFunction::EmitArrayToPointerDecay(clang::Expr const*, clang::CodeGen::LValueBaseInfo*, clang::CodeGen::TBAAAccessInfo*) (/usr/local/bin/clang-18+0x3dfcbf8)
 #8 0x000055baaaf97cf8 (anonymous namespace)::ScalarExprEmitter::VisitCastExpr(clang::CastExpr*) CGExprScalar.cpp:0:0
 #9 0x000055baaaf8e363 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) CGExprScalar.cpp:0:0
#10 0x000055baaaf8f587 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/local/bin/clang-18+0x3e47587)
#11 0x000055baaaf32c2e clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/local/bin/clang-18+0x3deac2e)
#12 0x000055baaaf33357 clang::CodeGen::CodeGenFunction::EmitAnyExprToTemp(clang::Expr const*) (/usr/local/bin/clang-18+0x3deb357)
#13 0x000055baaaec1a8b clang::CodeGen::CodeGenFunction::EmitCallArg(clang::CodeGen::CallArgList&, clang::Expr const*, clang::QualType) (/usr/local/bin/clang-18+0x3d79a8b)
#14 0x000055baaaeca583 clang::CodeGen::CodeGenFunction::EmitCallArgs(clang::CodeGen::CallArgList&, clang::CodeGen::CodeGenFunction::PrototypeWrapper, llvm::iterator_range<clang::Stmt::CastIterator<clang::Expr, clang::Expr const* const, clang::Stmt const* const>>, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) (/usr/local/bin/clang-18+0x3d82583)
#15 0x000055baaaf3af8e clang::CodeGen::CodeGenFunction::EmitCall(clang::QualType, clang::CodeGen::CGCallee const&, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot, llvm::Value*) (/usr/local/bin/clang-18+0x3df2f8e)
#16 0x000055baaaf4fd44 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/usr/local/bin/clang-18+0x3e07d44)
#17 0x000055baaaf98a08 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) CGExprScalar.cpp:0:0
#18 0x000055baaaf8d420 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) CGExprScalar.cpp:0:0
#19 0x000055baaaf8f587 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/local/bin/clang-18+0x3e47587)
#20 0x000055baaaf32c2e clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/local/bin/clang-18+0x3deac2e)
#21 0x000055baaaf4dd13 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) (/usr/local/bin/clang-18+0x3e05d13)
#22 0x000055baaab708c3 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a288c3)
#23 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#24 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#25 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#26 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#27 0x000055baaabac283 void clang::CodeGen::RegionCodeGenTy::CallbackFn<clang::CodeGen::CodeGenFunction::EmitOMPParallelDirective(clang::OMPParallelDirective const&)::'lambda2'(clang::CodeGen::CodeGenFunction&, clang::CodeGen::PrePostActionTy&)>(long, clang::CodeGen::CodeGenFunction&, clang::CodeGen::PrePostActionTy&) CGStmtOpenMP.cpp:0:0
#28 0x000055baab025e8b clang::CodeGen::RegionCodeGenTy::operator()(clang::CodeGen::CodeGenFunction&) const (/usr/local/bin/clang-18+0x3edde8b)
#29 0x000055baab02d4d7 (anonymous namespace)::CGOpenMPRegionInfo::EmitBody(clang::CodeGen::CodeGenFunction&, clang::Stmt const*) CGOpenMPRuntime.cpp:0:0
#30 0x000055baaabc2e6b clang::CodeGen::CodeGenFunction::GenerateOpenMPCapturedStmtFunction(clang::CapturedStmt const&, clang::SourceLocation) (/usr/local/bin/clang-18+0x3a7ae6b)
#31 0x000055baab05a87d emitParallelOrTeamsOutlinedFunction(clang::CodeGen::CodeGenModule&, clang::OMPExecutableDirective const&, clang::CapturedStmt const*, clang::VarDecl const*, llvm::omp::Directive, llvm::StringRef, clang::CodeGen::RegionCodeGenTy const&) CGOpenMPRuntime.cpp:0:0
#32 0x000055baab05ab86 clang::CodeGen::CGOpenMPRuntime::emitParallelOutlinedFunction(clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, clang::VarDecl const*, llvm::omp::Directive, clang::CodeGen::RegionCodeGenTy const&) (/usr/local/bin/clang-18+0x3f12b86)
#33 0x000055baaaba5c78 emitCommonOMPParallelDirective(clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, llvm::omp::Directive, clang::CodeGen::RegionCodeGenTy const&, llvm::function_ref<void (clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, llvm::SmallVectorImpl<llvm::Value*>&)> const&) CGStmtOpenMP.cpp:0:0
#34 0x000055baaaba7097 clang::CodeGen::CodeGenFunction::EmitOMPParallelDirective(clang::OMPParallelDirective const&) (/usr/local/bin/clang-18+0x3a5f097)
#35 0x000055baaab70b75 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28b75)
#36 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#37 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#38 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#39 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#40 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#41 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#42 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#43 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#44 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#45 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#46 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#47 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#48 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#49 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#50 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#51 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#52 0x000055baaab76261 clang::CodeGen::CodeGenFunction::EmitIfStmt(clang::IfStmt const&) (/usr/local/bin/clang-18+0x3a2e261)
#53 0x000055baaab70e89 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28e89)
#54 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#55 0x000055baaabe39ad clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/local/bin/clang-18+0x3a9b9ad)
#56 0x000055baaac3e5bd clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang-18+0x3af65bd)
#57 0x000055baaac39fd5 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang-18+0x3af1fd5)
#58 0x000055baaac43bf6 clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbbf6)
#59 0x000055baaac43c0e clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbc0e)
#60 0x000055baaac43c0e clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbc0e)
#61 0x000055baaac461e6 clang::CodeGen::CodeGenModule::Release() (/usr/local/bin/clang-18+0x3afe1e6)
#62 0x000055baab0abf72 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0
#63 0x000055baab0aab04 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-18+0x3f62b04)
#64 0x000055baacb94cf9 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-18+0x5a4ccf9)
#65 0x000055baab0aa135 clang::CodeGenAction::ExecuteAction() (/usr/local/bin/clang-18+0x3f62135)
#66 0x000055baab33a4c1 clang::FrontendAction::Execute() (/usr/local/bin/clang-18+0x41f24c1)
#67 0x000055baab2b4aeb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-18+0x416caeb)
#68 0x000055baab418b5b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-18+0x42d0b5b)
#69 0x000055baa801279d cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-18+0xeca79d)
#70 0x000055baa800b0ad ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#71 0x000055baab0f2d3d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#72 0x000055baaa7b2747 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang-18+0x366a747)
#73 0x000055baab0f31dc clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#74 0x000055baab0b9d8e clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang-18+0x3f71d8e)
#75 0x000055baab0ba75d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang-18+0x3f7275d)
#76 0x000055baab0c4bdc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang-18+0x3f7cbdc)
#77 0x000055baa800fac1 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang-18+0xec7ac1)
#78 0x000055baa7f171b5 main (/usr/local/bin/clang-18+0xdcf1b5)
#79 0x00007fdc92506083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#80 0x000055baa800a86e _start (/usr/local/bin/clang-18+0xec286e)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 18.0.0git (https://github.com/llvm/llvm-project.git a2691e363232c011fdaace9fcc094f3cd210f78b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
5 warnings generated.
clang-18: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/fused_bert-63a747.cpp
clang-18: note: diagnostic msg: /tmp/fused_bert-63a747.sh
clang-18: note: diagnostic msg:

********************

fused_bert-63a747.zip

Author

xuhancn commented Dec 14, 2023

@asl I have cloned the latest LLVM code and built the compiler. The issue still occurs in clang-18.

jyu2-git added the clang and openmp labels Dec 27, 2023
Member

llvmbot commented Dec 27, 2023

@llvm/issue-subscribers-openmp

Author: Xu Han (xuhancn)

Build cmd:

cd /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu && /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations 
-Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format -Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp

The error msg:

Stack dump:
0.      Program arguments: /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
1.      <eof> parser at end of file
2.      Per-function optimization
3.      Running pass 'Early CSE' on function '@.omp_outlined.'
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x1f)[0x7fa30e5db4ff]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x50)[0x7fa30e5d97b0]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15CleanupOnSignalEm+0xdd)[0x7fa30e5dac4d]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x8d6e60)[0x7fa30e530e60]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fa314daa420]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x173ee23)[0x7fa30f398e23]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm19SimplifyInstructionEPNS_11InstructionERKNS_13SimplifyQueryEPNS_25OptimizationRemarkEmitterE+0x819)[0x7fa30f3a3d09]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x13700b2)[0x7fa30efca0b2]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x13759e4)[0x7fa30efcf9e4]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE+0x466)[0x7fa30e6e0d76]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm6legacy23FunctionPassManagerImpl3runERNS_8FunctionE+0x4e)[0x7fa30e6e049e]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm6legacy19FunctionPassManager3runERNS_8FunctionE+0x156)[0x7fa30e6e0436]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang17EmitBackendOutputERNS_17DiagnosticsEngineERKNS_19HeaderSearchOptionsERKNS_14CodeGenOptionsERKNS_13TargetOptionsERKNS_11LangOptionsERKN4llvm10DataLayoutEPNSE_6ModuleENS_13BackendActionESt10unique_ptrINSE_17raw_pwrite_streamESt14default_deleteISM_EE+0x305b)[0x7fa3136d631b]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(+0x1667e1c)[0x7fa313955e1c]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang8ParseASTERNS_4SemaEbb+0x283)[0x7fa312b43c13]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang14FrontendAction7ExecuteEv+0x48)[0x7fa313fb9e58]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang16CompilerInstance13ExecuteActionERNS_14FrontendActionE+0x621)[0x7fa313f728a1]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang25ExecuteCompilerInvocationEPNS_16CompilerInstanceE+0x66f)[0x7fa31401ddaf]
/usr/bin/clang++(_Z8cc1_mainN4llvm8ArrayRefIPKcEES2_Pv+0x98d)[0x41229d]
/usr/bin/clang++[0x4105b1]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(+0x19d58f2)[0x7fa313cc38f2]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm20CrashRecoveryContext9RunSafelyENS_12function_refIFvvEEE+0xd7)[0x7fa30e530c67]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver10CC1Command7ExecuteEN4llvm8ArrayRefINS2_8OptionalINS2_9StringRefEEEEEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPb+0x13f)[0x7fa313cc2e2f]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver11Compilation14ExecuteCommandERKNS0_7CommandERPS3_+0x2df)[0x7fa313c9b52f]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver11Compilation11ExecuteJobsERKNS0_7JobListERN4llvm15SmallVectorImplISt4pairIiPKNS0_7CommandEEEE+0x7a)[0x7fa313c9b6da]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang6driver6Driver18ExecuteCompilationERNS0_11CompilationERN4llvm15SmallVectorImplISt4pairIiPKNS0_7CommandEEEE+0xdc)[0x7fa313cae93c]
/usr/bin/clang++(main+0x259f)[0x41002f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fa30d73e083]
/usr/bin/clang++(_start+0x2e)[0x40d7ce]
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/fused_bert-089380.cpp
clang: note: diagnostic msg: /tmp/fused_bert-089380.sh
clang: note: diagnostic msg:

********************

Original code: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp (it builds successfully with GCC).
The diagnostic message files are also attached:
fused_bert-089380.zip

EugeneZelenko removed the clang label Dec 27, 2023
jyu2-git added the clang label Dec 27, 2023
Contributor

AngryLoki commented Jan 31, 2024

For those who are searching for a workaround, mass-replacing the VLA pointer declarations with wrapped arrays helps:


#include <array>
#include <cstddef>
#include <utility>

template <size_t N, typename T> class BlockedArray;

template <typename T> class BlockedArray<1, T> {
public:
  T *data;
  size_t size;
  constexpr BlockedArray(T *data, std::array<size_t, 1> sizes)
      : data{data}, size{sizes[0]} {}
  constexpr auto operator[](int n) { return &data[size * n]; }
  constexpr explicit operator bool() const { return data != nullptr; }
  constexpr auto operator*() const { return data; }
};

template <size_t N, typename T> class BlockedArray {
public:
  T *data;
  std::array<size_t, N> sizes;
  constexpr BlockedArray(T *data, std::array<size_t, N> sizes)
      : data{data}, sizes{sizes} {}
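  constexpr auto operator[](int n) const {
    // sizes[] holds the innermost extent first, so one step of the outermost
    // index spans the product of all stored extents (as with T (*p)[A][B]).
    size_t stride = 1;
    for (size_t i = 0; i < N; ++i) stride *= sizes[i];
    auto head = head_aux(std::make_index_sequence<N - 1>());
    return BlockedArray<N - 1, T>{&data[stride * n], head};
  }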
  constexpr explicit operator bool() const { return data != nullptr; }
  template <size_t... I>
  constexpr auto head_aux(std::index_sequence<I...>) const {
    return std::array<size_t, N - 1>{sizes[I]...};
  }
  constexpr auto operator*() const { return operator[](0); }
};

template <size_t N = 0> struct BlockedArrayDims {
  constexpr BlockedArrayDims(std::array<size_t, N> items) : items(items) {}
  template <size_t... I>
  constexpr auto items_prepend(size_t t, std::index_sequence<I...>) const {
    return std::array<size_t, N + 1>{t, items[I]...};
  }

public:
  std::array<size_t, N> items;
  constexpr BlockedArrayDims() = default;

  constexpr BlockedArrayDims<N + 1> operator[](size_t t) const {
    return items_prepend(t, std::make_index_sequence<N>());
  }
};

#define DECL_VLA_PTR_PT(type, name, dims, t)                                   \
  auto name = BlockedArray((type *)t, BlockedArrayDims() dims.items)
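
For readers who only need the indexing and not the full `[]`-chaining wrapper, the same idea can be sketched more compactly: keep the runtime extents next to the raw pointer and compute the row-major offset by hand, so no variably modified type appears in the source. This is a minimal sketch under assumptions, not IPEX code: `FlatView3D`, `d1`, `d2` and `offset` are hypothetical names, the view is fixed to three indices, and it has not been tested against the crashing clang 10 build.

```cpp
#include <cstddef>

// Hypothetical helper, not part of IPEX or the wrapper above.
// Non-owning view of a flat buffer addressed like T (*p)[d1][d2],
// where d1 and d2 are only known at run time.
template <typename T>
struct FlatView3D {
  T *data;
  std::size_t d1;  // extent of the middle dimension
  std::size_t d2;  // extent of the innermost dimension

  // Row-major offset of element [i][j][k]: (i * d1 + j) * d2 + k.
  constexpr std::size_t offset(std::size_t i, std::size_t j,
                               std::size_t k) const {
    return (i * d1 + j) * d2 + k;
  }

  // Element access; replaces p[i][j][k] with v(i, j, k).
  T &operator()(std::size_t i, std::size_t j, std::size_t k) const {
    return data[offset(i, j, k)];
  }
};
```

With this, `FlatView3D<float> v{ptr, A, B};` followed by `v(i, j, k)` would stand in for `p[i][j][k]` on a `float (*p)[A][B]`. The trade-off is that call sites switch from `[]` to `()`, which is why the macro-compatible wrapper above may still be the easier mass replacement.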

Update: found a cleaner solution:

void f(void *a, long n) {
    // this causes crash
    // auto b = reinterpret_cast<float (*)[n]>(a);

    // but this works!
    using array_type = float (*)[n];
    array_type b = reinterpret_cast<array_type>(a);

#pragma omp parallel
    b[0];
}

@EugeneZelenko EugeneZelenko removed clang Clang issues not falling into any other category openmp labels Jan 31, 2024