Skip to content

[SYCL-MLIR][RODINIA]: Rodinia performance benchmarks for SYCL #7641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
etiotto opened this issue Dec 5, 2022 · 23 comments
Open

[SYCL-MLIR][RODINIA]: Rodinia performance benchmarks for SYCL #7641

etiotto opened this issue Dec 5, 2022 · 23 comments
Assignees
Labels
enhancement New feature or request sycl-mlir Pull requests or issues for sycl-mlir branch

Comments

@etiotto
Copy link

etiotto commented Dec 5, 2022

The Rodinia benchmarks have been ported to SYCL and are available publicly at: https://github.com/zjin-lcf/Rodinia_SYCL.
I have forked the repository and modified the Makefiles to be able to compile the benchmarks with the Intel clang++ SYCL compiler and with the SYCL-MLIR compiler. The fork with the required changes (use branch sycl_mlir) is at: https://github.com/etiotto/Rodinia_SYCL/tree/sycl_mlir.

We can initially focus on 5 bmks: backprop, gaussian, particlefilter, streamcluster and lud. These 5 bmks compile and run cleanly using the clang++ SYCL Intel compiler, and gaussian also compiles and runs when the SYCL-MLIR/cgeist compiler is used. The remaining 4 bmks fail to compile, with the same symptom:

/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall -DBLOCK_SIZE=256  -O3 ex_particle_SYCL_single_seq.cpp -c
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w ex_particle_SYCL_single_seq.cpp -o /tmp/ex_particle_SYCL_single_seq-d3a65b.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/ex_particle_SYCL_single_seq-header-af3a7d.h -fsycl-int-footer=/tmp/ex_particle_SYCL_single_seq-footer-6eb86f.h -sycl-std=2020 -fsycl-unique-prefix=9204d406c7a65583 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name ex_particle_SYCL_single_seq.cpp -fsycl-use-main-file-name -full-main-file-name ex_particle_SYCL_single_seq.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -D BLOCK_SIZE=256 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/particlefilter -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/ex_particle_SYCL_single_seq-d3a65b.bc -x c++ ex_particle_SYCL_single_seq.cpp
1.      <eof> parser at end of file
 #0 0x00007fa77cd73f53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4f53)
 #1 0x00007fa77cd721c0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d31c0)
 #2 0x00007fa77cd7443f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007fa793942b20 __restore_rt sigaction.c:0:0
 #4 0x00007fa785e75233 mlir::MemRefType::getElementType() const (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libMLIRIR.so.16git+0xc9233)
 #5 0x00000000004457d2 isSYCLInheritType(mlir::Type&, mlir::Value&) driver.cc:0:0
 #6 0x00000000004332f1 MLIRScanner::GetAddressOfBaseClass(mlir::Value, clang::CXXRecordDecl const*, llvm::ArrayRef<clang::Type const*>, llvm::ArrayRef<bool>) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4332f1)
 #7 0x0000000000430465 MLIRScanner::init(mlir::FunctionOpInterface, FunctionToEmit const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x430465)
 #8 0x000000000044a1b5 MLIRASTConsumer::run() (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x44a1b5)
 #9 0x00007fa776ae1da0 clang::ParseAST(clang::Sema&, bool, bool) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/../lib/libclangParse.so.16git+0x38da0)
#10 0x00007fa782d53221 clang::FrontendAction::Execute() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libclangFrontend.so.16git+0x12c221)
#11 0x00000000004549a8 processInputFiles(llvm::cl::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, llvm::cl::parser<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, llvm::cl::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, llvm::cl::parser<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, mlir::MLIRContext&, mlir::OwningOpRef<mlir::ModuleOp>&, llvm::DataLayout&, llvm::Triple&, char const*, bool) driver.cc:0:0
#12 0x000000000044fb2e main (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x44fb2e)
#13 0x00007fa77bece493 __libc_start_main (/lib64/libc.so.6+0x23493)
#14 0x000000000042d32e _start (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x42d32e)
clang-16: error: unable to execute command: Segmentation fault (core dumped)
clang-16: error: cgeist command failed due to signal (use -v to see invocation)
clang version 16.0.0 (https://github.com/etiotto/intel-llvm.git 1bc16875da18bf80d183cb59744e95f9273faa51)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /iusers/etiotto/intel-llvm/build/bin
clang-16: error: unable to execute command: Segmentation fault (core dumped)
clang-16: note: diagnostic msg: Error generating preprocessed source(s).
make: *** [Makefile:58: ex_particle_SYCL_single_seq.o] Error 1
######## Finish particlefilter #########
@etiotto etiotto added the enhancement New feature or request label Dec 5, 2022
@etiotto etiotto self-assigned this Dec 5, 2022
@etiotto
Copy link
Author

etiotto commented Dec 5, 2022

Reduced test case for the problem illustrated in the description. Indexing array "A" works fine, indexing "shadow" is problematic.
The difference is that "A" is an accessor for a global buffer while "shadow" is an accessor to local memory.

#include <iostream>
#include <sycl/sycl.hpp>

using namespace sycl;

#define BLOCK_SIZE 16

int test(std::array<float, BLOCK_SIZE> &A) {
  auto q = queue{};
  auto range = sycl::range<1>{BLOCK_SIZE};
  auto bufA = buffer<float, 1>{A.data(), range};

  q.submit([&](handler &cgh) {
    accessor<float, 1, access::mode::read_write, access::target::local> shadow(
        BLOCK_SIZE, cgh);
    accessor<float, 1, access::mode::read_write, access::target::global_buffer>
        A(bufA, cgh);

    cgh.parallel_for<class diagonal>(range,
                                     [=](id<1> id) { A[0] = shadow[0] = 0; });
  });
}

@AlexeySachkov AlexeySachkov added the sycl-mlir Pull requests or issues for sycl-mlir branch label Dec 6, 2022
@etiotto
Copy link
Author

etiotto commented Dec 7, 2022

PRs #7663 and #7684 resolve the problem illustrated at #7641 (comment)

@etiotto
Copy link
Author

etiotto commented Dec 7, 2022

Next problem affecting streamcluster:

######## Start streamcluster #########
rm -f streamcluster streamcluster.linkinfo result*
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall   -O3 streamcluster.cpp -o streamcluster -lm 
cgeist: /nfs/site/home/etiotto/projects/intel-llvm/polygeist/tools/cgeist/Lib/ValueCategory.cc:187: void ValueCategory::store(mlir::OpBuilder &, mlir::Value) const: Assertion `toStore.getType() == val.getType().cast<MemRefType>().getElementType() && "expect same type"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w streamcluster.cpp -o /tmp/streamcluster-c49eb6.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/streamcluster-header-e64163.h -fsycl-int-footer=/tmp/streamcluster-footer-1a9696.h -sycl-std=2020 -fsycl-unique-prefix=27d026f783b3cce6 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name streamcluster.cpp -fsycl-use-main-file-name -full-main-file-name streamcluster.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/streamcluster -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/streamcluster-c49eb6.bc -x c++ streamcluster.cpp
1.      <eof> parser at end of file
 #0 0x00007f91d63fec53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4c53)
 #1 0x00007f91d63fcec0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d2ec0)
 #2 0x00007f91d63ff13f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f91ecfe0b20 __restore_rt sigaction.c:0:0
 #4 0x00007f91d556d37f raise (/lib64/libc.so.6+0x3737f)
 #5 0x00007f91d5557db5 abort (/lib64/libc.so.6+0x21db5)
 #6 0x00007f91d5557c89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 #7 0x00007f91d5565a76 .annobin___GI___assert_fail.end assert.c:0:0
 #8 0x00000000004d4e9c ValueCategory::store(mlir::OpBuilder&, mlir::Value) const (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4d4e9c)
 #9 0x00000000004a85fd MLIRScanner::VisitBinAssign(clang::BinaryOperator*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4a85fd)
#10 0x000000000043524c clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x43524c)
#11 0x00000000004bd9c9 MLIRScanner::VisitCompoundStmt(clang::CompoundStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd9c9)
#12 0x0000000000435517 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x435517)
#13 0x0000000000431547 MLIRScanner::init(mlir::FunctionOpInterface, FunctionToEmit const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x431547)
#14 0x000000000044aa55 MLIRASTConsumer::run() (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x44aa55)
#15 0x00007f91d016adb0 clang::ParseAST(clang::Sema&, bool, bool) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/../lib/libclangParse.so.16git+0x38db0)
#16 0x00007f91dc3d8f61 clang::FrontendAction::Execute() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libclangFrontend.so.16git+0x12bf61)
#17 0x0000000000455248 processInputFiles(llvm::cl::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, llvm::cl::parser<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, llvm::cl::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, llvm::cl::parser<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, mlir::MLIRContext&, mlir::OwningOpRef<mlir::ModuleOp>&, llvm::DataLayout&, llvm::Triple&, char const*, bool) driver.cc:0:0
#18 0x00000000004503ce main (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4503ce)
#19 0x00007f91d5559493 __libc_start_main (/lib64/libc.so.6+0x23493)
#20 0x000000000042da8e _start (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x42da8e)

@etiotto
Copy link
Author

etiotto commented Dec 7, 2022

(Fixed)
Next problem affecting backprop:

######## Start backprop #########
rm -f *.o *~ backprop *.linkinfo
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall  -O3 backprop_sycl.cpp -c
cgeist: /nfs/site/home/etiotto/projects/intel-llvm/polygeist/tools/cgeist/Lib/CGCall.cc:89: void castCallerArgs(func::FuncOp, llvm::SmallVectorImpl<Value> &, mlir::OpBuilder &): Assertion `CalleeArgType == Args[I].getType() && "Callsite argument mismatch"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w backprop_sycl.cpp -o /tmp/backprop_sycl-d104c1.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/backprop_sycl-header-6a398c.h -fsycl-int-footer=/tmp/backprop_sycl-footer-aeaaf3.h -sycl-std=2020 -fsycl-unique-prefix=7b14c76bdff44000 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name backprop_sycl.cpp -fsycl-use-main-file-name -full-main-file-name backprop_sycl.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-
linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/backprop -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/backprop_sycl-d104c1.bc -x c++ backprop_sycl.cpp
1.      <eof> parser at end of file
 #0 0x00007f4e83bbec53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4c53)
 #1 0x00007f4e83bbcec0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d2ec0)
 #2 0x00007f4e83bbf13f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f4e9a7a0b20 __restore_rt sigaction.c:0:0
 #4 0x00007f4e82d2d37f raise (/lib64/libc.so.6+0x3737f)
 #5 0x00007f4e82d17db5 abort (/lib64/libc.so.6+0x21db5)
 #6 0x00007f4e82d17c89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 #7 0x00007f4e82d25a76 .annobin___GI___assert_fail.end assert.c:0:0
 #8 0x000000000047ce67 MLIRScanner::callHelper(mlir::func::FuncOp, clang::QualType, llvm::ArrayRef<std::pair<ValueCategory, clang::Expr*>>, clang::QualType, bool, clang::Expr*, clang::FunctionDecl const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x47ce67)
 #9 0x0000000000483a6a MLIRScanner::VisitCallExpr(clang::CallExpr*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x483a6a)
#10 0x00000000004352c1 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4352c1)
#11 0x0000000000439507 MLIRScanner::VisitVarDecl(clang::VarDecl*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x439507)
#12 0x00000000004bd68e MLIRScanner::VisitDeclStmt(clang::DeclStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd68e)
#13 0x0000000000435531 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x435531)
#14 0x00000000004bd9c9 MLIRScanner::VisitCompoundStmt(clang::CompoundStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd9c9)
#15 0x0000000000435517 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x435517)
#16 0x0000000000431547 MLIRScanner::init(mlir::FunctionOpInterface, FunctionToEmit const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x431547)
#17 0x000000000044aa55 MLIRASTConsumer::run() (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x44aa55)

Possible duplicate of #7662

@etiotto
Copy link
Author

etiotto commented Dec 7, 2022

EDIT: fixed by #7752

Next problem affecting particlefilter:

######## Start particlefilter #########
rm -f *.o *~ SYCL_particlefilter_single 
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall -DBLOCK_SIZE=256  -O3 ex_particle_SYCL_single_seq.cpp -c
cgeist: /nfs/site/home/etiotto/projects/intel-llvm/polygeist/tools/cgeist/Lib/ValueCategory.cc:585: ValueCategory FPBinOp(mlir::OpBuilder &, mlir::Location, mlir::Value, mlir::Value) [OpTy = mlir::arith::AddFOp]: Assertion `LHS.getType() == RHS.getType() && "Cannot operate on values of different types"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w ex_particle_SYCL_single_seq.cpp -o /tmp/ex_particle_SYCL_single_seq-e9151e.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/ex_particle_SYCL_single_seq-header-4a99c8.h -fsycl-int-footer=/tmp/ex_particle_SYCL_single_seq-footer-7c9355.h -sycl-std=2020 -fsycl-unique-prefix=78d00ad7068752e7 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name ex_particle_SYCL_single_seq.cpp -fsycl-use-main-file-name -full-main-file-name ex_particle_SYCL_single_seq.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -D BLOCK_SIZE=256 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/particlefilter -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/ex_particle_SYCL_single_seq-e9151e.bc -x c++ ex_particle_SYCL_single_seq.cpp
1.      <eof> parser at end of file
 #0 0x00007fac3bec1c53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4c53)
 #1 0x00007fac3bebfec0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d2ec0)
 #2 0x00007fac3bec213f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007fac52aa3b20 __restore_rt sigaction.c:0:0
 #4 0x00007fac3b03037f raise (/lib64/libc.so.6+0x3737f)
 #5 0x00007fac3b01adb5 abort (/lib64/libc.so.6+0x21db5)
 #6 0x00007fac3b01ac89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 #7 0x00007fac3b028a76 .annobin___GI___assert_fail.end assert.c:0:0
 #8 0x00000000004dbc03 ValueCategory::FAdd(mlir::OpBuilder&, mlir::Location, mlir::Value) const (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4dbc03)
 #9 0x00000000004a8bad MLIRScanner::EmitBinAdd(BinOpInfo const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4a8bad)
#10 0x00000000004abcab MLIRScanner::EmitCompoundAssignLValue(clang::CompoundAssignOperator*, ValueCategory (MLIRScanner::*)(BinOpInfo const&)) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4abcab)
#11 0x00000000004acfd4 MLIRScanner::VisitBinAddAssign(clang::BinaryOperator*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4acfd4)
#12 0x00000000004352a7 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4352a7)
#13 0x00000000004bd9c9 MLIRScanner::VisitCompoundStmt(clang::CompoundStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd9c9)

@etiotto
Copy link
Author

etiotto commented Dec 12, 2022

In order to reproduce clone https://github.com/etiotto/Rodinia_SYCL/tree/sycl_mlir ( branch sycl_mlir). Edit the script sycl/run_all.sh to replace the clang compiler you want to use, then use it to run the bmks.

@etiotto
Copy link
Author

etiotto commented Dec 12, 2022

Using the latest SYCL-MLIR compiler backprop now compiles and run:

 ./run_all.sh                                                                               ✔  15s  etiotto@hds-clx-7 
######## Start backprop #########
rm -f *.o *~ backprop *.linkinfo
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall  -O3 backprop_sycl.cpp -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall  -O3 backprop.c -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall  -O3 imagenet.c -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall  -O3 facetrain.c -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall  -O3 backprop_sycl.o backprop.o imagenet.o facetrain.o -o backprop -lm 
./backprop 65536
Random number generator seed: 7
Input layer size : 65536
Starting training kernel
Performing GPU computation
Device offloading time = 0.638426(s)
...

@whitneywhtsang
Copy link
Contributor

I will start to look at the streamcluster issue reported here: #7641 (comment).

@etiotto
Copy link
Author

etiotto commented Dec 12, 2022

I'm looking at the other bmk that fails with an assert (particlefilter). Reduced test case:

#include <CL/sycl.hpp>

using namespace cl::sycl;
constexpr access::mode sycl_read_write = access::mode::read_write;

void likelyhood(int Nparticles) {
  cpu_selector dev_sel;
  queue q(dev_sel);
  const property_list props = property::buffer::use_host_ptr();
  float * arrayX = (float *) calloc(Nparticles, sizeof (float));
  buffer<float, 1>arrayX_GPU(arrayX, Nparticles, props);

  /****************** L I K E L I H O O D ************************************/
  q.submit([&](handler& cgh) {
    auto arrayX_acc = arrayX_GPU.get_access<sycl_read_write>(cgh);

    cgh.parallel_for<class likelihood>(
//      nd_range<1>(range<1>(global_work_size), range<1>(local_work_size)), [=] (nd_item<1> item) {
        nd_range<1>(range<1>(10), range<1>(20)), [=] (nd_item<1> item) {
        int i = item.get_global_linear_id();
        #ifdef BAD
                arrayX_acc[i] += 1.0; 
        #else
          arrayX_acc[i] = arrayX_acc[i] + 1.0; 
        #endif
      });
  });
}

Noting that if arrayX_acc[i] += 1.0; is changed to arrayX_acc[i] = arrayX_acc[i] + 1.0; the assertion disappears.

@victor-eds
Copy link
Contributor

I will take on lud

@victor-eds
Copy link
Contributor

I'm looking at the other bmk that fails with an assert (particlefilter). Reduced test case:

#include <CL/sycl.hpp>

using namespace cl::sycl;
constexpr access::mode sycl_read_write = access::mode::read_write;

void likelyhood(int Nparticles) {
  cpu_selector dev_sel;
  queue q(dev_sel);
  const property_list props = property::buffer::use_host_ptr();
  float * arrayX = (float *) calloc(Nparticles, sizeof (float));
  buffer<float, 1>arrayX_GPU(arrayX, Nparticles, props);

  /****************** L I K E L I H O O D ************************************/
  q.submit([&](handler& cgh) {
    auto arrayX_acc = arrayX_GPU.get_access<sycl_read_write>(cgh);

    cgh.parallel_for<class likelihood>(
//      nd_range<1>(range<1>(global_work_size), range<1>(local_work_size)), [=] (nd_item<1> item) {
        nd_range<1>(range<1>(10), range<1>(20)), [=] (nd_item<1> item) {
        int i = item.get_global_linear_id();
        #ifdef BAD
                arrayX_acc[i] += 1.0; 
        #else
          arrayX_acc[i] = arrayX_acc[i] + 1.0; 
        #endif
      });
  });
}

Noting that if arrayX_acc[i] += 1.0; is changed to arrayX_acc[i] = arrayX_acc[i] + 1.0; the assertion disappears.

#7760 fixes this

@victor-eds
Copy link
Contributor

victor-eds commented Dec 13, 2022

lud fixed by #7760 also -> compiles, but fails.

etiotto added a commit that referenced this issue Dec 13, 2022
This PR fixes the assertion described in issue
#7641 in comment
#7641 (comment)

A new test case has been added to prevent regressions. 

Signed-off-by: Tiotto, Ettore <ettore.tiotto@intel.com>
@etiotto
Copy link
Author

etiotto commented Dec 13, 2022

lud fixed by #7760 also -> compiles, but fails.

AFAIK lud was compiling without #7760 and was failing to verify at runtime.

@etiotto
Copy link
Author

etiotto commented Dec 13, 2022

Status:

  • compile & run: backprop, gaussian
  • do not compile: particlefilter, streamcluster
  • compile, fail at runtime: lud

@etiotto
Copy link
Author

etiotto commented Dec 13, 2022

New reduced test case for particlefilter:

#include <sycl/sycl.hpp>

using namespace sycl;
constexpr access::mode sycl_write = access::mode::write;
constexpr access::mode sycl_read = access::mode::read;
#ifndef TY
#define TY unsigned char
#endif

void likelyhood(int Nparticles, int IszX, int IszY, int Nfr, int countOnes) {
  queue q;
  buffer<float, 1> A(Nparticles + 1);
  buffer<TY, 1> B(Nparticles + 1);

  q.submit([&](handler &cgh) {
    auto A_acc = A.get_access<sycl_write>(cgh);
    auto B_acc = B.get_access<sycl_read>(cgh);

    cgh.parallel_for<class likelihood>(range<1>(10), [=](item<1> Item) {
      id<1> Id = Item.get_id();
      A_acc[Id] = B_acc[Id];
    });
  });
}

Fails with the following error (noting also that when TY is not char or unsigned char but, for example, is a short the error disappears):

clang++  -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir  -std=c++17 -Wall -DBLOCK_SIZE=256  -O3 ~/projects/tmp/ex_particle_reduced.cpp -w -c 2>&1 | grep error
error: 'sycl.accessor.subscript' op Expecting memref return type. Got '<<NULL TYPE>>'

@victor-eds
Copy link
Contributor

Fails with the following error (noting also that when TY is not char or unsigned char but, for example, is a short the error disappears):

Apparently, compiling pointers to i8 types yields an llvm pointer instead of a memref. We might be having issues with that. IMO, we should generate memref<i8> instead. Related issue: #7767

@etiotto
Copy link
Author

etiotto commented Dec 14, 2022

With fix in PR #7784 particlefilter compiles to the end, and the bmk runs.

@etiotto
Copy link
Author

etiotto commented Dec 14, 2022

Using draft PR #7783 streamcluster no longer asserts in the store operation, the assertion moved to SYCLAccessorSubscriptOp::verify() (because that operation receives a !llvm.ptr<?xstruct....> and it expects a memref). Given that cgeist always represents pointers to struct by using !llvm.ptr, we might want to relax that verification code. @whitneywhtsang will work on that.

I have quickly verified that this is the last compile time issue affecting streamcluster. Commenting out the verification code for SYCLAccessorSubscriptOp yields successful compilation and the bmk runs clean.

etiotto added a commit that referenced this issue Dec 15, 2022
Fixes #7641 (comment) 

Signed-off-by: Tiotto, Ettore <ettore.tiotto@intel.com>
@whitneywhtsang
Copy link
Contributor

Created #7802, for the verification issue mentioned in #7641 (comment).

@Pennycook Pennycook unpinned this issue Dec 15, 2022
@whitneywhtsang
Copy link
Contributor

Status:

  • compile and run successfully: backprop, gaussian, particlefilter, streamcluster
  • compile but fail at runtime: lud

@whitneywhtsang
Copy link
Contributor

whitneywhtsang commented Feb 6, 2023

Performance measurements on Intel(R) Iris(R) Xe Graphics:
t
t

@etiotto
Copy link
Author

etiotto commented Feb 6, 2023

Great to see that at -O3 we are mostly ahead of the default clang (without MLIR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request sycl-mlir Pull requests or issues for sycl-mlir branch
Projects
None yet
Development

No branches or pull requests

5 participants