Using XLC compilers for OCCA kernel launchers leads to std::bad_alloc on Summit. #305

Closed
stgeke opened this issue May 28, 2021 · 0 comments · Fixed by #316
Labels
bug Something isn't working

Comments

@stgeke
Collaborator

stgeke commented May 28, 2021

Backtrace:

(gdb) frame 1
#1  0x00007fff9c081f6c in abort () from /lib64/libc.so.6
(gdb) bt
#0  0x00007fff9c07fbf0 in raise () from /lib64/libc.so.6
#1  0x00007fff9c081f6c in abort () from /lib64/libc.so.6
#2  0x00007fff9c4ad784 in __gnu_cxx::__verbose_terminate_handler ()
    at /sw/summit/gcc/6.4.0/src/gcc-6.4.0/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007fff9c4aa0b4 in __cxxabiv1::__terminate (handler=<optimized out>)
    at /sw/summit/gcc/6.4.0/src/gcc-6.4.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4  0x00007fff9c4aa170 in std::terminate ()
    at /sw/summit/gcc/6.4.0/src/gcc-6.4.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5  0x00007fff9c4aa5b0 in __cxxabiv1::__cxa_throw (obj=0x7fff6c000940, 
    tinfo=0x7fff9c5f6378 <typeinfo for std::bad_alloc>, 
    dest=0x7fff9c4a70f0 <std::bad_alloc::~bad_alloc()>)
    at /sw/summit/gcc/6.4.0/src/gcc-6.4.0/libstdc++-v3/libsupc++/eh_throw.cc:87
#6  0x00007fff9c4aafe4 in operator new (sz=140733193397294)
    at /sw/summit/gcc/6.4.0/src/gcc-6.4.0/libstdc++-v3/libsupc++/new_op.cc:54
#7  0x00007fff9c576a48 in __gnu_cxx::new_allocator<char>::allocate (
    this=<optimized out>, __n=<optimized out>)
    at /sw/summit/gcc/6.4.0/src/objdir/powerpc64le-none-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:104
#8  std::allocator_traits<std::allocator<char> >::allocate (__a=..., 
    __n=<optimized out>)
#9  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create (this=<optimized out>, __capacity=<optimized out>, 
    __old_capacity=<optimized out>)
    at /sw/summit/gcc/6.4.0/src/objdir/powerpc64le-none-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:153
#10 0x00007fff9d8cf5ac in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (this=<optimized out>, 
    __beg=0x7fff9c604f90 <std::string::_Rep::_S_empty_rep_storage+24> "", 
    __end=<optimized out>)
    at /autofs/nccs-svm1_sw/summit/gcc/6.4.0/include/c++/6.4.0/bits/basic_string.tcc:219
#11 0x00007fff9d8cf70c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char*> (__end=<optimized out>, 
    __beg=<optimized out>, this=0x7fffcef45f38)
    at /autofs/nccs-svm1_sw/summit/gcc/6.4.0/include/c++/6.4.0/bits/basic_string.h:196
#12 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (__end=<optimized out>, __beg=<optimized out>, 
    this=0x7fffcef45f38)
    at /autofs/nccs-svm1_sw/summit/gcc/6.4.0/include/c++/6.4.0/bits/basic_string.h:215
#13 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__str=..., this=0x7fffcef45f38)
    at /autofs/nccs-svm1_sw/summit/gcc/6.4.0/include/c++/6.4.0/bits/basic_string.h:400
#14 occa::primitive::primitive (p=..., this=0x7fffcef45f30)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/include/occa/types/primitive.hpp:87
#15 occa::kernelArgData::kernelArgData (this=0x7fffcef45f30, value_=...)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/core/kernelArg.cpp:20
#16 0x00007fff92233724 in gather_doubleAdd ()
   from /gpfs/alpine/scratch/malachi/csc262/eddyPeriodic/.cache/occa/cache/c57dc538c19db0ab/launcher_binary
#17 0x00007fff9da909c8 in occa::sys::runFunction(void (*)(...), int, void**) (
    f=<optimized out>, argc=<optimized out>, args=<optimized out>)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/occa/internal/utils/runFunction.cpp_codegen:28
#18 0x00007fff9da71938 in occa::serial::kernel::run (this=0x4e41e380)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/occa/internal/modes/serial/kernel.cpp:54
#19 0x00007fff9d935ab0 in occa::launchedModeKernel_t::launcherRun (
    this=0x4e36a150)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/
#20 0x00007fff9d935bc4 in occa::launchedModeKernel_t::run (
    this=<optimized out>)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/occa/internal/core/launchedKernel.cpp:33
#21 0x00007fff9d8c1e30 in occa::kernel::run (
    this=0x7fff9debd7c8 <ogs::gatherKernel_doubleAdd>)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/core/kernel.cpp:180
#22 0x00007fff9d8c25cc in occa::kernel::operator() (
    this=0x7fff9debd7c8 <ogs::gatherKernel_doubleAdd>, arg1=..., arg2=..., 
    arg3=..., arg4=..., arg5=...)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/occa/src/core/kernelOperators.cpp_codegen:53
#23 0x00007fff9dd092c0 in occaGather (Ngather=<optimized out>, 
    o_gatherStarts=..., o_gatherIds=..., type=<optimized out>, 
    op=<optimized out>, o_v=..., o_gv=...)
    at /ccs/home/malachi/develop/most-recent-nekrs/3rd_party/gslib/ogs/src/ogsGather.cpp:356
#24 0x00007fff9dd0b600 in ogsGatherFinish (o_gv=..., o_v=..., 
    type=0x7fff9de54060 "double", op=0x7fff9de4b1a8 "add", ogs=0x4cd893f8)

The issue seems to be related to our recent update to the latest OCCA version. However, a simple OCCA example seems to work.
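
One possible reading of the backtrace, not confirmed anywhere in this thread: frame #10 runs `std::__cxx11::basic_string` code, yet its `__beg` argument points at `std::string::_Rep::_S_empty_rep_storage`, which belongs to libstdc++'s older copy-on-write string. That combination would be consistent with the launcher binary and the OCCA library being built against different `std::string` ABIs, so the copy constructor reads a garbage length (the `sz` in frame #6). The sketch below is a hypothetical diagnostic, not part of nekRS or OCCA; `stringAbiName` and `toHex` are names made up for illustration. Compiling it once with each compiler and comparing the result of `stringAbiName()` would show whether they agree on `_GLIBCXX_USE_CXX11_ABI`.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: reports which libstdc++ std::string ABI this
// translation unit was compiled with. _GLIBCXX_USE_CXX11_ABI is defined by
// the libstdc++ headers (GCC 5 and later); other standard libraries leave
// it undefined.
inline std::string stringAbiName() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
#  if _GLIBCXX_USE_CXX11_ABI
  return "libstdc++ cxx11 ABI (std::__cxx11::basic_string)";
#  else
  return "libstdc++ pre-cxx11 ABI (copy-on-write std::string)";
#  endif
#else
  return "unknown (not libstdc++, or older than GCC 5)";
#endif
}

// Hypothetical helper: formats a size as lowercase hex so it can be compared
// by eye with the addresses printed in a backtrace.
inline std::string toHex(std::uint64_t value) {
  std::ostringstream os;
  os << "0x" << std::hex << value;
  return os.str();
}
```

For the size in frame #6, `toHex(140733193397294ULL)` gives `0x7fff0000242e`. Its upper bits match the `0x7fff` prefix of the addresses throughout the trace, which hints that the length was read from unrelated memory rather than being a genuine allocation request.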

@stgeke stgeke added the bug Something isn't working label May 28, 2021
@stgeke stgeke pinned this issue May 28, 2021
MalachiTimothyPhillips added a commit to MalachiTimothyPhillips/nekRS that referenced this issue Jun 4, 2021
…her compilation.

This allows the current nekRS/master to run on Summit.
@MalachiTimothyPhillips MalachiTimothyPhillips changed the title Code throws 'std::bad_alloc' on Summit Using XLC compilers for OCCA kernel launchers leads to std::bad_alloc on Summit. Jun 4, 2021
MalachiTimothyPhillips added a commit to MalachiTimothyPhillips/nekRS that referenced this issue Jun 6, 2021
@stgeke stgeke unpinned this issue Jun 7, 2021