ParticleCache_test fails on Fedora 31 #2985

Closed
junghans opened this issue Jul 10, 2019 · 41 comments

Comments

@junghans
Member

23/41 Test #23: ParticleCache_test ...............***Failed    2.65 sec
Running 7 test cases...
Running 7 test cases...
unknown location(0): fatal error: in "update": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag

/builddir/build/BUILD/espresso/src/core/unit_tests/ParticleCache_test.cpp(120): last checkpoint
unknown location(0): fatal error: in "update_with_bonds": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_RANK: invalid rank

/builddir/build/BUILD/espresso/src/core/unit_tests/ParticleCache_test.cpp(129): last checkpoint: "update_with_bonds" test entry
*** 3 failures are detected in the test module "ParticleCache test"
unknown location(0): fatal error: in "iterators": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag

/builddir/build/BUILD/espresso/src/core/unit_tests/ParticleCache_test.cpp(195): last checkpoint

*** No errors detected
[1562754342.323155] [buildvm-31:11595:0]          mpool.c:37   UCX  WARN  object 0x7f0d7ea5afc0 was not returned to mpool ucp_am_bufs
[1562754342.323260] [buildvm-31:11595:0]          mpool.c:37   UCX  WARN  object 0x7f0d7df928e0 was not returned to mpool mm_recv_desc
[1562754342.323273] [buildvm-31:11595:0]          mpool.c:37   UCX  WARN  object 0x7f0d7df94960 was not returned to mpool mm_recv_desc
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
  Process name: [[57821,1],0]
  Exit code:    201
--------------------------------------------------------------------------

Reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1728057
Build logs here: https://koji.fedoraproject.org/koji/taskinfo?taskID=36164517

This only seems to be an issue on certain 64-bit archs, i.e. x86_64, aarch64 and ppc64le; the other archs (i686, s390x, armv7hl) pass.

@KaiSzuttor KaiSzuttor added this to the Espresso 4.1 milestone Jul 15, 2019
@jngrad
Member

jngrad commented Jul 19, 2019

Note: the fix will have to be applied to both 4.0 and 4.1 (devel) branches.

@junghans
Member Author

@jngrad will there be a patch release? Otherwise, can you point me to the patch so I can include it in the rpm?

@jngrad
Member

jngrad commented Jul 19, 2019

Unfortunately, there is no patch yet. Since this ticket was assigned to the 4.1 release, I left a reminder that any fix will have to be cherry-picked into the 4.0 branch.

@jngrad
Member

jngrad commented Jul 19, 2019

I can reproduce the error message in Fedora 31 (replacing FROM fedora:latest with FROM fedora:31 in centos-python3/Dockerfile-next) for both 4.0 and devel, on x86_64 (full message) and ppc64le (truncated message). The failing tests are ParticleCache_test on 4.0, and MpiCallbacks_test, ParallelScriptInterface_test and ParticleCache_test on devel. They all involve an issue with boost::mpi. This issue is not reproducible in Fedora 30 (docker image centos-python3:next). Note: Fedora 31 release cycle.

@junghans
Member Author

As we did for #2507, we could patch boost::mpi in F31, if you have a fix.

@jngrad
Member

jngrad commented Jul 20, 2019

When running ParticleCache_test without MPI, I get a different error message:

unknown location(0): fatal error: in "update": std::out_of_range: _Map_base::at
/home/espresso/espresso402/espresso/src/core/unit_tests/ParticleCache_test.cpp(99): last checkpoint: "update" test entry

I recompiled espresso in a new container of the same image but with --cap-add=SYS_PTRACE and GDB installed, and this test did not fail anymore without MPI or with MPI on a single node:

[espresso@24ee726a8a1c build]$ src/core/unit_tests/ParticleCache_test
Running 7 test cases...

*** No errors detected
[espresso@24ee726a8a1c build]$ /usr/lib64/openmpi/bin/mpiexec -n 1 src/core/unit_tests/ParticleCache_test
Running 7 test cases...

*** No errors detected

@KaiSzuttor
Member

so UB

@mkuron
Member

mkuron commented Jul 20, 2019

F30 and F31 actually have the same Boost version. So it must be due to different compiler or MPI versions.

@RudolfWeeber
Contributor

RudolfWeeber commented Jul 22, 2019 via email

@jngrad
Member

jngrad commented Jul 22, 2019

@mkuron Fedora 31 uses OpenMPI 4.0.1 while Fedora 30 + Ubuntu <= eoan use <= 3.1.4

@jngrad
Member

jngrad commented Jul 24, 2019

@KaiSzuttor We should probably de-milestone this: OpenMPI 4 is not fully backward-compatible with OpenMPI 3 source code. It is unclear to me how much work supporting both v3 and v4 in our codebase will involve, or when our user base will start the transition to v4.

@mkuron
Member

mkuron commented Jul 24, 2019

OpenMPI 4 is not fully backward-compatible with OpenMPI 3 source code.

How is that possible? MPI is a standard. We also support other implementations like MPICH (tested in the Intel container). If OpenMPI does not comply with the standard, we won't be able to support it, but I am pretty sure that it is compliant.

@jngrad
Member

jngrad commented Jul 24, 2019

This is how I interpreted the language in Open MPI: Version Number Methodology:

Major: The major number is the first integer in the version string. Changes in the major number typically indicate a significant change in the code base and/or end-user functionality, and also indicate a break from backwards compatibility. Specifically: Open MPI releases with different major version numbers are not backwards compatible with each other.

If the issue at hand is indeed a bug in our codebase, we should address it; but if it comes from an API change in OpenMPI, it will be harder to estimate the amount of effort to invest in the v4 migration.

@mkuron
Member

mkuron commented Jul 24, 2019

This is a bit misleading as it is about ABI compatibility and not about API compatibility. This means you cannot combine libmpi.so and mpi.h from different major versions, only from different minor versions. API compatibility is guaranteed between all MPI implementations (MPICH, OpenMPI 3, OpenMPI 4, ...).

@junghans
Member Author

Any news on this?

@jngrad
Member

jngrad commented Aug 3, 2019

Tracing the source of the failure got me to this line in the code, which only crashes on mpi rank 0:

boost::mpi::reduce(m_cb.comm(), remote_parts, remote_parts,
detail::Merge<map_type, detail::IdCompare>(), 0);

The boost::container::flat_set<Particle, detail::IdCompare> remote_parts object has correct iterators, and when using a temporary object as the output of boost::mpi::reduce to exclude any possibility of iterator invalidation, the issue persists (using 10 particles instead of 10000):

  /* Reduce data to the master by merging the flat_sets from
   * the nodes in a reduction tree. */
  fprintf(stderr, "%d: before >> ", m_cb.comm().rank());
  for(auto &p : remote_parts) {
    fprintf(stderr, "%d ", p.identity());
  }
  fprintf(stderr, "<<\n");

  if (m_cb.comm().rank() == 0) {
    map_type remote_parts_tmp{};
    boost::mpi::reduce(m_cb.comm(), remote_parts, remote_parts_tmp,
                       detail::Merge<map_type, detail::IdCompare>(), 0);
    remote_parts = remote_parts_tmp;
  } else {
    boost::mpi::reduce(m_cb.comm(), remote_parts, remote_parts,
                       detail::Merge<map_type, detail::IdCompare>(), 0);
  }

  fprintf(stderr, "%d: after >> ", m_cb.comm().rank());
  for(auto &p : remote_parts) {
    fprintf(stderr, "%d ", p.identity());
  }
  fprintf(stderr, "<<\n");
}
Test project /home/espresso/build-debug/src/core/unit_tests
    Start 1: ParticleCache_test
1/1 Test #1: ParticleCache_test ...............***Failed    2.61 sec
Running 1 test case...
Running 1 test case...
0: before >> 0 1 2 3 4 5 6 7 8 9 <<
1: before >> 10 11 12 13 14 15 16 17 18 19 <<
1: after >> 10 11 12 13 14 15 16 17 18 19 <<

*** No errors detected

*** 1 failure is detected in the test module "ParticleCache test"
unknown location(0): fatal error: in "update": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag

In a non-failing environment (OpenMPI 3.1, Fedora 30), using the same boost version (1.69.0):

0: before >> 0 1 2 3 4 5 6 7 8 9 <<
1: before >> 10 11 12 13 14 15 16 17 18 19 <<
1: after >> 10 11 12 13 14 15 16 17 18 19 <<
0: after >> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 <<

In Fedora 30, I was finally able to reproduce the same bug (it occurs at the same line in espresso/src/core/ParticleCache.hpp); however, it simply produced warnings instead of causing a crash:

ctest -V -R ParticleCache_test
25: Test command: /usr/lib64/openmpi/bin/mpiexec "-oversubscribe" "-n" "2" "/home/espresso/build/src/core/unit_tests/ParticleCache_test"
25: Test timeout computed to be: 10000000
25: Running 7 test cases...
25: Running 7 test cases...
25: [5bbdcbf2fb5a:06849] Read -1, expected 80012, errno = 1
25: [5bbdcbf2fb5a:06849] Read -1, expected 9884, errno = 1
25: [5bbdcbf2fb5a:06849] Read -1, expected 40308, errno = 1
25: [5bbdcbf2fb5a:06849] Read -1, expected 8012, errno = 1
25: 
25: *** No errors detected
25: 
25: *** No errors detected
25: 
1/1 Test #25: ParticleCache_test ...............   Passed    0.64 sec

It is not fully reproducible and sometimes requires 4 MPI ranks (mpirun -n 4 src/core/unit_tests/ParticleCache_test), and when printf statements are added to the code the warnings sometimes disappear. These warnings were never displayed during unit testing because CTest hides stdout/stderr when a test passes. The Read -1, expected <number>, errno = 1 warnings are documented for OpenMPI running inside Docker (#2271, open-mpi/ompi#4948) and have multiple workarounds:

  • mpirun --mca btl ^vader -n 4 src/core/unit_tests/ParticleCache_test
  • mpirun --mca btl_vader_single_copy_mechanism none -n 4 src/core/unit_tests/ParticleCache_test
  • export OMPI_MCA_btl_vader_single_copy_mechanism=none; mpirun -n 4 src/core/unit_tests/ParticleCache_test
  • starting the docker container with option --cap-add=SYS_PTRACE

But with OpenMPI v4 the test doesn't show these warnings anymore; instead, it crashes and prints a list of new warnings. Only the --cap-add=SYS_PTRACE workaround prevents this crash. Either the OpenMPI-in-Docker issue was fixed in v4 and we really have a bug in espresso, or OpenMPI v4 changed its behavior so that these warnings now crash the program and produce a warning report.

@mkuron
Member

mkuron commented Aug 4, 2019

The vader issue is not limited to Docker; #2271 saw it on a desktop computer. However, since @junghans's build logs do not contain any Read -1, expected <number>, errno = 1, the failures there must be caused by something else. Also, Fedora's build service runs on KVM and thus does not limit ptrace, so there is no need for added privileges. We do grant them on our CI though.

@jngrad
Member

jngrad commented Aug 4, 2019

Bug not reproducible on Fedora 31 with MPICH.
Bug not reproducible on Ubuntu 18.04 with boost 1.69 and OpenMPI 3.1.
This seems to be OpenMPI v4-specific. I can't get a backtrace in gdb:

mpiexec -n 1 gdb src/core/unit_tests/ParticleCache_test : -n 1 src/core/unit_tests/ParticleCache_test
GNU gdb (GDB) Fedora 8.3.50.20190702-20.fc31
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/core/unit_tests/ParticleCache_test...
(gdb) run
Starting program: /home/espresso/build/src/core/unit_tests/ParticleCache_test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff6c79700 (LWP 2765)]
[New Thread 0x7ffff62aa700 (LWP 2766)]
[New Thread 0x7fffeffff700 (LWP 2767)]
Running 1 test case...
0: before >> 0 1 2 3 4 5 6 7 8 9 <<
Running 1 test case...
1: before >> 10 11 12 13 14 15 16 17 18 19 <<
1: after >> 10 11 12 13 14 15 16 17 18 19 <<
unknown location(0): fatal error: in "update": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag

/local/es/src/core/unit_tests/ParticleCache_test.cpp(145): last checkpoint

*** No errors detected

*** 1 failure is detected in the test module "ParticleCache test"
[1564932286.293320] [ded9866126f7:2761 :0]          mpool.c:37   UCX  WARN  object 0x7ffff467dee0 was not returned to mpool mm_recv_desc
[Thread 0x7fffeffff700 (LWP 2767) exited]
[Thread 0x7ffff62aa700 (LWP 2766) exited]
[Thread 0x7ffff6c79700 (LWP 2765) exited]
[Inferior 1 (process 2761) exited with code 0311]
Missing separate debuginfos, use: dnf debuginfo-install zlib-1.2.11-16.fc31.x86_64
(gdb) bt
No stack.

@RudolfWeeber
Contributor

RudolfWeeber commented Aug 4, 2019 via email

@jngrad
Member

jngrad commented Aug 4, 2019

Thanks! We can finally work on something:

(gdb) catch throw
Catchpoint 1 (throw)
(gdb) run
Starting program: /home/espresso/build/src/core/unit_tests/ParticleCache_test 
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.30-1.fc31.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff6c79700 (LWP 2765)]
[New Thread 0x7ffff62aa700 (LWP 2766)]
[New Thread 0x7fffeffff700 (LWP 2767)]
Running 1 test case...
0: before >> 0 1 2 3 4 5 6 7 8 9 <<
Running 1 test case...
1: before >> 10 11 12 13 14 15 16 17 18 19 <<
1: after >> 10 11 12 13 14 15 16 17 18 19 <<

Thread 1 "ParticleCache_t" hit Catchpoint 1 (exception thrown), 0x00007ffff75e4a22 in __cxxabiv1::__cxa_throw (obj=0x61db90, 
    tinfo=0x474dd8 <typeinfo for boost::wrapexcept<boost::mpi::exception>>, 
    dest=0x45325c <boost::wrapexcept<boost::mpi::exception>::~wrapexcept()>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:78
78	  PROBE2 (throw, obj, tinfo);
Missing separate debuginfos, use: dnf debuginfo-install zlib-1.2.11-16.fc31.x86_64

(gdb) bt  
#0  0x00007ffff75e4a22 in __cxxabiv1::__cxa_throw (obj=0x61db90, 
    tinfo=0x474dd8 <typeinfo for boost::wrapexcept<boost::mpi::exception>>, 
    dest=0x45325c <boost::wrapexcept<boost::mpi::exception>::~wrapexcept()>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:78
#1  0x0000000000450f50 in boost::throw_exception<boost::mpi::exception> (e=...)
    at /usr/include/boost/throw_exception.hpp:70
#2  0x00007ffff78776e8 in boost::mpi::detail::packed_archive_recv (
    comm=0x7ffff79a1b60 <ompi_mpi_comm_world>, source=<optimized out>, 
    tag=<optimized out>, ar=..., status=...)
    at libs/mpi/src/point_to_point.cpp:93
#3  0x000000000045b76b in boost::mpi::detail::tree_reduce_impl<boost::container::flat_set<Particle, detail::IdCompare, boost::container::new_allocator<Particle> >, detail::Merge<boost::container::flat_set<Particle, detail::IdCompare, boost::container::new_allocator<Particle> >, detail::IdCompare> > (comm=..., 
    in_values=0x7fffffffb370, n=1, out_values=0x7fffffffb0d0, op=..., root=0)
    at /usr/include/boost/mpi/collectives/reduce.hpp:134
#4  0x0000000000459338 in boost::mpi::detail::reduce_impl<boost::container::flat_set<Particle, detail::IdCompare, boost::container::new_allocator<Particle> >, detail::Merge<boost::container::flat_set<Particle, detail::IdCompare, boost::container::new_allocator<Particle> >, detail::IdCompare> > (comm=..., 
    in_values=0x7fffffffb370, n=1, out_values=0x7fffffffb0d0, op=..., root=0)
    at /usr/include/boost/mpi/collectives/reduce.hpp:292
#5  0x000000000045724a in boost::mpi::reduce<boost::container::flat_set<Particle, detail::IdCompare, boost::container::new_allocator<Particle> >, detail::Merge<boost::container::flat_set<Particle, detail::IdCompare, boost::container::new_allocator<Particle> >, detail::IdCompare> > (comm=..., in_value=..., out_value=..., op=..., root=0) at /usr/include/boost/mpi/collectives/reduce.hpp:358
#6  0x000000000044a2d4 in ParticleCache<update::test_method()::<lambda()>, Utils::NoOp, const std::vector<Particle, std::allocator<Particle> >, Particle>::m_update(void) (this=0x7fffffffb330) at /local/es/src/core/ParticleCache.hpp:259
#7  0x000000000044aa1f in ParticleCache<update::test_method()::<lambda()>, Utils::NoOp, const std::vector<Particle, std::allocator<Particle> >, Particle>::update(void) (this=0x7fffffffb330) at /local/es/src/core/ParticleCache.hpp:408
#8  0x0000000000449c0a in ParticleCache<update::test_method()::<lambda()>, Utils::NoOp, const std::vector<Particle, std::allocator<Particle> >, Particle>::size(void) (this=0x7fffffffb330) at /local/es/src/core/ParticleCache.hpp:428
#9  0x0000000000449510 in update::test_method (this=0x7fffffffb5ef) at /local/es/src/core/unit_tests/ParticleCache_test.cpp:145
#10 0x0000000000449026 in update_invoker () at /local/es/src/core/unit_tests/ParticleCache_test.cpp:99
#11 0x0000000000458ecf in boost::detail::function::void_function_invoker0<void (*)(), void>::invoke (function_ptr=...) at /usr/include/boost/function/function_template.hpp:117
#12 0x00007ffff7766582 in boost::function0<void>::operator() (this=<optimized out>) at ./boost/function/function_template.hpp:677
#13 boost::detail::forward::operator() (this=<optimized out>) at ./boost/test/impl/execution_monitor.ipp:1312
#14 boost::detail::function::function_obj_invoker0<boost::detail::forward, int>::invoke (function_obj_ptr=...) at ./boost/function/function_template.hpp:137
#15 0x00007ffff77655ed in boost::function0<int>::operator() (this=0x7fffffffcaa0) at ./boost/function/function_template.hpp:677
#16 boost::detail::do_invoke<boost::shared_ptr<boost::detail::translator_holder_base>, boost::function<int ()> >(boost::shared_ptr<boost::detail::translator_holder_base> const&, boost::function<int ()> const&) (F=..., tr=...) at ./boost/test/impl/execution_monitor.ipp:286
#17 boost::execution_monitor::catch_signals(boost::function<int ()> const&) (this=0x7ffff77e1f80 <boost::unit_test::unit_test_monitor_t::instance()::the_inst>, F=...) at ./boost/test/impl/execution_monitor.ipp:875
#18 0x00007ffff7765678 in boost::execution_monitor::execute(boost::function<int ()> const&) (this=0x7ffff77e1f80 <boost::unit_test::unit_test_monitor_t::instance()::the_inst>, F=...) at ./boost/test/impl/execution_monitor.ipp:1214
#19 0x00007ffff776574e in boost::execution_monitor::vexecute(boost::function<void ()> const&) (this=this@entry=0x7ffff77e1f80 <boost::unit_test::unit_test_monitor_t::instance()::the_inst>, F=...) at /usr/include/c++/9/new:174
#20 0x00007ffff778f99f in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::function<void ()> const&, unsigned int) (this=0x7ffff77e1f80 <boost::unit_test::unit_test_monitor_t::instance()::the_inst>, func=..., timeout=timeout@entry=0) at ./boost/test/impl/unit_test_monitor.ipp:49
#21 0x00007ffff7775680 in boost::unit_test::framework::state::execute_test_tree (this=this@entry=0x7ffff77e1ae0 <boost::unit_test::framework::impl::(anonymous namespace)::s_frk_state()::the_inst>, tu_id=tu_id@entry=65536, timeout=0, p_random_generator=p_random_generator@entry=0x7fffffffcce0) at ./boost/test/utils/class_properties.hpp:58
#22 0x00007ffff7775914 in boost::unit_test::framework::state::execute_test_tree (this=0x7ffff77e1ae0 <boost::unit_test::framework::impl::(anonymous namespace)::s_frk_state()::the_inst>, tu_id=tu_id@entry=1, timeout=timeout@entry=0, p_random_generator=p_random_generator@entry=0x0) at ./boost/test/impl/framework.ipp:737
#23 0x00007ffff776cc74 in boost::unit_test::framework::run (id=1, id@entry=4294967295, continue_test=continue_test@entry=true) at ./boost/test/impl/framework.ipp:1631
#24 0x00007ffff778e822 in boost::unit_test::unit_test_main (init_func=<optimized out>, argc=<optimized out>, argv=<optimized out>) at ./boost/test/impl/unit_test_main.ipp:247
#25 0x00000000004498c7 in main (argc=1, argv=0x7fffffffd118) at /local/es/src/core/unit_tests/ParticleCache_test.cpp:250

@RudolfWeeber
Contributor

It is our impression that this is caused by an incompatibility between OpenMPI 4 and boost-mpi. We're giving up for now.

@KaiSzuttor KaiSzuttor removed this from the Espresso 4.1 milestone Aug 9, 2019
@mkuron
Member

mkuron commented Aug 9, 2019

We need a minimum working example without Espresso so we can file a bug with OpenMPI or Boost.MPI as appropriate. Otherwise this bug is going to haunt us once Ubuntu 20.04 LTS ships with OpenMPI 4.0...

@jngrad
Member

jngrad commented Aug 16, 2019

@mkuron I was able to factor out the boost unit test logic, most of the Particle structure and most of the ParticleCache logic while preserving the backtrace in an MWE, but it still has the following dependencies:

src/core/MpiCallbacks.hpp
src/core/MpiCallbacks.cpp
src/core/utils/NumeratedContainer.hpp
src/core/utils/make_unique.hpp
src/core/utils/serialization/flat_set.hpp

I'm not familiar enough with our MpiCallbacks class to reduce it further.

@jngrad
Member

jngrad commented Aug 19, 2019

@mkuron finally got the MpiCallbacks class out. The MWE is ~150 lines long and produces the error message of the first failing test, i.e. BOOST_AUTO_TEST_CASE(update). See openmpi-sample.tar.gz

@mkuron
Member

mkuron commented Aug 19, 2019

That's still quite a lot of code... is there any chance you could reduce it further? I still have no idea where the issue might be coming from.

@jngrad
Member

jngrad commented Aug 19, 2019

Well, the first overload of boost::mpi::reduce requires a commutative merge operator (class Merge, struct is_commutative) for a serializable container (boost::container::flat_set, void load, void save, void serialize) of a serializable, orderable class (class Particle, class IdCompare); a rough sketch of those pieces is below. There's nothing else in the MWE.
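
For illustration, those pieces can be sketched in a few dozen lines. This is only a sketch of the structure (the names Particle, IdCompare and Merge come from the description above, but the flat_set serializer and the main() body are my own assumptions; the actual code in openmpi-sample.tar.gz may differ):

#include <boost/container/flat_set.hpp>
#include <boost/mpi.hpp>
#include <boost/mpl/bool.hpp>
#include <boost/serialization/split_free.hpp>
#include <cstdio>

// Serializable, orderable payload.
struct Particle {
  int id;
  template <class Archive> void serialize(Archive &ar, unsigned) { ar & id; }
};

struct IdCompare {
  bool operator()(Particle const &a, Particle const &b) const {
    return a.id < b.id;
  }
};

// Non-intrusive serialization of boost::container::flat_set.
namespace boost {
namespace serialization {
template <class Archive, class K, class C, class A>
void save(Archive &ar, boost::container::flat_set<K, C, A> const &s, unsigned) {
  std::size_t n = s.size();
  ar << n;
  for (auto const &e : s)
    ar << e;
}
template <class Archive, class K, class C, class A>
void load(Archive &ar, boost::container::flat_set<K, C, A> &s, unsigned) {
  std::size_t n;
  ar >> n;
  s.clear();
  for (std::size_t i = 0; i < n; ++i) {
    K e;
    ar >> e;
    s.emplace_hint(s.end(), std::move(e)); // elements arrive already sorted
  }
}
template <class Archive, class K, class C, class A>
void serialize(Archive &ar, boost::container::flat_set<K, C, A> &s,
               unsigned version) {
  split_free(ar, s, version);
}
} // namespace serialization
} // namespace boost

using map_type = boost::container::flat_set<Particle, IdCompare>;

// Merge two sets; declared commutative so Boost.MPI may reorder the reduction.
struct Merge {
  map_type operator()(map_type a, map_type const &b) const {
    a.insert(b.begin(), b.end());
    return a;
  }
};

namespace boost {
namespace mpi {
template <> struct is_commutative<Merge, map_type> : boost::mpl::true_ {};
} // namespace mpi
} // namespace boost

int main(int argc, char **argv) {
  boost::mpi::environment env(argc, argv);
  boost::mpi::communicator world;

  // Each rank contributes a disjoint range of particle ids.
  map_type local;
  for (int i = 0; i < 10; ++i)
    local.insert(Particle{10 * world.rank() + i});

  // Serialized reduce with a user-defined operation: this takes the
  // tree_reduce_impl -> packed_archive_recv path seen in the backtrace.
  map_type merged;
  boost::mpi::reduce(world, local, merged, Merge{}, 0);

  if (world.rank() == 0)
    std::printf("rank 0 holds %zu merged particles\n", merged.size());
  return 0;
}

Running it with mpiexec -n 2 on the affected setup would be expected to hit the same MPI_Recv error.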

@mkuron
Member

mkuron commented Aug 19, 2019

So do we now know whether it's OpenMPI's or Boost.MPI's fault? If not, we should probably reach out to Boost.MPI first and open an issue with them.

@jngrad
Member

jngrad commented Aug 19, 2019

I still can't tell. Manually following the sample.cpp GDB trace (file followed by the corresponding code):

/usr/include/boost/mpi/collectives/reduce.hpp:292
  detail::tree_reduce_impl(comm, in_values, n, out_values, op, root, is_commutative<Op, T>());
/usr/include/boost/mpi/collectives/reduce.hpp:134
  detail::packed_archive_recv(comm, child, tag, ia, status);
libs/mpi/src/point_to_point.cpp:93
  BOOST_MPI_CHECK_RESULT(MPI_Recv, (ar.address(), count, MPI_PACKED, status.MPI_SOURCE, status.MPI_TAG, comm, &status));
/usr/include/boost/mpi/exception.hpp:100
  boost::throw_exception(boost::mpi::exception(#MPIFunc, _check_result));
/usr/include/boost/throw_exception.hpp:70
  throw boost::exception_detail::enable_both( e );
/usr/include/boost/exception/exception.hpp:517
  return boost::exception_detail::wrapexcept<typename remove_error_info_injector<boost::mpi::exception>::type>( enable_error_info( x ) );
/usr/include/boost/exception/exception.hpp:486
  boost::exception_detail::clone_impl<typename exception_detail::enable_error_info_return_type<remove_error_info_injector<boost::mpi::exception>::type>::type> ( x ) {}
/usr/include/boost/exception/exception.hpp:436
  boost::exception_detail::copy_boost_exception(x)
/* after this point it looks really complex, probably a setup for std::exception */

It seemed the throw was triggered by the return value MPI_ERR_TAG of the call to MPI_Recv(). According to the MPICH implementation:

MPI_ERR_TAG
Invalid tag argument. Tags must be non-negative; tags in a receive (MPI_Recv, MPI_Irecv, MPI_Sendrecv, etc.) may also be MPI_ANY_TAG. The largest tag value is available through the attribute MPI_TAG_UB.

Modifying /usr/include/boost/mpi/collectives/reduce.hpp to printf the tag shows:

  • on Fedora 30: 2147483647, aka 2^31-1, which is the largest positive value of a 32-bit signed int
  • on Fedora 31: 8388608, aka 2^23, which would be the most negative value of a 24-bit signed int (if such a type existed)

Both of these values are equal to MPI_TAG_UB, so that was a dead end.

Adding a printf in BOOST_MPI_CHECK_RESULT to show which MPI functions are called reveals that MPI_Recv never enters the macro; only MPI_Free_mem and MPI_Alloc_mem do, and they always return MPI_SUCCESS. Removing the throw statement from the macro still generated the same GDB trace, although the filepath /usr/include/boost/throw_exception.hpp in the backtrace changed to ./boost/throw_exception.hpp (a file that does not exist), suggesting that GDB is not resolving the symbols correctly, despite the -O0 -g flags. Compiling with -DBOOST_EXCEPTION_DISABLE also didn't change the error message.
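
For reference, the instrumentation described above amounts to redefining the macro in /usr/include/boost/mpi/exception.hpp along these lines (a sketch based on the macro body quoted in the trace, not the verbatim Boost source; the header needs <cstdio> for the added fprintf):

// Sketch of an instrumented BOOST_MPI_CHECK_RESULT: the original macro runs
// the MPI call and throws boost::mpi::exception on any result other than
// MPI_SUCCESS. The fprintf line is the added probe.
#define BOOST_MPI_CHECK_RESULT( MPIFunc, Args )                           \
  {                                                                        \
    int _check_result = MPIFunc Args;                                      \
    std::fprintf(stderr, "BOOST_MPI_CHECK_RESULT: %s returned %d\n",       \
                 #MPIFunc, _check_result);                                 \
    if (_check_result != MPI_SUCCESS)                                      \
      boost::throw_exception(                                              \
          boost::mpi::exception(#MPIFunc, _check_result));                 \
  }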

The authors of Boost.MPI will probably be able to interpret the GDB backtrace.

@mkuron
Member

mkuron commented Aug 20, 2019

Try reporting it at https://github.com/boostorg/mpi/issues. Tell them that we are unsure whether this is due to Boost.MPI incorrectly using MPI or OpenMPI 4.0 violating the MPI standard.

@jngrad
Member

jngrad commented Aug 20, 2019

New datapoint: the error is not reproducible on Ubuntu 18.04 with OpenMPI 4.0.1 and boost 1.69.
Furthermore, the issue is independent of our merge operator and flat_set container: the second example with std::string in boost tutorial/reduce produces the same error.
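
For reference, a stand-alone reproducer along those lines fits in a dozen lines (a minimal sketch, not the exact tutorial snippet; the rank strings and the std::plus<std::string> operator are my own choices, the point being that any reduce over a serialized type takes the packed-archive path from the backtrace):

#include <boost/mpi.hpp>
#include <functional>
#include <iostream>
#include <string>

int main(int argc, char **argv) {
  boost::mpi::environment env(argc, argv);
  boost::mpi::communicator world;

  // std::string is not an MPI datatype, so Boost.MPI serializes it and uses
  // its own tree reduction with packed archives (packed_archive_recv).
  std::string local = "rank " + std::to_string(world.rank()) + " ";
  std::string all;
  boost::mpi::reduce(world, local, all, std::plus<std::string>(), 0);

  if (world.rank() == 0)
    std::cout << all << std::endl;
  return 0;
}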

I now think the issue comes from Fedora 31. For example, MPI_TAG_UB is equal to 2^23 only on F31, not on Ubuntu with OpenMPI 4.0.1 or on other OSes with previous versions of OpenMPI. So I compiled OpenMPI 4.0.1 and boost 1.69 from sources on Fedora 31, and now MPI_TAG_UB has the correct value of 2^31-1 and all espresso 4.0.2 core unit tests and python integration tests pass.

I can't tell for sure which of the openmpi-devel and boost-openmpi-devel packages has the bug since building boost 1.69 from sources with openmpi-devel installed fails to generate the mpi part of boost. The incorrect value for MPI_TAG_UB comes from the openmpi-devel package.

@junghans if this can be of any help, here is my Dockerfile with the OpenMPI build procedure. I didn't look into compiling hdf5-openmpi-devel from sources. I'm using --nogpgcheck because the F31 GPG key seems to have been invalidated a couple of weeks ago, with this error message: The GPG keys listed for the "Fedora - Rawhide - Developmental packages for the next Fedora release" repository are already installed but they are not correct for this package. According to dnf --refresh install fedora-gpg-keys the x86_64 GPG key is already up-to-date, and I confirmed it manually with an RPM.

@mkuron
Member

mkuron commented Aug 20, 2019

Where does the value of MPI_TAG_UB come from? It seems like it can be set from the outside too: https://github.com/open-mpi/ompi/blob/7962a8e40b132172488c8f3a38f531af44097b76/ompi/attribute/attribute_predefined.c#L132

@jngrad
Member

jngrad commented Aug 20, 2019

I'm using code extracted from boost::mpi::environment::max_tag()
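
For completeness, a minimal stand-alone check along those lines might look like this (a sketch; it queries the attribute directly and through Boost.MPI, which derives max_tag() from the same value minus the tags it reserves internally):

#include <boost/mpi.hpp>
#include <cstdio>
#include <mpi.h>

int main(int argc, char **argv) {
  boost::mpi::environment env(argc, argv);

  // MPI exposes the tag upper bound as an attribute of MPI_COMM_WORLD;
  // the attribute value is a pointer to an int.
  int *tag_ub = nullptr;
  int flag = 0;
  MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &tag_ub, &flag);
  if (flag)
    std::printf("MPI_TAG_UB             = %d\n", *tag_ub);

  std::printf("environment::max_tag() = %d\n",
              boost::mpi::environment::max_tag());
  return 0;
}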

@mkuron
Member

mkuron commented Aug 20, 2019

Sure, but how does OpenMPI decide what value to use?

@junghans
Member Author

I now think the issue comes from Fedora 31. [...] The incorrect value for MPI_TAG_UB comes from the openmpi-devel package.

@opoplawski, any idea what is special about rawhide's openmpi package?

@junghans
Member Author

@jngrad Do you guys have a mini-reproducer I can give to Fedora's openmpi maintainer?

@junghans
Member Author

As the problem is Fedora-specific, just open a bug report here: https://bugzilla.redhat.com/

@jngrad
Member

jngrad commented Aug 21, 2019

We're currently unsure which package the issue comes from. The GDB backtrace is incomplete, and when I tried to fill in the gaps by manually inspecting the boost::mpi header files, I ended up with a code path that wasn't actually visited: commenting out parts of that path had no effect on the GDB backtrace, except for showing nonexistent filenames. Without a complete backtrace, it's difficult to find which part of boost::mpi is calling MPI_Recv with a problematic tag.

@pkovacs

pkovacs commented Aug 29, 2019

I discovered that the issue is somewhere in the UCX support of OpenMPI. If you rebuild OpenMPI without UCX, the problem goes away and the tag reports the correct value, as I mentioned on the Fedora bug report.

@hjelmn

hjelmn commented Aug 29, 2019

Not a bug in Open MPI. There are no guarantees in the MPI standard on what the tag ub is. Any code using MPI must use a tag below the tag ub.

@jngrad
Member

jngrad commented Aug 29, 2019

MPI tag issue reported on Red Hat Bugzilla as Bug 1746564. The root cause was an incorrect value for MPI_TAG_UB in the Open MPI 4.0.1 library, which caused an integer overflow in UCX. It has been fixed in the upcoming Open MPI 4.0.2 release. Bug 1728057 has been notified. I'm closing this issue.

@jngrad jngrad closed this as completed Aug 29, 2019
@hjelmn

hjelmn commented Aug 29, 2019

Yup. pml/ucx had a bug. Sorry for the noise. Max tag is something that should be covered by our tests but apparently not.
