[Issue1445.Collision] Regression test crashes #1769

Closed
oysstu opened this issue Oct 12, 2023 · 7 comments · Fixed by #1778
Labels
type: bug Indicates an unexpected problem or unintended behavior

Comments

oysstu commented Oct 12, 2023

Bug Report

  • [x] I checked the documentation but found no answer.
  • [x] I checked to make sure that this issue has not already been filed.

Environment

  • DART version: v6.13.0
  • OS name and version name (or number): Arch Linux
  • Compiler name and version number: GNU/GCC 13.2.1

Current Behavior

The regression test for #1445 fails for me on v6.13.0. The same crash occurs when using libdart from Gazebo Harmonic and Garden.

[oysstu@os-t14 build]$ ./unittests/regression/test_Issue1445
Running main() from /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest_main.cc
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Issue1445
[ RUN      ] Issue1445.Collision
/usr/include/c++/13.2.1/bits/stl_vector.h:1125: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = dart::dynamics::Skeleton::DataCache; _Alloc = Eigen::aligned_allocator<dart::dynamics::Skeleton::DataCache>; reference = dart::dynamics::Skeleton::DataCache&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Aborted (core dumped)

Backtrace in gdb:

(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007ffff74ac8a3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007ffff745c668 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ffff74444b8 in __GI_abort () at abort.c:79
#4  0x00007ffff68dd3b2 in std::__glibcxx_assert_fail (file=file@entry=0x7ffff7c50488 "/usr/include/c++/13.2.1/bits/stl_vector.h", line=line@entry=1125, 
    function=function@entry=0x7ffff7c76c30 "std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = dart::dynamics::Skeleton::DataCache; _Alloc = Eigen::aligned_allocator<dart::dynamics::Skeleton::DataCac"..., condition=condition@entry=0x7ffff7c4c1cf "__n < this->size()") at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:61
#5  0x00007ffff7a1ccc7 in std::vector<dart::dynamics::Skeleton::DataCache, Eigen::aligned_allocator<dart::dynamics::Skeleton::DataCache> >::operator[] (__n=<optimized out>, 
    this=<optimized out>) at /usr/include/c++/13.2.1/bits/stl_vector.h:1125
#6  dart::dynamics::BodyNode::dirtyTransform (this=0x55555563c140) at /usr/src/debug/libdart/dart-6.13.0/dart/dynamics/BodyNode.cpp:1516
#7  0x00007ffff7a4e822 in dart::dynamics::Entity::changeParentFrame (this=this@entry=0x55555563cec8, _newParentFrame=_newParentFrame@entry=0x555555631540)
    at /usr/src/debug/libdart/dart-6.13.0/dart/dynamics/Entity.cpp:263
#8  0x00007ffff7a6d48f in dart::dynamics::Frame::changeParentFrame (this=<optimized out>, _newParentFrame=0x555555631540) at /usr/src/debug/libdart/dart-6.13.0/dart/dynamics/Frame.cpp:600
#9  0x00007ffff7b686bd in dart::dynamics::Skeleton::moveBodyNodeTree (this=this@entry=0x555555639e60, _parentJoint=_parentJoint@entry=0x555555640200, _bodyNode=<optimized out>, 
    _bodyNode@entry=0x55555563c140, _newSkeleton=std::shared_ptr<dart::dynamics::Skeleton> (use count 10, weak count 5) = {...}, _parentNode=_parentNode@entry=0x555555630940)
    at /usr/src/debug/libdart/dart-6.13.0/dart/dynamics/Skeleton.cpp:2654
#10 0x000055555559bc65 in dart::dynamics::Skeleton::moveBodyNodeTree<dart::dynamics::WeldJoint> (this=this@entry=0x555555639e60, _bodyNode=_bodyNode@entry=0x55555563c140, 
    _newSkeleton=std::shared_ptr<dart::dynamics::Skeleton> (use count 10, weak count 5) = {...}, _parentNode=_parentNode@entry=0x555555630940, _joint=...)
    at /usr/src/debug/libdart/dart-6.13.0/dart/dynamics/detail/Skeleton.hpp:51
#11 0x000055555559bde0 in dart::dynamics::BodyNode::moveTo<dart::dynamics::WeldJoint> (this=this@entry=0x55555563c140, _newParent=_newParent@entry=0x555555630940, _joint=...)
    at /usr/src/debug/libdart/dart-6.13.0/dart/dynamics/detail/BodyNode.hpp:53
#12 0x000055555558dfe1 in Issue1445_Collision_Test::TestBody (this=<optimized out>) at /usr/src/debug/libdart/dart-6.13.0/unittests/regression/test_Issue1445.cpp:140
#13 0x00005555555e217f in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x555555625e30, method=<optimized out>, 
    location=location@entry=0x5555555ecf36 "the test body") at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2443
#14 0x00005555555ea3cf in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x555555625e30, method=&virtual testing::Test::TestBody(), 
    location=location@entry=0x5555555ecf36 "the test body") at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2479
#15 0x00005555555d4227 in testing::Test::Run (this=this@entry=0x555555625e30) at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2517
#16 0x00005555555d4560 in testing::TestInfo::Run (this=0x555555624ac0) at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2693
#17 0x00005555555d460b in testing::TestCase::Run (this=0x555555624f80) at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2811
#18 0x00005555555dc0a7 in testing::internal::UnitTestImpl::RunAllTests (this=0x555555624c90) at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:5177
#19 0x00005555555e2f90 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=object@entry=0x555555624c90, method=<optimized out>, 
    location=location@entry=0x5555555f06d0 "auxiliary test code (environments or event listeners)") at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2443
#20 0x00005555555eaa1d in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x555555624c90, 
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x5555555dbef2 <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=location@entry=0x5555555f06d0 "auxiliary test code (environments or event listeners)") at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest.cc:2479
#21 0x00005555555d42e0 in testing::UnitTest::Run (this=0x55555560f320 <testing::UnitTest::GetInstance()::instance>)
    at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/include/gtest/gtest.h:1340
#22 0x00005555555eb5a9 in RUN_ALL_TESTS () at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/include/gtest/gtest.h:2341
#23 main (argc=<optimized out>, argv=0x7fffffffe1a8) at /usr/src/debug/libdart/dart-6.13.0/unittests/gtest/src/gtest_main.cc:36

That is, it seems to fail on line 140 of the unit test (frame #12 in the backtrace):

auto fixedJoint = model2Body->moveTo<dart::dynamics::WeldJoint>(model1Body);

Steps to Reproduce

The crash can be reproduced by running the regression test for #1445.

Installation was done using the AUR package with the following compilation flags:

    cmake .. \
        -DCMAKE_BUILD_TYPE='Debug' \
        -DCMAKE_INSTALL_PREFIX="/usr" \
        -DCMAKE_INSTALL_LIBDIR="lib" \
        -DDART_TREAT_WARNINGS_AS_ERRORS="off" \
        -DCMAKE_SKIP_INSTALL_RPATH=ON \
        -DDART_SKIP_FLANN=ON \
        -DDART_SKIP_PAGMO=ON \
        -DDART_VERBOSE=ON \
        -DDART_FAST_DEBUG=OFF

The PKGBUILD also sets options=(debug !strip) to keep debug symbols.

oysstu added the type: bug label Oct 12, 2023

oysstu (Author) commented Oct 12, 2023

There's also a workaround in gz-physics that may or may not be outdated.
https://github.com/gazebosim/gz-physics/blob/1445341181e9b81eaacfee504ccc8fed0f949c9a/dartsim/src/SDFFeatures.cc#L1186-L1190

The original issue was reported in gz-physics here:
gazebosim/gz-physics#268

Let me know if I should raise an issue in gz-physics if this is indeed outdated.

oysstu (Author) commented Oct 13, 2023

The crash does not occur on the main/dart7 branch. I tried to go through the commits to see what might be the cause, but found it very hard to pin something down due to the large structural changes for dart7.

Edit: it still happens on main/dart7. I had previously built it without the _GLIBCXX_ASSERTIONS flag.

oysstu (Author) commented Oct 16, 2023

I've narrowed it down to the following.

Arch Linux compiles with CXXFLAGS="$CFLAGS -Wp,-D_GLIBCXX_ASSERTIONS" by default. The _GLIBCXX_ASSERTIONS macro enables a few additional runtime checks in libstdc++, even in release builds. One of them is bounds checking in std::vector::operator[].
In BodyNode::dirtyTransform, the mTreeCache vector of the Skeleton is accessed through the following macro:
https://github.com/dartsim/dart/blob/2d3cbdff352b0236b18c85e06f8047a84b3d1098/dart/dynamics/BodyNode.cpp#L165C1-L169C36

The macro is used here:
https://github.com/dartsim/dart/blob/c5b8f0abfb2545754ebc872e4aad78020da88a62/dart/dynamics/BodyNode.cpp#L1509C1-L1520C4

Since this happens during moveBodyNodeTree, the mTreeCache vector could be in an intermediate state. Either the vector has enough capacity reserved but has not yet been resized, or the code is genuinely accessing out-of-bounds memory.

jslee02 (Member) commented Dec 13, 2023

Thank you for tracking down this tricky issue! Feel free to submit a PR for this, or I could work on it when I have a chance.

oysstu (Author) commented Dec 14, 2023

Well, I started diving into the problem, but the codebase got increasingly complex and I had to move on, as I'm not familiar with the overall structure of the library. I may revisit this issue in the future if it does not get solved, but unfortunately I don't have time to pursue it right now.

jslee02 (Member) commented Dec 14, 2023

Sure, your insights are already helpful, greatly appreciated, and will guide my further investigation. Thank you for your initial efforts!

jslee02 (Member) commented Dec 19, 2023

This should be resolved by #1778. Please let me know if this issue persists. Thank you for the error report, again!
