Test failure: CowelStateDerivative "SIGSEGV, si_code: 0 (memory access violation at address: 0x00000000)" #136

Closed
mfacchinelli opened this issue Feb 17, 2017 · 14 comments

Comments

@mfacchinelli
Contributor

While running the tests, test 73 test_CowellStateDerivative fails. My operating system is macOS 10.12.3. The log file is attached.

LastTest.txt

@DominicDirkx
Member

It seems like this is a 'real' failure of one of the two test cases in this file. All other tests seem to run fine, so I don't think it's any cause for concern.

However, I would like to have a closer look to make sure. Would you have time next Thursday after/in the break of the lecture?

@mfacchinelli
Contributor Author

Ok, I'll be there. Thank you!

@GigiLaan

I get the same error on macOS 10.11.6.
My own log file is attached.

LastTest.txt

@DominicDirkx
Member

One more report of this in issue #140

@mfacchinelli
Contributor Author

[Screenshot, 2017-02-23 at 14:51:01]

@magnific0 changed the title from "Test 73 failure (macOS 10.12.3)" to 'Test failure: CowelStateDerivative "SIGSEGV, si_code: 0 (memory access violation at address: 0x00000000)"' on Feb 23, 2017

@transferorbit

For what it’s worth, I’ve just encountered exactly the same failure when running this unit test from the command line on macosx 10.12.3.

unknown location:0: fatal error in "**testCowellPopagatorKeplerCompare**": signal: SIGSEGV, si_code: 0 (memory access violation at address: 0x00000000)
/Users/kevin/tudatBundle/tudat/Tudat/Astrodynamics/Propagators/UnitTests/unitTestCowellStateDerivative.cpp:479: last checkpoint
*** 1 failure detected in test suite "Master Test Suite"

@DominicDirkx
Member

@eurospaceflight : Thanks for letting me know!

Could you do me a favor and recompile the unit test with debug symbols on: Change

CMAKE_BUILD_TYPE:STRING=Release

to

CMAKE_BUILD_TYPE:STRING=Debug

in the CMakeCache.txt file. This file should be in your build folder. Changing this will force the code to recompile all required libraries. After recompiling, could you run the debugger on the executable in the terminal:

gdb ./test_CowellStateDerivative

run

Then, once the program has stopped at the crash, typing the command:

backtrace

should give a long list of function calls with their line numbers. Could you post this output here? It would help a lot in figuring out this issue. Let me know,

Cheers,

Dominic

@transferorbit

transferorbit commented Mar 4, 2017

Hello Dominic,

I attempted to follow the above steps as closely as possible; I did so as follows:

  • In the file “CMakeCache.txt” I entered CMAKE_BUILD_TYPE:STRING=Debug

    • Note: originally this file was missing Release completely and contained only CMAKE_BUILD_TYPE:STRING=
  • I ran lldb ./test_CowellStateDerivative

  • I then entered run, which produced the following output:

Process 77055 launched: './test_CowellStateDerivative' (x86_64)
Running 2 test cases...
Warning, position of Mars taken as barycenter of that body's planetary system.
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision

*** No errors detected
Process 77055 exited with status = 0 (0x00000000)
  • Finally, I entered the bt command (apparently the backtrace equivalent). However, that produced the following result:
    error: invalid thread
    • I also immediately tried thread backtrace and bt all, but I received the same error: invalid thread message.

This is as far as I was able to get; I don’t know where to go from here. Do you have any further suggestions?

@magnific0
Member

@eurospaceflight, thanks for running the debugger for us. bt is just shorthand for backtrace; it works in gdb as well.

You get the invalid thread messages because the process exited without a segmentation fault, so there is no stopped thread left to backtrace. Apparently the problem goes away when you compile the debug binaries.

This is unfortunately common for errors like this one, which are sensitive to the compiler and its optimization settings.

Could you try again, but with the release binary? The output won't be as informative, but at least it's something.

@transferorbit

Your wish is my command line. I recompiled for Release instead of Debug and reran as follows:

  • In the file “CMakeCache.txt” I entered CMAKE_BUILD_TYPE:STRING=Release
  • I ran lldb ./test_CowellStateDerivative
  • I then entered run, which produced the following output:
Process 21505 launched: './test_CowellStateDerivative' (x86_64)

Running 2 test cases...
Warning, position of Mars taken as barycenter of that body's planetary system.
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Process 21505 stopped
* thread #1: tid = 0x2d1c0, 0x00000001000f8716 test_CowellStateDerivative`tudat::propagators::SingleArcDynamicsSimulator<double, tudat::Time>::integrateEquationsOfMotion(Eigen::Matrix<double, -1, -1, 0, -1, -1> const&) + 1254, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x00000001000f8716 test_CowellStateDerivative`tudat::propagators::SingleArcDynamicsSimulator<double, tudat::Time>::integrateEquationsOfMotion(Eigen::Matrix<double, -1, -1, 0, -1, -1> const&) + 1254
test_CowellStateDerivative`tudat::propagators::SingleArcDynamicsSimulator<double, tudat::Time>::integrateEquationsOfMotion:
->  0x1000f8716 <+1254>: movdqa %xmm0, (%rax)
    0x1000f871a <+1258>: xorl   %esi, %esi
    0x1000f871c <+1260>: jmp    0x1000f8721               ; <+1265>
    0x1000f871e <+1262>: movq   %r13, (%rbx)
  • Finally, I entered the bt command, which produced the following result:
* thread #1: tid = 0x2d1c0, 0x00000001000f8716 test_CowellStateDerivative`tudat::propagators::SingleArcDynamicsSimulator<double, tudat::Time>::integrateEquationsOfMotion(Eigen::Matrix<double, -1, -1, 0, -1, -1> const&) + 1254, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00000001000f8716 test_CowellStateDerivative`tudat::propagators::SingleArcDynamicsSimulator<double, tudat::Time>::integrateEquationsOfMotion(Eigen::Matrix<double, -1, -1, 0, -1, -1> const&) + 1254
    frame #1: 0x00000001000f65d8 test_CowellStateDerivative`tudat::propagators::SingleArcDynamicsSimulator<double, tudat::Time>::SingleArcDynamicsSimulator(std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, boost::shared_ptr<tudat::simulation_setup::Body>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, boost::shared_ptr<tudat::simulation_setup::Body> > > > const&, boost::shared_ptr<tudat::numerical_integrators::IntegratorSettings<tudat::Time> >, boost::shared_ptr<tudat::propagators::PropagatorSettings<double> >, bool, bool, bool) + 1016
    frame #2: 0x00000001000132ad test_CowellStateDerivative`void tudat::unit_tests::test_cowell_propagator::testCowellPropagationOfKeplerOrbit<tudat::Time, double>() + 5213
    frame #3: 0x000000010000cdb3 test_CowellStateDerivative`tudat::unit_tests::test_cowell_propagator::testCowellPopagatorKeplerCompare_invoker() + 19
    frame #4: 0x00000001003e3771 test_CowellStateDerivative`boost::unit_test::ut_detail::callback0_impl_t<int, boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<boost::unit_test::callback0<boost::unit_test::ut_detail::unused> > >::invoke() + 17
    frame #5: 0x00000001003d1e20 test_CowellStateDerivative`boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) + 160
    frame #6: 0x00000001003d1ef7 test_CowellStateDerivative`boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) + 39
    frame #7: 0x00000001003e34ff test_CowellStateDerivative`boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) + 159
    frame #8: 0x00000001003d7ec8 test_CowellStateDerivative`boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) + 168
    frame #9: 0x0000000100406249 test_CowellStateDerivative`boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) + 297
    frame #10: 0x0000000100406269 test_CowellStateDerivative`boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) + 329
    frame #11: 0x00000001003d58b1 test_CowellStateDerivative`boost::unit_test::framework::run(unsigned long, bool) + 4001
    frame #12: 0x00000001003e19e3 test_CowellStateDerivative`boost::unit_test::unit_test_main(boost::unit_test::test_suite* (*)(int, char**), int, char**) + 211
    frame #13: 0x00007fff904d9255 libdyld.dylib`start + 1
    frame #14: 0x00007fff904d9255 libdyld.dylib`start + 1
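
A possible reading of the faulting instruction above, stated as an assumption rather than a diagnosis made in this thread: movdqa with a memory operand requires that operand to be 16-byte aligned, and a misaligned operand raises a general-protection fault, which would be consistent with the EXC_I386_GPFLT stop reason. The trace alone does not establish that alignment is the actual cause. The sketch below only shows how an address can be checked for 16-byte alignment; isAligned and TwoDoubles are hypothetical names introduced for illustration and are not part of Tudat or Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when `address` is a multiple of `alignment` bytes.
// `alignment` must be a power of two (16 for aligned SSE moves such as movdqa).
inline bool isAligned(const void* address, std::size_t alignment = 16) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(address) & (alignment - 1)) == 0;
}

// Hypothetical stand-in for a 16-byte-aligned pair of doubles (NOT an Eigen type).
struct alignas(16) TwoDoubles {
    double values[2];
};
```

For a TwoDoubles object named block, isAligned(&block) holds, while the address 8 bytes into it is only 8-byte aligned, so an aligned 16-byte store to that address would fault.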

@DominicDirkx
Member

@eurospaceflight Like @magnific0 said, this seems to be one of those fun little errors that go away when you start looking for them (sort of like a Schrödinger's bug...). I'll run the program with valgrind on my computer; hopefully that will give us some extra information.

@DominicDirkx
Member

DominicDirkx commented Mar 7, 2017

@eurospaceflight I have a few other ideas to try to figure out this bug. It all seems to be happening in the execution of the integration of the second unit test.

First, could you try commenting out, in three separate runs, the last three, then the last two, and then only the last of the following lines:

    testCowellPropagationOfKeplerOrbit< double, double >( );
    testCowellPropagationOfKeplerOrbit< double, long double >( );
    testCowellPropagationOfKeplerOrbit< Time, double >( );
    testCowellPropagationOfKeplerOrbit< Time, long double >( );

way at the bottom of the file, and let me know what the result is? This will let us check if the problem is with a specific combination of state scalar/time types.
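
As a complement to commenting lines in and out, each call can be preceded by a flushed log line, so that the last label printed before a crash identifies the combination. This is only a sketch under stated assumptions: Time is a hypothetical stand-in for tudat::Time, the test body is an empty placeholder, and runLabelled and runAllCombinations are names invented here, not Tudat functions.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for tudat::Time, only so the sketch compiles on its own.
struct Time {
    double seconds;
};

// Placeholder for the real templated test; this empty body is NOT Tudat code.
template <typename TimeType, typename StateScalarType>
void testCowellPropagationOfKeplerOrbit() {}

// Logs and flushes a label before running one instantiation, so that if the
// call crashes, the last label written names the offending combination.
template <typename TimeType, typename StateScalarType>
void runLabelled(const std::string& label, std::vector<std::string>& completed) {
    assert(!label.empty());  // an empty label could not identify anything
    std::cerr << "starting " << label << std::endl;  // std::endl flushes
    testCowellPropagationOfKeplerOrbit<TimeType, StateScalarType>();
    completed.push_back(label);
}

// Runs the four combinations in the order listed above and returns the
// labels of those that finished.
std::vector<std::string> runAllCombinations() {
    std::vector<std::string> completed;
    runLabelled<double, double>("< double, double >", completed);
    runLabelled<double, long double>("< double, long double >", completed);
    runLabelled<Time, double>("< Time, double >", completed);
    runLabelled<Time, long double>("< Time, long double >", completed);
    return completed;
}
```

With the placeholder body all four labels are returned; with a real test that crashes, the last "starting ..." line on stderr shows which template arguments were in use.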

Second, could you try changing:

        SingleArcDynamicsSimulator< StateScalarType, TimeType > dynamicsSimulator(
                    bodyMap, integratorSettings, propagatorSettings, true, false, true );

        Eigen::Matrix< StateScalarType, 6, 1  > initialKeplerElements =
            orbital_element_conversions::convertCartesianToKeplerianElements< StateScalarType >(
                Eigen::Matrix< StateScalarType, 6, 1  >( systemInitialState ), effectiveGravitationalParameter );


        // Compare numerical state and kepler orbit at each time step.
        boost::shared_ptr< Ephemeris > moonEphemeris = bodyMap.at( "Moon" )->getEphemeris( );
        double currentTime = initialEphemerisTime + buffer;
        while( currentTime < finalEphemerisTime - buffer )
        {

            Eigen::VectorXd stateDifference
                = ( orbital_element_conversions::convertKeplerianToCartesianElements(
                    propagateKeplerOrbit< StateScalarType >( initialKeplerElements, currentTime - initialEphemerisTime,
                                          effectiveGravitationalParameter ),
                    effectiveGravitationalParameter )
                - moonEphemeris->template getTemplatedStateFromEphemeris< StateScalarType >( currentTime ) ).
                    template cast< double >( );

            for( int i = 0; i < 3; i++ )
            {
                BOOST_CHECK_SMALL( stateDifference( i ), 1E-3 );
                BOOST_CHECK_SMALL( stateDifference( i  + 3 ), 1.0E-9 );

            }
            currentTime += 10000.0;
        }

to

    SingleArcDynamicsSimulator< StateScalarType, TimeType > dynamicsSimulator(
                    bodyMap, integratorSettings, propagatorSettings, true, false, true );

and see if it runs properly? I'd be quite surprised to see any change, but you never know. Afterwards, could you change it to:

```
SingleArcDynamicsSimulator< StateScalarType, TimeType > dynamicsSimulator(
                bodyMap, integratorSettings, propagatorSettings, true, false, false );
```

and try again?

With a little luck, this last one will run and give us some idea of how to fix it. The main difference in this last version is that the numerically integrated state is not used to update the ephemeris of the propagated body.

Let me know, whenever you have time, what the outcome is,

Cheers,

Dominic

@DominicDirkx
Member

There has not been any progress on this issue for some time. Also, there have been no reports from new users of this issue occurring.

@transferorbit Could you pull the development branch (tudatBundle and tudat) to see if this issue persists?

@DominicDirkx
Member

There has been no progress or report on this in about 6 months, so I'm closing this issue. If it recurs during Mac tests of the latest code, it should be tracked in a new issue.


5 participants