Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::runtime_error what(): 'data' is empty with component container multithread #2423

Open
giafranchini opened this issue Feb 8, 2024 · 10 comments
Labels
more-information-needed Further information is required

Comments

@giafranchini
Copy link

giafranchini commented Feb 8, 2024

My setup:

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • ROS2 Iron from source
  • Version or commit hash:
  • DDS implementation:
    • rmw_fastrtps_cpp
  • Client library (if applicable):
    • rclcpp

I am developing a state estimation application based on fuse package on my robot. I would like to have two separate callback groups, a dedicated one for a rclpp::Timer here, and the node default one for handling callbacks from rclcpp::Subscriptions. In order for the two to work in parallel, I would like to use a multithreaded executor with num_threads = 2. However, when I launch my node inside a component_container_mt it shuts down with the following error:

[component_container_mt-3] terminate called after throwing an instance of 'std::runtime_error' [component_container_mt-3] what(): 'data' is empty

I am running into this issue while deploying the package on my robot architecture, which is a VersaLogic Copperhead VL-EBX-41.

@clalancette
Copy link
Contributor

Can you please provide some example code that shows the problem? That would help a lot in debugging this.

@clalancette clalancette added the more-information-needed Further information is required label Feb 8, 2024
@giafranchini
Copy link
Author

It's difficult to provide a simple example since fuse is a quite articulated package and a lot of its subsytems are needed to reproduce. I can try to explain better:

There is a main node, the optimizer, which during construction:

  • starts a timer in a dedicated callback group;
  • initializes one or more plugins which interface with external sensors via subscriptions.

During plugins initialization, a new executor and callback group are created with the same context as the main node, and a new thread is started to let the executor spin. The process can be found here: https://github.com/giafranchini/fuse/blob/25b4b9ec4a36886a726f661818cbff4abeebb7d4/fuse_core/src/async_sensor_model.cpp#L59

@fujitatomoya
Copy link
Collaborator

fujitatomoya commented Feb 8, 2024

#2242 looks similar problem. Edit: probably not, sorry

@giafranchini
Copy link
Author

giafranchini commented Feb 12, 2024

I setted up a minimal example in which I am able to reproduce the error. Please follow these steps:

This will run a simple example using bags already available in the package.
I also report here the core part of the launch file:

container_fuse = ComposableNodeContainer(
        name='container_fuse',
        namespace='',
        package='rclcpp_components',
        executable='component_container_mt',
        composable_node_descriptions=[
            ComposableNode(
                package='fuse_optimizers',
                plugin='fuse_optimizers::FixedLagSmootherComponent',
                name='state_estimator',
                parameters=[PathJoinSubstitution([
                    pkg_dir, 'config', 'fuse_simple_tutorial.yaml'
                ])],
                extra_arguments=[{'use_intra_process_comms': True}],
            ),
        ],
        output='both',
)

@alsora
Copy link
Collaborator

alsora commented Feb 12, 2024

@giafranchini could you also provide a backtrace for the error?

@giafranchini
Copy link
Author

giafranchini commented Feb 12, 2024

Sure, here it is:

[component_container_mt-1] [INFO] [1707748366.188400210] [container_fuse]: Load Library: /home/giacomo/ros2_iron_ws/install/fuse_optimizers/lib/libfixed_lag_smoother_component.so
[component_container_mt-1] [INFO] [1707748366.258572968] [container_fuse]: Found class: rclcpp_components::NodeFactoryTemplate<fuse_optimizers::FixedLagSmootherComponent>
[component_container_mt-1] [INFO] [1707748366.258688353] [container_fuse]: Instantiate class: rclcpp_components::NodeFactoryTemplate<fuse_optimizers::FixedLagSmootherComponent>
[component_container_mt-1] [INFO] [1707748366.447123820] [state_estimator]: Received a set_pose request (stamp: 1703239942730964210, x: 0, y: 0, z: 0, roll: 0, pitch: 0, yaw: 0)
[component_container_mt-1] [WARN] [1707748366.468987638] [fuse]: Second argument to joinTopicName is absolute! Returning it.
[component_container_mt-1] [INFO] [1707748366.471217372] [state_estimator]: FixedLagSmootherComponent constructed
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/state_estimator' in container '/container_fuse'
[component_container_mt-1] terminate called after throwing an instance of 'std::runtime_error'
[component_container_mt-1]   what():  'data' is empty
[ERROR] [component_container_mt-1]: process has died [pid 76513, exit code -6, cmd '/home/giacomo/ros2_iron/install/rclcpp_components/lib/rclcpp_components/component_container_mt --ros-args -r __node:=container_fuse -r __ns:=/ -p use_sim_time:=True'].

I also noticed that the error does not show up immediately every time I launch the container: sometimes the node is able to run for some small time before dying with the same error.

@alsora
Copy link
Collaborator

alsora commented Feb 12, 2024

@giafranchini that's not a backtrace, but just the console log.
See here a tutorial on how to get a backtrace: https://navigation.ros.org/tutorials/docs/get_backtrace.html

@giafranchini
Copy link
Author

Sorry, here it is:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737104520768)
    at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737104520768)
    at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737104520768, signo=signo@entry=6)
    at ./nptl/pthread_kill.c:89
#3  0x00007ffff6642476 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#4  0x00007ffff66287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff6aa2b9e in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff6aae20c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff6aae277 in std::terminate() ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007ffff6aae4d8 in __cxa_throw ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007fffbd88e75d in fuse_core::CallbackAdapter::execute (
    this=<optimized out>, data=...)
    at /home/giacomo/ros2_iron_ws/src/fuse/fuse_core/src/callback_wrapper.cpp:113
#10 0x00007ffff790b95c in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#11 0x00007ffff7936b91 in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#12 0x00007ffff7938c4d in void std::__invoke_impl<void, void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>(std::__invoke_memfun_deref, void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&) ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#13 0x00007ffff7938b5f in std::__invoke_result<void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>::type std::__invoke<void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>(void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&) ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#14 0x00007ffff7938a68 in void std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>)
    () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#15 0x00007ffff79389ab in void std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>::operator()<, void>() ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#16 0x00007ffff7938952 in void std::__invoke_impl<void, std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, --Type <RET> for more, q to quit, c to continue without paging--
unsigned long))(unsigned long)>>(std::__invoke_other, std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>&&) ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#17 0x00007ffff79388fb in std::__invoke_result<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>>::type std::__invoke<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>>(std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>&&)
    () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#18 0x00007ffff793889c in void std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > >::_M_invoke<0ul>(std::_Index_tuple<0ul>) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#19 0x00007ffff793886c in std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > >::operator()() ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#20 0x00007ffff793884c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > > >::_M_run() ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#21 0x00007ffff6adc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#22 0x00007ffff6694ac3 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:442
#23 0x00007ffff6726850 in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

@dsandber
Copy link

dsandber commented Mar 16, 2024

I'm seeing this also with Ubuntu 22.04, ROS Iron from ubuntu binaries, and the multi-threaded executor.

move_group-3] terminate called after throwing an instance of 'std::runtime_error'
[move_group-3]   what():  'data' is empty
[move_group-3] Stack trace (most recent call last) in thread 1495:
[move_group-3] #13   Object "/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1", at 0xffffffffffffffff, in
[move_group-3] #12   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffb5fb5edb, in
[move_group-3] #11   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffb5f4d5c7, in
[move_group-3] #10   Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0xffffb61831fb, in
[move_group-3] #9    Object "/opt/ros/iron/lib/librclcpp.so", at 0xffffb643b32b, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long)
[move_group-3] #8    Object "/opt/ros/iron/lib/librclcpp.so", at 0xffffb642b297, in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&)
[move_group-3] #7    Object "/opt/ros/iron/lib/librclcpp_action.so", at 0xffffb48b0b0b, in rclcpp_action::ServerBase::exec

@giafranchini, found a work-around other than not using the multi-threaded executor?

I have my own thread publishing JointStates -- is that problematic?

@giafranchini
Copy link
Author

@dsandber I found out that in my application there is a class which inherits from rclcpp::Waitable which is causing the error since sometimes with the multi-threaded executor the data argument in the class execute method is nullptr, but I am not sure why this is happening. Anyway, I was able to make it work adding the waitables to a dedicated Reentrant callback group. Hope this helps

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
more-information-needed Further information is required
Projects
None yet
Development

No branches or pull requests

5 participants