rviz2 Docking Panel plugin randomly crashes when system is under high load #4689

Closed
azeey opened this issue Sep 24, 2024 · 8 comments

Comments

@azeey
Contributor

azeey commented Sep 24, 2024

Bug report

Required Info:

  • Operating System: Ubuntu 24.04 inside Docker
  • ROS2 Version: Rolling, built from source
  • Version or commit hash: 771eca4
  • DDS implementation: default

We've been experiencing a lot of random crashes while preparing the Gazebo Ionic demo that features Nav2 (see https://github.com/gazebosim/ionic_demo). It seems to be related to system load as it occurred more frequently when I was on a video call testing out the demo.

Steps to reproduce issue

  1. Run stress to create high load on your machine. I ran stress -c 16 -m 8 on my laptop (16 cores, 32 GB RAM).
  2. ros2 launch nav2_bringup tb4_simulation_launch.py headless:=False
    • You might have to run this a few times depending on your system

Expected behavior

rviz2 runs without issues

Actual behavior

rviz2 starts and then crashes immediately.

The backtrace from a core dump points to DockingPanel:

#0  0x0000790b703b9bdb in rclcpp::ParameterValue::get<(rclcpp::ParameterType)9> (this=0x20) at /usr/src/ros-rolling-rclcpp-28.3.3-1noble.20240729.171300/include/rclcpp/parameter_value.hpp:244
#1  rclcpp::Parameter::get_value<(rclcpp::ParameterType)9> (this=0x0) at /usr/src/ros-rolling-rclcpp-28.3.3-1noble.20240729.171300/include/rclcpp/parameter.hpp:119
#2  rclcpp::Parameter::as_string_array[abi:cxx11]() const (this=0x0) at /usr/src/ros-rolling-rclcpp-28.3.3-1noble.20240729.171300/src/rclcpp/parameter.cpp:141
#3  0x0000790b08d617dd in nav2_rviz_plugins::pluginLoader(std::shared_ptr<rclcpp::Node>, bool&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, QComboBox*) () at /root/ws/install/nav2_rviz_plugins/lib/libnav2_rviz_plugins.so
#4  0x0000790b08c8e2d3 in nav2_rviz_plugins::DockingPanel::timerEvent(QTimerEvent*) () at /root/ws/install/nav2_rviz_plugins/lib/libnav2_rviz_plugins.so
#5  0x0000790b707b924b in QObject::event(QEvent*) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#6  0x0000790b70b91d45 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /lib/x86_64-linux-gnu/libQt5Widgets.so.5
#7  0x0000790b7078b118 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#8  0x0000790b707e75ab in QTimerInfoList::activateTimers() () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#9  0x0000790b707e7f11 in ??? () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#10 0x0000790b6e88e5b5 in ??? () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x0000790b6e8ed717 in ??? () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x0000790b6e88da53 in g_main_context_iteration () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x0000790b707e8279 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x0000790b70789a7b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#15 0x0000790b707923e8 in QCoreApplication::exec() () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x000063239d56bc97 in main (argc=6, argv=0x7ffd9082f8c8) at /usr/src/ros-rolling-rviz2-14.2.5-1noble.20240820.020548/src/main.cpp:92
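
Frames #0 to #2 show as_string_array() running with this=0x0, so pluginLoader appears to have read a parameter object that did not exist. One plausible route there, which is an assumption and not something confirmed in this thread, is taking the first element of an empty result because the remote node had not answered yet. The sketch below shows the kind of guard that would avoid it, using only the standard library. StringArray and first_string_array are hypothetical stand-ins, not the rclcpp or nav2_rviz_plugins API.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one parameter holding a string array.
// This is NOT rclcpp::Parameter; it only mimics the shape of the data.
using StringArray = std::vector<std::string>;

// Returns the first entry of a query result, or std::nullopt when the
// result is empty (for example because the remote node did not answer in time).
std::optional<StringArray> first_string_array(const std::vector<StringArray> & result)
{
  if (result.empty()) {
    return std::nullopt;  // nothing to read yet; the caller should retry later
  }
  return result.front();
}
```

A caller would fill its combo box only when the returned optional holds a value, and otherwise leave its "loaded" flag unset so that the next timer tick tries again.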

Additional information

The crash doesn't seem to happen if I disable use_composition.

@ajtudela
Contributor

It seems to be related to the timerEvent, which waits until the docking_server is up and then loads the plugins. Your docking server is not running when it crashes, right?

The plugin loader is also used in the SelectorPanel. Does this happen when you enable the SelectorPanel?

@SteveMacenski
Member

@azeey can you respond to @ajtudela's request for info? He's the original author of that panel and is best placed to solve the issue once he has some more info.

@azeey
Contributor Author

azeey commented Oct 2, 2024

Sorry, this slipped my mind. Last I checked, it didn't happen with the SelectorPanel enabled as long as the Docking panel is disabled. I can check again tomorrow if you'd like.

@SteveMacenski
Member

Thanks!

@ajtudela
Contributor

ajtudela commented Oct 4, 2024

I'm trying to reproduce the crash with my setup (Ubuntu 24.04, rolling, main) without success. Sometimes, when the CPU is under stress, rviz hangs for a few seconds, but it recovers.

However, I'm working on an improved state machine for the panel that will hopefully fix this and this: #4458 (comment)

@ajtudela
Contributor

ajtudela commented Oct 7, 2024

It was a race condition, difficult to catch, but I fixed it!

@SteveMacenski could you check this branch: https://github.com/ajtudela/navigation2/tree/improve_panel using the new non-charging dock to check that there are no issues?

Thanks

@SteveMacenski
Member

SteveMacenski commented Oct 7, 2024

Software-wise it looks good! A few nits: for example, when run() is waiting on the action server, log something to let the user know it's waiting on something.

Does this solve the crash? If so, I can test the state machine, but I trust @ajtudela did this well 😄
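
The logging nit above could look roughly like the sketch below: a polling wait that reports what it is waiting on before each retry. This is a minimal standard-library illustration under stated assumptions. wait_until_ready, is_ready, and log are hypothetical names, and nothing here is taken from the DockingPanel code or from the branch under review.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Polls `is_ready` up to `max_attempts` times, sleeping `period` after each
// failed poll, then checks one final time. Calls `log` before every sleep so
// the user can see what is being waited on.
// All names here are hypothetical; this is not the DockingPanel code.
bool wait_until_ready(
  const std::function<bool()> & is_ready,
  const std::function<void(const std::string &)> & log,
  int max_attempts,
  std::chrono::milliseconds period)
{
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    if (is_ready()) {
      return true;
    }
    log("Waiting for the action server (attempt " + std::to_string(attempt) + ")");
    std::this_thread::sleep_for(period);
  }
  return is_ready();  // last check after the final sleep
}
```

In a Qt panel one would normally poll from a timer or a worker thread instead of sleeping on the GUI thread; the blocking sleep is used here only to keep the sketch self-contained.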

@SteveMacenski
Member

#4717 resolves
