Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High cpu usage when creating a node in on_activate of lifecycle node #1242

Closed
mdtoom opened this issue Jul 28, 2020 · 5 comments
Closed

High cpu usage when creating a node in on_activate of lifecycle node #1242

mdtoom opened this issue Jul 28, 2020 · 5 comments

Comments

@mdtoom
Copy link

mdtoom commented Jul 28, 2020

Bug report

Required Info:

  • Operating System:

    • Ubuntu 20.04
  • Installation type:

    • binaries
  • Version or commit hash:

    • 2.0.2
  • DDS implementation:

    • Fast-RTPS
  • Client library (if applicable):

    • rclcpp-lifecycle

Steps to reproduce issue

Run this node and start it using a life cycle manager.

#include <rclcpp/rclcpp.hpp>
#include <rclcpp_lifecycle/lifecycle_node.hpp>

using LifecycleCallbackReturn = rclcpp_lifecycle::node_interfaces::LifecycleNodeInterface::CallbackReturn;

class TestNode : public rclcpp_lifecycle::LifecycleNode
{
public:
    explicit TestNode(const rclcpp::NodeOptions options)
      : rclcpp_lifecycle::LifecycleNode("test_node", options)
    {}

    LifecycleCallbackReturn on_configure(const rclcpp_lifecycle::State&) { return LifecycleCallbackReturn::SUCCESS; }

    LifecycleCallbackReturn on_activate(const rclcpp_lifecycle::State&)
    {
        auto node = std::make_shared<rclcpp::Node>("service_node");
        return LifecycleCallbackReturn::SUCCESS;
    }

    LifecycleCallbackReturn on_deactivate(const rclcpp_lifecycle::State&) { return LifecycleCallbackReturn::SUCCESS; }

    LifecycleCallbackReturn on_cleanup(const rclcpp_lifecycle::State&) { return LifecycleCallbackReturn::SUCCESS; }

    LifecycleCallbackReturn on_shutdown(const rclcpp_lifecycle::State&) { return LifecycleCallbackReturn::SUCCESS; }

    LifecycleCallbackReturn on_error(const rclcpp_lifecycle::State&) { return LifecycleCallbackReturn::SUCCESS; }
};


int main(int argc, char * argv[])
{
  rclcpp::init(argc, argv);

  rclcpp::executors::SingleThreadedExecutor exec;
  rclcpp::NodeOptions options;

  auto scheduler = std::make_shared<TestNode>(options);
  exec.add_node(scheduler->get_node_base_interface());

  exec.spin();

  rclcpp::shutdown();

  return 0;
}

Expected behavior

Node should start and activate after which CPU usage should get low.

Actual behavior

Node does start (The lifecycle manager indicates it succeeded and I have had this issue with a more complex node as well, which responded after starting) but the CPU usage remains around 100% (viewed using htop).

cpu_usage_test_node

Additional information

Node creation is required to call a service in the on_activate

@iuhilnehc-ynos
Copy link
Collaborator

@mdtoom

I can reproduce the issue by your source code.
https://github.com/iuhilnehc-ynos/ros2_issues_test.git

# terminal 1
After building, source and run the test,
$ ros2 run rclcpp_issue_1242 main

# terminal 2
$ ros2 lifecycle set /test_node 1         // to configure
$ ros2 lifecycle set /test_node 3         // to active

# terminal 3
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
%Cpu(s): 17.6 us,  1.0 sy,  0.0 ni, 78.1 id,  0.5 wa,  0.0 hi,  2.8 si,  0.0 st
MiB Mem :  15921.1 total,    333.2 free,  11939.9 used,   3648.0 buff/cache
MiB Swap:  34505.0 total,  34273.7 free,    231.3 used.   2430.4 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                
 575136 whoami    20   0  477976  20608  14612 R 100.0   0.1   3:34.90 main

I used 'RMW_IMPLEMENTATION=rmw_cyclonedds_cpp' to test again, the issue didn't happen.
Maybe this is an issue of FastRTPS.

Run the test with '--ros-args --log-level debug'

# FastRTPS

[DEBUG] [1595988117.881626160] [rcl]: Waiting without timeout
[DEBUG] [1595988117.881635321] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1595988117.881644793] [rcl]: Subscription in wait set is ready
[DEBUG] [1595988117.881652855] [rcl]: Guard condition in wait set is ready
[DEBUG] [1595988117.881666861] [rcl]: Subscription taking message
[DEBUG] [1595988117.881676646] [rcl]: Subscription take succeeded: false
[DEBUG] [1595988117.881685374] [rclcpp]: executor taking a message from topic '/parameter_events' failed to take anything
[DEBUG] [1595988117.881698240] [rcl]: Waiting without timeout
[DEBUG] [1595988117.881706297] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1595988117.881713893] [rcl]: Subscription in wait set is ready
[DEBUG] [1595988117.881720939] [rcl]: Guard condition in wait set is ready
[DEBUG] [1595988117.881729695] [rcl]: Subscription taking message
[DEBUG] [1595988117.881739067] [rcl]: Subscription take succeeded: false
[DEBUG] [1595988117.881747769] [rclcpp]: executor taking a message from topic '/parameter_events' failed to take anything
[DEBUG] [1595988117.881759514] [rcl]: Waiting without timeout
[DEBUG] [1595988117.881768658] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1595988117.881777401] [rcl]: Subscription in wait set is ready
[DEBUG] [1595988117.881785273] [rcl]: Guard condition in wait set is ready
[DEBUG] [1595988117.881795406] [rcl]: Subscription taking message
[DEBUG] [1595988117.881803383] [rcl]: Subscription take succeeded: false
[DEBUG] [1595988117.881810434] [rclcpp]: executor taking a message from topic '/parameter_events' failed to take anything
<try to get the message from topic '/parameter_events'  again and again> 

# CycloneDDS

[DEBUG] [1595988695.099822742] [rcl]: Waiting without timeout
[DEBUG] [1595988695.099845416] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1595988695.124204148] [rcl]: Subscription in wait set is ready
[DEBUG] [1595988695.124274849] [rcl]: Subscription taking message
[DEBUG] [1595988695.124286621] [rcl]: Subscription take succeeded: false
[DEBUG] [1595988695.124309165] [rclcpp]: executor taking a message from topic '/parameter_events' failed to take anything
[DEBUG] [1595988695.124345430] [rcl]: Waiting without timeout
[DEBUG] [1595988695.124353685] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1595988695.124362335] [rcl]: Guard condition in wait set is ready
[DEBUG] [1595988695.124373209] [rcl]: Waiting without timeout
[DEBUG] [1595988695.124380281] [rcl]: Timeout calculated based on next scheduled timer: false
<no  message any more>

@iuhilnehc-ynos
Copy link
Collaborator

After checking, I found it's a problem related to rmw_fastrtps_cpp.

https://github.com/ros2/rmw_fastrtps/blob/8fc9ca3d98a64118590a205b2162559e17309a27/rmw_fastrtps_shared_cpp/src/rmw_take.cpp#L77-L86

patch as follow,

diff --git a/rmw_fastrtps_shared_cpp/src/rmw_take.cpp b/rmw_fastrtps_shared_cpp/src/rmw_take.cpp
index 1bd9327..6d590fb 100644
--- a/rmw_fastrtps_shared_cpp/src/rmw_take.cpp
+++ b/rmw_fastrtps_shared_cpp/src/rmw_take.cpp
@@ -83,6 +83,8 @@ _take(
       }
       *taken = true;
     }
+  } else {
+    info->listener_->data_taken(info->subscriber_);        // maybe use a new name (such as 'check_data') to replace 'data_taken'
   }
 
   return RMW_RET_OK;

@fujitatomoya
Copy link
Collaborator

@iuhilnehc-ynos

it would be helpful if we make the PR to rmw_fastrtps !!!

CC: @MiguelCompany

@iuhilnehc-ynos
Copy link
Collaborator

@fujitatomoya

it would be helpful if we make the PR to rmw_fastrtps !!!

Got it. I'll do it.

@fujitatomoya
Copy link
Collaborator

@dirk-thomas

I think that you can close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants