-
Notifications
You must be signed in to change notification settings - Fork 782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ros2] Improve executor destruction #860
Conversation
gazebo_ros/src/node.cpp
Outdated
@@ -28,6 +28,7 @@ std::mutex Node::lock_; | |||
|
|||
Node::~Node() | |||
{ | |||
executor_->remove_node(get_node_base_interface()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might not have enough insights into this, but do you really need this? I thought the executor only stores weak pointers to the node and removes them once not active anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a full grasp but the docs here: https://github.com/ros2/rclcpp/blob/master/rclcpp/include/rclcpp/executor.hpp#L135-L136 make me think that explicitly removing it might wake up an executor when just letting the weak reference expire might not. If that's the case my inclination would be to explicitly pass True rather than relying on it being the default value forever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but do you really need this? I thought the executor only stores weak pointers to the node and removes them once not active anymore.
I'd also expect this not to be needed, but without explicitly removing the node, the executor never stops spinning even after all nodes it holds are destroyed.
my inclination would be to explicitly pass True rather than relying on it being the default value forever
Makes sense, I'll force it to be true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without explicitly removing the node, the executor never stops spinning even after all nodes it holds are destroyed
It might be worth checking with rclcpp devs whether relying on weak refs is the intention since any node could potentially be the last node and cause the indefinite spin that explicit removal and wakeup avoids. It might be that weak references are used just to protect from asynchronous node expiration but that explicit removal is still expected.
Edit: It needs to be said I'm speculating wildly. I'll be more impressed than anyone if I'm even half right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forced true in 28e84a5
So I realize that this PR didn't have enough context to begin with, sorry about that. Here's a summary of what happens when the node doesn't remove itself:
I added these print statements:
diff --git a/gazebo_ros/src/executor.cpp b/gazebo_ros/src/executor.cpp
index db4d6fb..f90d7c1 100644
--- a/gazebo_ros/src/executor.cpp
+++ b/gazebo_ros/src/executor.cpp
@@ -37,7 +37,9 @@ Executor::~Executor()
void Executor::run()
{
+std::cout << "spin" << std::endl;
spin();
+std::cout << "/spin" << std::endl;
}
void Executor::shutdown()
diff --git a/gazebo_ros/src/node.cpp b/gazebo_ros/src/node.cpp
index 7c7434b..ca73c18 100644
--- a/gazebo_ros/src/node.cpp
+++ b/gazebo_ros/src/node.cpp
@@ -28,7 +28,8 @@ std::mutex Node::lock_;
Node::~Node()
{
- executor_->remove_node(get_node_base_interface(), true);
+// executor_->remove_node(get_node_base_interface(), true);
+ std::cout << this << " ~Node" << std::endl;
}
Node::SharedPtr Node::Get(sdf::ElementPtr sdf)
diff --git a/gazebo_ros/test/test_node.cpp b/gazebo_ros/test/test_node.cpp
index 6a1c88c..3051f77 100644
--- a/gazebo_ros/test/test_node.cpp
+++ b/gazebo_ros/test/test_node.cpp
@@ -77,8 +77,11 @@ TEST(TestNode, GetSdf)
EXPECT_NE(node_1, node_2);
// Reset both
+std::cout << "Reset node 1: " << node_1 << std::endl;
node_1.reset();
+std::cout << "Reset node 2: " << node_2 << std::endl;
node_2.reset();
+std::cout << "Reseted both notes" << std::endl;
// Create another node
auto node_3 = gazebo_ros::Node::Get(plugin_sdf);
And get this output:
11: spin
11: Reset node 1: 0x55f7e35c2930
11: 0x55f7e35c2930 ~Node
11: Reset node 2: 0x55f7e483fe60
11: 0x55f7e483fe60 ~Node
# test is blocked here forever and eventually times out
So I'm not sure what's happening, but trying to destroy node_2
using node_2.reset()
blocks forever without this fix. I haven't dug into the rclcpp
code; there may be a problem there, but it's also possible we're just using it wrong.
For comparison, here's the result if I uncomment the remove_node
:
11: spin
11: Reset node 1: 0x564c17606600
11: 0x564c17606600 ~Node
11: Reset node 2: 0x564c18801b60
11: 0x564c18801b60 ~Node
11: /spin
11: Reseted both notes
11: [INFO] [gazebo_ros_node]: ROS was initialized without arguments.
11: spin
11: 0x564c178ee370 ~Node
11: /spin
Note how spin()
returns after the node_2
is reset, and re-spins for node_3
.
TODO(@chapulina) : rebase against new |
This improves the situation on #855 for me locally. It goes from always failing to failing once every 10 runs 😅