Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added preemption to recovery server #1644

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions nav2_behavior_tree/include/nav2_behavior_tree/bt_action_node.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,11 @@ class BtActionNode : public BT::ActionNodeBase
{
}

// Can be overwritten to trigger specific behaviours when a new goal is sent to the BtNavigator
virtual void on_goal_preempted()
{
}

// Called upon successful completion of the action. A derived class can override this
// method to put a value on the blackboard, for example.
virtual BT::NodeStatus on_success()
Expand Down Expand Up @@ -140,6 +145,12 @@ class BtActionNode : public BT::ActionNodeBase
// user defined callback. May modify the value of "goal_updated_"
on_wait_for_result();

// Check if the navigation goal has been updated
if (config().blackboard->get<bool>("goal_preempted")) {
config().blackboard->set("goal_preempted", false);
on_goal_preempted();
Copy link
Member

@SteveMacenski SteveMacenski Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, can't recovery behaviors, while being preempted, be thought of as being cancelled? Since the preemption of the navigation request should effectively stop all recoveries, the difference between a cancellation and a preemption isn't really anything.

If that's true, then I think we can directly cancel or preempt the goal at this point. What you did instead looks like sending a new null goal that will immediately success in its current pose. I think using the goal_handle_ to preempt it is the right answer in this case (also, the action node is used for all servers including planning and recovery so this might call other non-recovery action servers). Then we can remove the on_goal_preempted methods from each implementation. Their respective recovery server implementations just need to handle the cancellation cleanly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea with this was to allow all action nodes to do something on preemption.
I think it also makes sense since the BtNavigator does not know which node is being executed at the moment, and if the preemption flag is not cleared after the preemption is processed, the recovery behaviours would be cancelled when executed after a preemption is done, no matter on which node.
I thought of this as a simple solution where whatever node is being executed, it is notified of the preemption and clears the preemption flag. Then the on_goal_preempted call can be overriten on each specific action node to trigger a different behavior if necessary.
What do you think?

Copy link
Member

@SteveMacenski SteveMacenski Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allow all action nodes to do something on preemption

I think that sounds OK as long as there's actually a node that needs it. But I don't think that should be misconstrued with actually doing the preemption. If we're being asked to preempt or cancel, we should preempt or cancel, but then a separate method to also do something before or after sounds OK.

You could also just have the BT Action Node just preempt or cancel the servers running. You effectively do that by just setting a null goal that returns immediately, but really it should be dealt with in the server instance itself. That code also impacts the controller and planner and isn't rational for those cases.

The BT Action node is the interface for the BT to the action servers, if the navigation request was preempted or cancelled, the respective implementations should handle and communicate that to their servers. For the recoveries, that means push along the preempt / cancellation request for it to handle (which they should then just immediately return). For the controller, probably nothing needs to change on preemption (cancel should stop). For the planner cancel should stop, but unclear if the preemption should stop as well, I'm leaning towards "yes" but that's a topic for another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't think that should be misconstrued with actually doing the preemption

Maybe we can then change the call from on_goal_preempted to on_navigation_goal_preempted or something like that? I understand that it is quite confusing to talk about preemption of the server instances (local to each server's behaviour) and the navigation goal (a global goal).

That code also impacts the controller and planner and isn't rational for those cases.

I believe that by using a similar structure as we do with on_tick or on_wait_for_result, we should not affect any node that does not define a specific behaviour for on_goal_preempted, except from the extra management of the preemption flag. This shouldn't add too much overhead, but could it be an issue?

if the navigation request was preempted or cancelled, the respective implementations should handle and communicate that to their servers.

The existing implementation handles the cancel on the halt function of the BT Action Node, effectively sending a cancel request to their respective servers. The problem with the halt is that it recursively stops the activity of all nodes in the tree. The reason to not do a similar thing on a preemption is that as you mention, the preemption shouldn't affect all nodes equally.
The intention of adding a new function when a navigation preemption is recieved is to allow the nodes to do something about it. This is at the moment only interesting for the recoveries, but it opens new chances for other action nodes to change their behaviour when the global goal changes.

For the recoveries, that means push along the preempt / cancellation request for it to handle (which they should then just immediately return)

If I understand you correctly, you would like to have the action servers handle the global goal preemption on their own. To do so, we would need to add a new callback on the action server to handle the preemptions. I could do that by subclassing the action server to a recovery server maybe, but I believe it won't make much sense to add that callback on the Simple Action Server.

Copy link
Member

@SteveMacenski SteveMacenski Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these functions make things more confusing. This should be relatively simple. You have a behavior tree spinning and when you call a cancel or preempt at least 1 BT node is active (probably multiple, the controller which is always running basically and potentially then a recovery or planner or something). My understanding of this at the moment is that if you hit cancel, all of the BT action Node derived classes (spin, wait, backup, planner, controller) will stop (via a async_cancel_goal in halt).

So when the parent navigation attempt requests a preemption, what its asking for is to stop what you're doing and start proceeding to the next goal smoothly. That means the controller should keep going for the short period of time until the new plan is handed to it. that means the planner, if its doing something, should immediately stop and be given a new goal to compute (that could be done via a preemption, also done for you, no changes required here) and the recoveries should never be immediately recalled until at that point a planner or controller failure has occurred. That is the only behavior change required.

Therefore, I think the on_preemption() for the recoveries BT nodes should do an async cancel and the planner should be given a new goal with the new goal flag. So I think the function you've made, if you would like to keep that to implement preemption for servers, should have an implementation of cancels for the recoveries, preempt with the updated goal for the planners, and nothing (?) for the controllers. In that way, you should probably make a BT-specific recoveries node that derives from the BTActionNode that the recovery actions implement, so that there's not code duplication with 3+ identical on_preemption()s.

You shouldn't touch halt that is correct. You also shouldn't need to change anything in the simple action server. You just need to check for preemption as well as cancel and then stop processing (which seems to be laid out for you as well here)

The major issue with what you proposed is that its a little hacky and could fail if some message gets delayed (since the new rotation angle is 0 when sent but then maybe moved 10 degrees with a tolerance of 4 degrees, so it'll start spinning back or something silly). Its also not really a general solution to just terminating all recoveries when a new navigation request comes in, which I think is expected behavior.

Copy link
Member

@SteveMacenski SteveMacenski Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps if there’s some “one-shot” solution in BT.CPP we could add in a new virtual method like halt for our needs. I don’t recall from our discussions if that would actually just solve everything. I think Davide would be open to it if its named well & has some other use for other people.

I think we’ve gotten really in the weeds here- I think the key is figuring out what we want each of the nodes to do (action ones and others). The control flow and non-action make this tricky. Maybe there’s a way to simplify that. I forget now what the current behavior even is for preemption and what’s specifically wrong with it 😅

Copy link
Member

@SteveMacenski SteveMacenski Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct to say that the only thing right now that’s not acting properly on preemption is the recoveries? Or also planner? Mhm... ok so all action nodes... and even the throttle decorator since it should let it replan immediately...

This preemption thing is a pain.

What would happen if we sent a new command to a BT that was currently running without halting between? Would that have a similar behavior we’re looking for?

Copy link
Contributor Author

@gimait gimait Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue kind of addresses our problem: BehaviorTree/BehaviorTree.CPP#106

There it's mentioned that the CoroActionNode is useful for preemptions, however, I do not see any methods in other nodes able to act differently with this kind of node, meaning that for us it's the same.

We mainly want to be able to stop all recoveries when a new goal is sent. For the other nodes, I believe it is okay the way it is (planner and controller work as expected atm if the proper behaviour tree is built). This means that we have a tree of the shape:

We have a root node (RecoveryNode) that switches between two behaviours: SequenceStar and a main action.
We want to be able to cancel actions a, b and c no matter which of them is running and make the parent node return success, so the main action can start running.

To recap on what we talked so far:

  • We thought of using an ActionNode-specific method that we can call on preemption of the global goal, but there is a problem with the tree: Even if we stop the running action node, it's parent node is not stopped, it just jumps to the next action.
  • We thought of resetting the whole tree by somehow halting it without stopping the controller, but that implementation would fail if more than one controller is in use.
  • We discussed modifying the recovery control node to halt its second child on preemption. I.e., halt the recovery behaviour branch.
  • We also thought of using the blackboard to trigger the necessary actions, but that is discarded since there is no way of knowing how many nodes are running at the same time.
  • We cannot force a success with a decorator to the SequenceStar output since we want to be able to percive an error on the recovery actions (this could be useful if possible, as we could simply return failure from action nodes a, b or c and that would stop the recovery).

Maybe @facontidavide knows how we can fix this on a way that makes sense.

Copy link
Member

@SteveMacenski SteveMacenski Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we had a halt counterpart in the tree node, couldn't we override it for control flow nodes and mess with the internal state to return success / failure? We could set a flag and next time it's ticked, we just return some value set by the flag. Its not ideal, but it could work (?) though it doesn't set the state immediately, its only after its reticked in the new preempted goal request.


Please look over my comment on that ticket and make sure it makes sense to you / is correct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand how that would work, but would we need to override the control nodes that we want to use then?

Please look over my comment on that ticket and make sure it makes sense to you / is correct

Thanks for commenting, I agree with your description, specially what you mentioned about adding the communication child to parent (I believe that would be something useful for the behavior tree itself and not just us). Let's continue this discussion in that thread, I'm sure Davide will know what can be done.

}

auto goal_status = goal_handle_->get_status();
if (goal_updated_ && (goal_status == action_msgs::msg::GoalStatus::STATUS_EXECUTING ||
goal_status == action_msgs::msg::GoalStatus::STATUS_ACCEPTED))
Expand Down
10 changes: 10 additions & 0 deletions nav2_behavior_tree/plugins/action/spin_action.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -36,12 +36,22 @@ class SpinAction : public BtActionNode<nav2_msgs::action::Spin>
const std::string & action_name,
const BT::NodeConfiguration & conf)
: BtActionNode<nav2_msgs::action::Spin>(xml_tag_name, action_name, conf)
{
}

void on_tick() override
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are static parameters in the xml, do we need to run these in on_tick?

Copy link
Contributor Author

@gimait gimait Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need them there if we want to send a new goal as preemption behaviour (the null goal I made), to reset the goal whenever we enter the node.
However, if I change this to simply cancel the recovery, it can stay in the constructor.

{
double dist;
getInput("spin_dist", dist);
goal_.target_yaw = dist;
}

void on_goal_preempted() override
{
goal_.target_yaw = 0;
goal_updated_ = true;
}

static BT::PortsList providedPorts()
{
return providedBasicPorts(
Expand Down
10 changes: 10 additions & 0 deletions nav2_behavior_tree/plugins/action/wait_action.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,10 @@ class WaitAction : public BtActionNode<nav2_msgs::action::Wait>
const std::string & action_name,
const BT::NodeConfiguration & conf)
: BtActionNode<nav2_msgs::action::Wait>(xml_tag_name, action_name, conf)
{
}

void on_tick() override
{
int duration;
getInput("wait_duration", duration);
Expand All @@ -45,6 +49,12 @@ class WaitAction : public BtActionNode<nav2_msgs::action::Wait>

goal_.time.sec = duration;
}
void on_goal_preempted() override
{
// Stop the wait if a new goal has been preempted.
goal_.time.sec = 0;
goal_updated_ = true;
}

// Any BT node that accepts parameters must provide a requiredNodeParameters method
static BT::PortsList providedPorts()
Expand Down
2 changes: 2 additions & 0 deletions nav2_bt_navigator/src/bt_navigator.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,7 @@ BtNavigator::on_configure(const rclcpp_lifecycle::State & /*state*/)
blackboard_->set<std::chrono::milliseconds>("server_timeout", std::chrono::milliseconds(10)); // NOLINT
blackboard_->set<bool>("path_updated", false); // NOLINT
blackboard_->set<bool>("initial_pose_received", false); // NOLINT
blackboard_->set<bool>("goal_preempted", false); // NOLINT

// Get the BT filename to use from the node parameter
std::string bt_xml_filename;
Expand Down Expand Up @@ -221,6 +222,7 @@ BtNavigator::navigateToPose()
RCLCPP_INFO(get_logger(), "Received goal preemption request");
action_server_->accept_pending_goal();
initializeGoalPose();
blackboard_->set("goal_preempted", true);
}
topic_logger.flush();
};
Expand Down
12 changes: 7 additions & 5 deletions nav2_recoveries/include/nav2_recoveries/recovery.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -173,12 +173,14 @@ class Recovery : public nav2_core::Recovery
// TODO(orduno) #868 Enable preempting a Recovery on-the-fly without stopping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this TODO

if (action_server_->is_preempt_requested()) {
RCLCPP_ERROR(
node_->get_logger(), "Received a preemption request for %s,"
" however feature is currently not implemented. Aborting and stopping.",
node_->get_logger(), "Received a preemption request for %s,",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be an info now (or even a debug after you get it working)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed that one. Do you think is better an info or debug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets do debug

recovery_name_.c_str());
stopRobot();
action_server_->terminate_current();
return;
action_server_->accept_pending_goal();
SteveMacenski marked this conversation as resolved.
Show resolved Hide resolved
if (onRun(action_server_->get_current_goal()) != Status::SUCCEEDED) {
RCLCPP_INFO(node_->get_logger(), "Preemption failed for %s", recovery_name_.c_str());
action_server_->terminate_current();
return;
}
}

switch (onCycleUpdate()) {
Expand Down