fix race conditions when updating PlanningScene #350

rhaschke · 2016-11-13T22:57:00Z

This a new attempt to resolve #232. The core new idea is to wait until the robot has stopped moving before claiming the execution of a trajectory has finished.

…ue_) Due to an upstream bug, it's not possible to start multiple AsyncSpinners from different threads. Filed PR: ros/ros_comm#867 The spinner is now only needed to serve our own callback_queue_ for scene updates, which is only required for syncSceneUpdates() that syncs all kind of scene updates, not only the robot state.

... because we might wait up to 1s for a robot state update

This reverts commit e3ef9a6.

davetcoleman · 2016-11-14T00:20:02Z

If I'm not mistaken, this PR is the same as #232 except you addressed your last comment in that issue, correct?

davetcoleman

Thanks - seems like a great addition, though I haven't tested it in runtime

davetcoleman · 2016-11-14T00:16:34Z

moveit_ros/planning/planning_scene_monitor/src/planning_scene_monitor.cpp

+  ros::WallTime start = ros::WallTime::now();
+  ros::WallDuration timeout(wait_time);
+
+  ROS_DEBUG_NAMED("planning_scene_monitor", "sync robot state to: %.3fs", fmod(t.toSec(), 10.));


"planning_scene_monitor" can just be LOGNAME for consistency

davetcoleman · 2016-11-14T00:21:03Z

moveit_ros/planning/trajectory_execution_manager/src/trajectory_execution_manager.cpp

@@ -1430,6 +1437,61 @@ bool TrajectoryExecutionManager::executePart(std::size_t part_index)
  }
 }

+bool TrajectoryExecutionManager::waitForRobotToStop(const TrajectoryExecutionContext &context, double wait_time)
+{
+  if (allowed_start_tolerance_ == 0)  // skip validation on this magic number


No, this is not a comparison against a small value.
Rather, we agreed that a value of zero would mean to skip validation completely.

allowed_start_tolerance_ is a double, which as I understand it you should never compare directly to 0

That's true if the result is an outcome of some computation. However, in this case, allowed_start_tolerance_ should be set from parameter server to be exactly zero.

davetcoleman · 2016-11-14T00:22:16Z

moveit_ros/planning/trajectory_execution_manager/src/trajectory_execution_manager.cpp

+  prev_state->enforceBounds();
+
+  // assume robot stopped when 3 consecutive checks yield the same robot state
+  unsigned int no_motion_count = 0;  // count iterations with no motion


this class (and most of moveit) uses std::size_t rather than unsigned int

std::size_t is required for array access. Here we count up to 3 only. No need for a 64bit integer ;-)

davetcoleman · 2016-11-14T00:22:40Z

moveit_ros/planning/trajectory_execution_manager/src/trajectory_execution_manager.cpp

+
+  // assume robot stopped when 3 consecutive checks yield the same robot state
+  unsigned int no_motion_count = 0;  // count iterations with no motion
+  while (time_remaining > 0. && no_motion_count < 3)


> epsilon

I don't see a reason to compare against epsilon here. time_remaining is not affected by precision issues. The resolution is defined by the clock only.

davetcoleman · 2016-11-14T00:23:43Z

moveit_ros/planning/trajectory_execution_manager/src/trajectory_execution_manager.cpp

+
+    // check for motion in effected joints of execution context
+    bool moved = false;
+    for (const auto &trajectory : context.trajectory_parts_)


do you plan on picking this to previous ROS versions? if so we should avoid auto

I think we should exploit C++ features in new code as much as possible. If we decide for backporting, this needs to be adapted indeed. But I mentioned that in the initial comment.

davetcoleman · 2016-11-14T00:25:19Z

moveit_ros/planning_interface/move_group_interface/src/move_group_interface.cpp

-                                             "but the full state is not known.");
+    if (!current_state_monitor_->waitForCurrentState(ros::Time::now(), wait_seconds))
+    {
+      ROS_ERROR("Failed to fetch current robot state");


davetcoleman · 2016-11-14T00:25:31Z

moveit_ros/planning_interface/test/CMakeLists.txt

@@ -5,4 +5,5 @@ if (CATKIN_ENABLE_TESTING)
  target_link_libraries(test_cleanup moveit_move_group_interface)

  add_rostest(python_move_group.test)
+  add_rostest(robot_state_update.test)


awesome, thanks!

davetcoleman · 2016-11-14T00:27:49Z

moveit_ros/planning_interface/test/robot_state_update.py

+    rospy.init_node(NODENAME)
+    rostest.rosrun(PKGNAME, NODENAME, RobotStateUpdateTest)
+
+    # suppress cleanup segfault


can you expand this comment to explain when this can be removed? this isn't necessary in kinetic right?

davetcoleman · 2016-11-14T00:30:01Z

moveit_ros/visualization/motion_planning_rviz_plugin/src/motion_planning_frame_planning.cpp

  robot_state::RobotState start = *planning_display_->getQueryStartState();
  updateQueryStateHelper(start, ui_->start_state_selection->currentText().toStdString());
  planning_display_->setQueryStartState(start);
 }

 void MotionPlanningFrame::useGoalStateButtonClicked()
 {
+  planning_display_->addBackgroundJob(boost::bind(&MotionPlanningFrame::useGoalStateButtonExec, this), "update goal "


Could you add a comment on why a background job is used for this click?

davetcoleman · 2016-11-14T00:30:04Z

moveit_ros/visualization/motion_planning_rviz_plugin/src/motion_planning_frame_planning.cpp

 }

 void MotionPlanningFrame::useStartStateButtonClicked()
 {
+  planning_display_->addBackgroundJob(boost::bind(&MotionPlanningFrame::useStartStateButtonExec, this), "update start "


Could you add a comment on why a background job is used for this click?

v4hn · 2016-11-14T11:32:27Z

format checking failed.

rhaschke · 2016-11-14T21:28:21Z

Damn formatting. clang essentially reformats long code line every time again...

davetcoleman · 2016-11-14T23:13:28Z

@v4hn can someone in your lab test this at runtime?

BrettHemes · 2016-11-15T15:47:28Z

@davetcoleman what exactly do you want tested?

I am in Indigo and now have this issue after recently reinstalling from source. I can try to pick this into indigo-devel and test on a real robot if it will help.

davetcoleman · 2016-11-15T17:03:07Z

@BrettHemes I'm not exactly sure - that updating the planning scene with collision objects still works and waiting until the robot trajectory stops executing before doing the next command? @rhaschke could better answer this question

Note that a few small changes are required for indigo

rhaschke · 2016-11-15T18:21:00Z

@BrettHemes It would be cool, if you could repeat the unit test (moveit_ros/planning_interface/test/robot_state_update.py) on a real robot setup. This test performs some random walk in joint space and should ensure, that the robot states reported after execution of a planned trajectory match the actual robot state. Additionally, as @davetcoleman suggested, it would be cool to test that collision objects obtained from depth sensors are correctly updated in the collision scene.

davetcoleman · 2016-12-15T19:38:23Z

@BrettHemes have you gotten a chance to test this on real hardware?

davetcoleman · 2017-01-05T07:28:46Z

I have run the test script locally without issue. Given the elapsed time of this PR I think we should merge this.

…ate(1.0) in TrajectoryExecutionManager::validate, cherry pick part of moveit#350

…ate(1.0) in TrajectoryExecutionManager::validate, cherry pick part of #350

cherry pick #350

mmoerdijk · 2018-04-03T14:15:31Z

I'm not sure if this issue is completely resolved. I'm still getting "Invalid Trajectory: start point deviates from current robot state" message. When using the moveitcommander and the group.go() command without delays to move the robot. The speed of the robot is quite high, this seems to increase the frequency of the problem.

I'm on kinetic and build move it from source using the 245c7b5 commit.

I spent quite some time at debugging and I can say the following things:

Joint state updates on the /joint_state topic are correct / no delay.
The waitForCurrentRobotState method actually waits for the next joint message
current_state_time_ and joint values received correctly in the jointStateCallback
Planning starts after waitForCurrentRobotState but still somehow uses old values

I found out that if I set update = true just before line 352 I can plan and execute in rapid succession.

So this probably has something to do with the callbacks that are called on line 405 of the current_state_monitor. But I don't understand why @rhaschke you seem to be the expert on this can you explain why always executing the callbacks makes a difference...?

rhaschke · 2018-04-03T15:14:03Z

@mmoerdijk Can you please verify that the unittest passes without issues.
Can you also list the callbacks called in your situation?
I don't think, semantically the callbacks should make a difference. However, they take some extra time and thus may affect the timing.
It would be perfect if you could provide a new or modify the existing unit test to reproduce your problem.

mmoerdijk · 2018-04-04T13:15:09Z

Just to be sure I removed my old moveit repo and freshly cloned the latest kinetic branch.

The unittest you mention passed without errors using catkin run_tests --no-deps moveit_ros_planning_interface. But running the unit test with my setup (directly running the python file) it still generates the same errors, with the modification it works without problems.

Obviously my setup has something to do with it. I'm using moveit in combination with the joint_trajectory_controller from ros_control and a custom hardware interface. The hardware interface is currently just mirroring the values from the joint_trajectory_controller. And looking at the joint_states topic doing it correctly. And the weird thing is, I added some debug code in jointStateCallback to output the joint values during the call to waitForCurrentState and there it outputs the correct joint values before the planning starts. But still the plan starts at older joint values.

I will try to make a minimal example setup but i don't have much time until the beginning of may.

As for the callbacks, there is only one callback and that is the call to planning_scene_monitor::PlanningSceneMonitor::onStateUpdate

rhaschke · 2018-04-04T16:31:53Z

planning_scene_monitor::PlanningSceneMonitor::onStateUpdate

This function actually performs the scene update, i.e. applies changes to the current state within the planning scene. Looks like that not all scene updates are triggered as necessary.

@mmoerdijk, could you please check whether this fix solves your issue.

mmoerdijk · 2018-04-05T07:37:49Z

@rhaschke I tried out your fix and now and it seems to solve my problem. Thanks

rhaschke added 7 commits November 11, 2016 12:35

PSM::waitForCurrentRobotState() + PSM::syncSceneUpdates()

512ded8

renamed wall_last_state_update_ to last_robot_state_update_wall_time_

24b8aa2

rviz: execute state update in background

4f2dd1b

... because we might wait up to 1s for a robot state update

add robot_state update test

8bdfcad

waitForRobotToStop()

1c90c23

Revert "wait a second before updating "current" in RViz (#291)"

edc8da2

This reverts commit e3ef9a6.

davetcoleman requested changes Nov 14, 2016

View reviewed changes

addressed Dave's comments

fc1e860

davetcoleman approved these changes Nov 14, 2016

View reviewed changes

davetcoleman merged commit a77b3da into moveit:kinetic-devel Jan 5, 2017

rhaschke deleted the fix-442 branch January 11, 2017 17:06

k-okada added a commit to k-okada/moveit that referenced this pull request Dec 10, 2017

use waitForCurrentState(ros::Time::now(), instead of waitForCurrentSt…

97d0414

…ate(1.0) in TrajectoryExecutionManager::validate, cherry pick part of moveit#350

k-okada mentioned this pull request Dec 10, 2017

use waitForCurrentState(ros::Time::now(), instead of waitForCurrentSt… #717

Merged

5 tasks

rhaschke mentioned this pull request Dec 12, 2017

Full back port of #350 and #298 #721

Closed

5 tasks

davetcoleman pushed a commit that referenced this pull request Jan 15, 2018

use waitForCurrentState(ros::Time::now(), instead of waitForCurrentSt…

66431c7

…ate(1.0) in TrajectoryExecutionManager::validate, cherry pick part of #350

rkeatin3 mentioned this pull request Jan 23, 2018

Invalid Trajectory: start point deviates from current robot state #749

Closed

rhaschke added a commit to ubi-agni/moveit that referenced this pull request Jan 25, 2018

cherry-pick unittest from moveit#350

355391c

rhaschke mentioned this pull request Jan 25, 2018

cherry pick #350 #752

Merged

v4hn added a commit that referenced this pull request Jan 26, 2018

Merge pull request #752 from ubi-agni/pr-cherry-pick-#350

d8150fc

cherry pick #350

rhaschke mentioned this pull request Apr 5, 2018

fix race condition #824

Merged

srsidd mentioned this pull request Apr 13, 2018

allowed_start_tolerance_ parameter is used incorrectly in waitForRobotToStop method in the trajectory_execution_manager #840

Closed

rhaschke mentioned this pull request Apr 20, 2018

cherry-pick #824 from kinetic-devel #848

Merged

avoelz mentioned this pull request Apr 27, 2018

'Failed to fetch current robot state' when using the 'plan_kinematic_path' service #868

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix race conditions when updating PlanningScene #350

fix race conditions when updating PlanningScene #350

rhaschke commented Nov 13, 2016

davetcoleman commented Nov 14, 2016

davetcoleman left a comment

davetcoleman Nov 14, 2016

davetcoleman Nov 14, 2016

rhaschke Nov 14, 2016

davetcoleman Nov 14, 2016

rhaschke Nov 14, 2016

davetcoleman Nov 14, 2016

rhaschke Nov 14, 2016

davetcoleman Nov 14, 2016

rhaschke Nov 14, 2016

davetcoleman Nov 14, 2016

rhaschke Nov 14, 2016

davetcoleman Nov 14, 2016

davetcoleman Nov 14, 2016

davetcoleman Nov 14, 2016

davetcoleman Nov 14, 2016

davetcoleman Nov 14, 2016

v4hn commented Nov 14, 2016

rhaschke commented Nov 14, 2016

davetcoleman commented Nov 14, 2016

BrettHemes commented Nov 15, 2016

davetcoleman commented Nov 15, 2016

rhaschke commented Nov 15, 2016

davetcoleman commented Dec 15, 2016

davetcoleman commented Jan 5, 2017

mmoerdijk commented Apr 3, 2018

rhaschke commented Apr 3, 2018

mmoerdijk commented Apr 4, 2018

rhaschke commented Apr 4, 2018

mmoerdijk commented Apr 5, 2018

fix race conditions when updating PlanningScene #350

fix race conditions when updating PlanningScene #350

Conversation

rhaschke commented Nov 13, 2016

davetcoleman commented Nov 14, 2016

davetcoleman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

v4hn commented Nov 14, 2016

rhaschke commented Nov 14, 2016

davetcoleman commented Nov 14, 2016

BrettHemes commented Nov 15, 2016

davetcoleman commented Nov 15, 2016

rhaschke commented Nov 15, 2016

davetcoleman commented Dec 15, 2016

davetcoleman commented Jan 5, 2017

mmoerdijk commented Apr 3, 2018

rhaschke commented Apr 3, 2018

mmoerdijk commented Apr 4, 2018

rhaschke commented Apr 4, 2018

mmoerdijk commented Apr 5, 2018