-
Notifications
You must be signed in to change notification settings - Fork 955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Support timed locking in PlanningSceneMonitor #2794
base: master
Are you sure you want to change the base?
Draft: Support timed locking in PlanningSceneMonitor #2794
Conversation
5386bf6
to
76f8891
Compare
...anning/planning_scene_monitor/include/moveit/planning_scene_monitor/planning_scene_monitor.h
Outdated
Show resolved
Hide resolved
@@ -1062,6 +1075,55 @@ bool PlanningSceneMonitor::getShapeTransformCache(const std::string& target_fram | |||
return true; | |||
} | |||
|
|||
UniqueTimedLock PlanningSceneMonitor::acquireUniqueLock(const std::string& source, std::chrono::milliseconds timeout) | |||
{ | |||
// TODO(henningkayser): lock/unlock octomap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The octomap could be locked here and then unlocked with the destructor. However, this would increase the time the octree is being locked since some callbacks haven't done this before. Any alternative suggestions?
This replaces the shared_mutex of the PSM with a shared_timed_mutex that allows specifying a timeout for attempting to lock the mutex for read and write using the LockedPlanningSceneRO and LockedPlanningSceneRW classes. In case of a failed attempt, a TryLockTimeoutException is thrown so that the calling thread is unblocked in case of concurrency issues. Additionally, a new scoped lock wrapper class TimedLock allows tracking lock sources (=caller names), start time, waiting time and the duration that the lock persists. An optional DiagnosticStatus publisher can be enabled for debugging these lock events at runtime. The scope-less locking functions (lockSceneRead(), etc...) are deprecated since they are error-prone and don't work well with the diagnostics feature.
76f8891
to
560188c
Compare
@@ -717,5 +813,8 @@ class LockedPlanningSceneRW : public LockedPlanningSceneRO | |||
{ | |||
return planning_scene_monitor_->getPlanningScene(); | |||
} | |||
|
|||
private: | |||
std::shared_ptr<UniqueTimedLock> ulock_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open for debate: The shared pointers here copy the old behavior. I'm thinking of not allowing this since it doesn't make a lot of sense to share write access between different entities. I'd suggest making LockedPlanningSceneRW
non-copyable. LockedPlanningSceneRO
should probably also not be using a shared pointer, acquiring another shared lock seems way more reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who are we sharing the lock with? Since the class constructs the lock itself and doesn't seem to hand it out at any point, it doesn't have be a shared pointer or even unique pointer in my opinion. It's just normal class member, right?
Just a high-level comment. I wonder what the overhead of a |
Good point! I'd be surprised if there was a measurable difference between |
Do you have an idea how to benchmark this? If yes, we could integrate it in #2717 |
@captain-yoshi and @felixvd here's a script that is good for benchmarking repeated, high-frequency reads of the planning scene. (#2698 (comment)) To properly test this PR, some writes to the planning scene should also be included (to mimic normal operation). For example, updating joint angles at 100-200 Hz or so. |
I'm also curious how |
From your descriptions it looks like there are some things left to do to cleanup this patch. I only skimmed through the changes for now. My main concern is backward compatibility. Is this patch supposed to be API compatible except for deprecating the non-RAII locking API (to which I definitely agree)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of having more insight into what's going on is really good. However, the diagnostics come with extra cost and I'm not sure having them included should be the default in this case.
class TimedLock | ||
{ | ||
public: | ||
TimedLock(const std::string& source, std::shared_timed_mutex& mtx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this implementation has much higher cost than just the switch from std::shared_mutex
to std::shared_timed_mutex
(the cost of which remains to be shown). My problem is how large this object is and how much data it stores even if one doesn't want to use the diagnostic features.
The following are now copied:
- String (if long, SSO will not help us)
- std::chrono::milliseconds object
In addition, construct_time_
and lock_time_
are default constructed every time together with a std::function
. I think we should aim for making this class as cheap as the underlying STL lock if no diagnostics are requested and as cheap as possible if one needs the diagnostics. Unfortunately I don't see a way to do that without making the diagnostics flag a compile time constant and rewriting large parts of the code here. Also, using diagnostics would then require recompiling large chunks of code, but even then I think that might be the better way to do this.
} | ||
|
||
private: | ||
const std::string source_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having const
member variable(s) disables moves for the class.
Hmm, also, locks can only be moved, so now I'm confused about how this class works in the first place. Anyway, it is pretty much always a good idea to not use constant member variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else if (timeout_ == std::chrono::milliseconds::zero()) | ||
lock_.lock(); | ||
else if (!lock_.try_lock_for(timeout_)) | ||
throw TryLockTimeoutException("Timeout trying to lock PlanningSceneMonitor from '" + source_ + "'"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who will catch this exception? If nobody catches it, then the program just terminates if we exceed timeout.
* @param timeout An optional duration that locking the shared mutex is allowed to take | ||
* @throws planning_scene_monitor::TryLockTimeoutException in case of a lock timeout | ||
*/ | ||
SharedTimedLock acquireSharedLock(const std::string& source, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be even more obvious what is happening if this was called acquireReadLock
.
* @param timeout An optional duration that locking the unique mutex is allowed to take | ||
* @throws planning_scene_monitor::TryLockTimeoutException in case of a lock timeout | ||
*/ | ||
UniqueTimedLock acquireUniqueLock(const std::string& source, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be even more obvious what is happening if this was called acquireWriteLock.
@@ -717,5 +813,8 @@ class LockedPlanningSceneRW : public LockedPlanningSceneRO | |||
{ | |||
return planning_scene_monitor_->getPlanningScene(); | |||
} | |||
|
|||
private: | |||
std::shared_ptr<UniqueTimedLock> ulock_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who are we sharing the lock with? Since the class constructs the lock itself and doesn't seem to hand it out at any point, it doesn't have be a shared pointer or even unique pointer in my opinion. It's just normal class member, right?
@@ -959,6 +968,7 @@ bool PlanningSceneMonitor::waitForCurrentRobotState(const ros::Time& t, double w | |||
void PlanningSceneMonitor::lockSceneRead() | |||
{ | |||
scene_update_mutex_.lock_shared(); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened here? Is this a clang-format switching the type of whitespace or something like that?-)
e1b0ec1
to
fdc3b1c
Compare
This replaces the
shared_mutex
of the PlanningSceneMonitor with ashared_timed_mutex
that allows specifying a timeout for lock attempts. The scoped PS wrappers LockedPlanningSceneRO and LockedPlanningSceneRW now support passing a timeout duration and may fail with a TryLockTimeoutException in case of a failed attempt. This is very useful for making sure the calling thread is not blocked for too long in case of concurrency issues.Additionally, the new scoped lock wrapper TimedLock allows tracking lock sources (=caller names), start time, waiting time and the duration that the lock persists. An optional DiagnosticStatus publisher can be enabled for debugging these lock events at runtime.
The scope-less locking functions (lockSceneRead(), etc...) are deprecated since they are error-prone and don't work well with the diagnostics feature.
My motivation for this was a lengthy debugging session for getting rid of spurious deadlocks and performance issues.
Especially, in scenarios where the planning scene monitor is being used in a time-constrained loop (servo collision checking, online sampling for controller commands) it's critical to understand where time is being spent and what threads are being blocked.
I'm not convinced that the DiagnosticStatus publisher should be merged as is (moveit_profiler would do a similar job), but the timed locking feature seems really useful and safer to me.
Left TODOs: