Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roscpp: Multiple AsyncSpinner behavior is confusing and precludes introspection #277

Closed
jbohren opened this issue Aug 21, 2013 · 27 comments
Closed

Comments

@jbohren
Copy link
Member

jbohren commented Aug 21, 2013

Currently, if you try to create two AsyncSpinner instances this error is emitted:

ROS_ERROR("MultiThreadeSpinner: You've attempted to call ros::spin "
          "from multiple threads... "
          "but this spinner is already multithreaded.");

https://github.com/ros/ros_comm/blob/groovy-devel/clients/roscpp/src/libros/spinner.cpp#L75

This is normally not a problem, but if two different systems with ROS plugins are trying to create spinners, there's no way to determine whether spinners currently exist nor is there a way to determine the number of spinning threads.

First, it seems like this should be a warning instead of an error. But more importantly, it seems like the AsyncSpinner class can't determine if it wants to be a handle to a global service or not. You can create any number of AsyncSpinner instances, but they all want to use a single static mutex so only the first one is actually going to function properly.

I'd expect that if I create an AsyncSpinner with n threads in one place, and create another AsyncSpinner with m threads in another place, that there should subsequently be n+m threads processing ROS callbacks. This shouldn't be hard to implement without changing the API. Would such a change be welcome?

@jbohren
Copy link
Member Author

jbohren commented Aug 21, 2013

@davetcoleman, @meyerj, This has direct consequences to writing plugins like gazebo_ros and rtt_rosnode and running them at the same time in the same process.

@jbohren
Copy link
Member Author

jbohren commented Aug 21, 2013

A slight modification, instead of n+m always, a user should be able to specify the minimum number of threads, or alternatively add threads, knowing that there might already be several.

@adolfo-rt
Copy link

I've also been recently bitten by this, in a usecase similar to what @jbohren describes. Since the ros wiki entry and the dev documentation don't mention the limitations of the current implementation, it took some digging up (@po1) to figure out what we were doing wrong.

I'd expect that if I create an AsyncSpinner with n threads in one place, and create another AsyncSpinner with m threads in another place, that there should subsequently be n+m threads processing ROS callbacks. This shouldn't be hard to implement without changing the API. Would such a change be welcome?

Yes!, this would also allow to start/stop AsyncSpinners from different threads, correct?.

@po1
Copy link

po1 commented Nov 26, 2013

Yes!, this would also allow to start/stop AsyncSpinners from different threads, correct?.

A bit of background on what Adolfo is asking here: we were thinking of having multiple AsyncSpinners because we want to service multiple, independent CallbackQueues. It makes sense to isolate some functionalities into a separate CallbackQueue, and sometime we only want to service this CallbackQueue when a given set of conditions are met. We tried to start() and stop() an AsyncSpinner depending on those conditions, but could not do it from a different thread than the one that started the main AsyncSpinner. To my knowledge, there is no out-of-the box working solution to implement this behavior.

@meyerj
Copy link
Contributor

meyerj commented Nov 26, 2013

I agree that this problem should be solved in roscpp's AsyncSpinner directly, but there is a very simple workaround using the boost thread library directly that many of the gazebo plugins for ROS implement for years (for a reason I still do not understand):

#include <ros/ros.h>
#include <ros/callback_queue.h>
#include <boost/thread.hpp>

struct Foo {
public:
  Foo() {
    // advertise, subscribe or whatever using the my_queue_ CallbackQueue
    queue_thread_ = boost::thread(boost::bind(&Foo::QueueThread, this));
  }

  ~Foo() {
    rosnode_.shutdown();
    queue_thread_.join();
  }

  void QueueThread()
  {
    static const double timeout = 0.01;
    while (this->rosnode_.ok())
    {
      this->my_queue_.callAvailable(ros::WallDuration(timeout));
    }
  }

private:
  ros::NodeHandle rosnode_;
  ros::CallbackQueue my_queue_;
  boost::thread queue_thread_;
}

(e.g. in https://github.com/ros-simulation/gazebo_ros_pkgs/blob/hydro-devel/gazebo_plugins/src/gazebo_ros_imu.cpp).

Of course this example can simply be extended to service the global queue or by adding start() and stop() methods and a boolean flag to control the queue thread independently from class construction/deconstruction.

@po1
Copy link

po1 commented Nov 26, 2013

I don't know if this will be of interest to any of you, but I put together two extensions to the NodeHandle class. Both of them are meant to be 100% compatible with regular NodeHandles.

  • The first one, SpinningNodeHandle, ensures that a spinner exists. This is meant to be used in contexts where different independent components need ROS functionality, and will avoid the unnecessary creation of many spinner threads and CallbackQueues.
  • The second one, SpinnableNodeHandle, is meant to be used when you want a separate thread, with a separate callbackqueue, that you can also start and stop on demand. In the gist, the copy constructor needs some work but you will get the intent.

I would love receiving any feedback on this proposal.
Have a look: https://gist.github.com/po1/7661181

@jbohren
Copy link
Member Author

jbohren commented Mar 7, 2014

@dirk-thomas What do you think of @po1's rough proposal?

@jbohren
Copy link
Member Author

jbohren commented Mar 7, 2014

@meyerj Do you think we should use the custom-callback-approach in rtt_ros?

@dirk-thomas dirk-thomas added this to the untargeted milestone Mar 7, 2014
@dirk-thomas
Copy link
Member

If the additional spinners and node handles solve a specific need for your use cases please feel free to use them.

But I would hesitate to add them to roscpp without further reasons. I would rather like to see proposals how to modify the existing spinners to fit these mentioned needs instead of writing/adding another one.

@po1
Copy link

po1 commented Mar 7, 2014

I agree with Dirk, it will be better to fix the problem at its source (in a conservative way) rather than extending the API.

I don't have all the items and subtleties of this case in mind anymore, but the main problem seemed to be that many independent components sharing the same binary might want their own async spinner. The problem is that so far, ros::AsyncSpinner keeps a single recursive mutex here, which means that only one thread is allowed to create (or act upon) an AsyncSpinner.

This results in all independent components of a whole (e.g. gazebo plugins) to create their own thread and implement a spinner in it. That is to say, re-implement an AsyncSpinner. This leads to a multitude of threads that only service one callback queue. The purpose of the first part of the gist that I posted 3 months ago was to offer a general-purpose async spinner that every independent component of a whole could use for lightweight, non-critical parts of their API. The second part pretty much re-implements an AsyncSpinner without the global recursive mutex.

Basically, if we could replace the existing ros::AsyncSpinner to not use this global mutex and do something more akin to what I proposed, that would already solve @jbohren's and @adolfo-rt's problem.

As for limiting the amount of threads and serializing more work in a single thread, I don't see any easy fix that would not imply adding a new class somewhere, but I haven't looked very hard.

@jbohren
Copy link
Member Author

jbohren commented Mar 7, 2014

But I would hesitate to add them to roscpp without further reasons. I would rather like to see proposals how to modify the existing spinners to fit these mentioned needs instead of writing/adding another one.

Yeah that makes sense. One thing that would be useful (and not change the current behavior), at least, is just a boolean-valued function which tells you if an AsyncSpinner is already spinning. I could put together a simple PR for that.

@dirk-thomas
Copy link
Member

@jbohren A pull request for that would be very welcome. Thanks.

@jbohren
Copy link
Member Author

jbohren commented Mar 11, 2014

@dirk-thomas I'll put together the PR with the introspection function.

Meanwhile, I was looking at this again, and it's not clear to me why you shouldn't be able to call AsyncSpinner::start() on multiple AsyncSpinners, especially if you use a custom callback queue. It seems like the global spinlock lock should really just be associated with the default callback queue, and if you use a custom callback queue, you should be able to make your own AsyncSpinner.

@paulbovbel
Copy link
Contributor

Just got bit with the issue of instantiating several asyncspinners to service separate callbackqueues, in the context of a gazebo plugin.

@dirk-thomas, any comment on @jbohren's "Meanwhile.."?

@dirk-thomas
Copy link
Member

@paulbovbel Afaik the async spinner is designed in a way that prevents two instances to work in parallel since they both use the same global state. As the comment in the PR #377 clarifies if multiple are run they don't do what you would expect them to - all but one are basically not handling any callbacks.

@jbohren With #377 being implemented quite some time ago can this ticket be closed? Or can you please clarify what this ticket is about (beside the implemented introspection functionality)?

@ehsan-asadi
Copy link

Hi All, sorry if my question is not directly related to this topic.
I have a ROS node that works fine for several minutes and suddenly face a segmentation fault. I am using ros::AsyncSpinner and looking through the gdb descriptions, I am not sure what cause the problem. I would appreciate any help on this. I can provide more information if needed.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbffff700 (LWP 19409)]
0x00007fffec2a02a0 in ros::Subscription::pubUpdate(std::vector<std::string, std::allocatorstd::string > const&) () from /opt/ros/indigo/lib/libroscpp.so

@dirk-thomas
Copy link
Member

@ehsan-asadi If you questions is not directly related to this topic then please ask it on answers.ros.org.

@dirk-thomas dirk-thomas removed this from the untargeted milestone Jul 4, 2016
@dirk-thomas
Copy link
Member

Since there was no feedback in my previous questions in January I assume this ticket has been resolved by #377. If not please comment here with more information.

@rhaschke
Copy link
Contributor

rhaschke commented Aug 4, 2016

Just got bit with the issue of instantiating several AsyncSpinners to service separate CallbackQueues too.
As @jbohren pointed out, there no obvious reason, why it shouldn't be possible to run multiple AsyncSpinners, especially if they use custom callback queues. Even if they operate on the same or the global queue in parallel, I cannot see an issue, because access to items in the callback queue is protected by a mutex.
@dirk-thomas Could you please clarify, which "common global state they use", as you pointed out in a previous comment?

@rhaschke
Copy link
Contributor

rhaschke commented Aug 4, 2016

I've looked up the history of spinner.cpp. The mutex was introduced in 769af28 as a simple mutex and replaced some hours later with a recursive one in 9c5228d.

@dirk-thomas
Copy link
Member

@rhaschke All spinners share the same mutex:

boost::recursive_mutex spinmutex;

@v4hn
Copy link
Contributor

v4hn commented Aug 8, 2016

@dirk-thomas Yes, that's the problem. The question was why spinmutex is there.
Whenever a spinner runs CallbackQueue::callAvailable on its callback queue, the queue is already locked by a mutex.
So as far as I can see spinmutex does not serve a purpose but breaks running AsyncSpinners from different threads.

This issue is not resolved, because it is still (for no reason afaics) not possible to have multiple AsyncSpinners for different CallbackQueues running in parallel.

@dirk-thomas
Copy link
Member

@v4hn This ticket is about "Multiple AsyncSpinner behavior is confusing and precludes introspection" and as a result #377 has been implemented to allow checking if another sync spinner is already running.

Please create a separate ticket if you want to be able to run multiple async spinners in parallel. Please consider to also provide a pull request for that new feature since I will very likely not have time to work on feature development. Thank you.

@rhaschke
Copy link
Contributor

rhaschke commented Aug 9, 2016

@dirk-thomas Could you give a hint, why the mutexes were introduced at all? As pointed out above, they were introduced in 769af28, referring to "very old bug # 1811". However, # 1811 obviously refers to some old bug tracking system. Could you please point me to that?
I will be fine to prepare a PR to allow multiple AsyncSpinners in parallel. However, to do so, I want to understand why this mutex was introduced. I cannot see any reason for it yet.

@dirk-thomas
Copy link
Member

The old issue tracker from Willow Garage times is no more. I don't think that information is available anywhere.

@tfoote
Copy link
Member

tfoote commented Aug 9, 2016

It's been a long time and I wasn't directly involved with the issue. But I believe that the issue with multiple spinners on the same callback queue was that they did not interleave cleanly and you would get unexpected ordering of the callbacks. Possibly even out of order. The mutex to prevent it was because supporting the interaction between spinners was going to need a notably refactor to work as expected.

@rhaschke
Copy link
Contributor

If the issue was only about accessing the callback queue in order, the solution of using a global mutex would be completely wrong. Rather, a mutex local to the callback queue would be required - which is present too (maybe it was introduced later).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants