ActionServer & ActionClient Dropped Goals #46
@Jmeyer1292, I'm running into issues with unacknowledged result messages that leave the action client in a WAITING_FOR_RESULT state. Is this similar to the issues you have observed? If it is, what queue sizes for the client/server seemed to fix it for you? |
@safrimus Frankly, I set the queue sizes to zero so that they buffer indefinitely. It's a heavyweight solution and may not be appropriate for many users, but it has helped me. I added PR #47, which lets you set these sizes through parameters, should you be interested in testing. A better solution would be to look into the action protocol itself and ensure that the server gets an acknowledgement for every Result message it sends. |
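To illustrate why the queue size matters for a burst of goals, here is a minimal, self-contained sketch. It is not actionlib or roscpp code: `BoundedQueue` and `lost_in_burst` are hypothetical names, and the model assumes the documented roscpp convention that a full queue discards its oldest message and that a size of 0 means unbounded.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a publish/subscribe queue (not a ROS class).
// It holds at most `capacity` messages and discards the oldest one when
// full. capacity == 0 means "unbounded".
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    void push(int msg) {
        if (capacity_ != 0 && queue_.size() == capacity_) {
            queue_.pop_front();  // drop the oldest message to make room
            ++dropped_;
        }
        queue_.push_back(msg);
    }

    std::size_t size() const { return queue_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::size_t capacity_;
    std::size_t dropped_ = 0;
    std::deque<int> queue_;
};

// Push `burst` messages without ever draining the queue and report how
// many were discarded. This models a consumer that cannot keep up.
std::size_t lost_in_burst(std::size_t capacity, std::size_t burst) {
    BoundedQueue q(capacity);
    for (std::size_t i = 0; i < burst; ++i) {
        q.push(static_cast<int>(i));
    }
    return q.dropped();
}
```

Calling `lost_in_burst(10, 1000)` and `lost_in_burst(0, 1000)` compares a small bounded queue with an unbounded one. The unbounded variant loses nothing but lets memory grow with the backlog, which is the trade-off described above.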
+1 for acknowledgement of Result messages. Especially since it's a required message without which the client is essentially stuck waiting for a message that may have simply been "missed". |
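As a rough sketch of what acknowledged Result delivery could look like, the following self-contained simulation resends a result until an acknowledgement comes back. This is not how actionlib works today and not a proposed patch: `Channel`, `SendReport`, `send_result_with_ack`, and `drop_first` are all hypothetical names invented for illustration, and the lossy links are simulated in-process.

```cpp
#include <cstddef>
#include <functional>
#include <memory>

// Hypothetical lossy link: each call carries one message and returns
// true if that message was delivered.
using Channel = std::function<bool()>;

struct SendReport {
    bool acknowledged;     // true if an ack reached the server
    std::size_t attempts;  // number of times the result was sent
};

// Resend the result until the client acknowledges it or the attempts run
// out. An attempt succeeds only if the result reaches the client AND the
// ack makes it back; no ack is sent for a result that never arrived.
SendReport send_result_with_ack(const Channel& result_link,
                                const Channel& ack_link,
                                std::size_t max_attempts) {
    for (std::size_t attempt = 1; attempt <= max_attempts; ++attempt) {
        if (result_link() && ack_link()) {
            return {true, attempt};
        }
    }
    return {false, max_attempts};
}

// Deterministic test link that drops the first `n` messages it carries
// and delivers everything after that.
Channel drop_first(std::size_t n) {
    auto remaining = std::make_shared<std::size_t>(n);
    return [remaining]() {
        if (*remaining > 0) {
            --*remaining;
            return false;
        }
        return true;
    };
}
```

With such a scheme, a single missed Result would cost a retry instead of leaving the client waiting indefinitely. A real implementation would also need timeouts between retries and duplicate handling on the client side, neither of which is modelled here.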
Making queue sizes configurable through parameters #47 seems like a good idea. In the past queue sizes were infinite (at least in the Python implementation). This change would need thorough testing before being integrated. |
I can confirm all problems mentioned here, in particular those related to result msgs getting lost, see my post here: answers.ros.org/question/240056/bug-in-ros-extending-simpleactionserver-client-to-allow-rejecting-goals . Configurable queue sizes with a default value of 10 (or better 100) would be great. One question: when I use the default Ubuntu 16.04 ROS packages, how do I get this update? I guess I have two options: a) wait until it is merged into the Ubuntu packages (which may take too long), or b) patch the sources in my ROS installation. Am I right? Thanks! |
Update: If I increase the queue sizes to 10, 100, or 1000 (or even 0 -> infinite) and set up a large number (~35) of requesting clients (on a 1-core VM), messages still get lost (causing my clients to time out)?! Very strange, need to investigate this further. Update 2 (edit): Messages can still get lost regardless of the queue size. Any ideas why this happens? |
@CodeFinder2 have you resolved this issue? Any updates? We are experiencing similar message loss regardless of the queue size in the ActionServer. The Python client does not show the same behavior; this only happens with the C++ client. |
@CodeFinder2 @progtologist I had a single requesting client but with many goals. The queue size changes did seem to alleviate my issues for this case. There may be another queue issue or race condition w/ many clients. |
@progtologist Unfortunately, I've not solved it yet. And to me, it seems to be a serious problem in actionlib. Note that all my posts in this issue were/are related to the C++ interface. I've not used the Python variant. I also thought about network buffer sizes being too small so that there is actual packet loss (even though it's TCP), see, e.g., http://stackoverflow.com/questions/7865069/how-to-find-the-socket-buffer-size-of-linux. However, I don't think this is likely to be the reason. (I've not yet tried to increase the buffer size.) I can possibly provide code that allows one to reproduce the issue. However, it also seems to be related to the underlying software/hardware configuration (#cores, #threads, speed, ...), so I'm not sure whether others can reproduce it with my code. If anyone is interested (and would like to debug / investigate this issue), please let me know. |
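For anyone who wants to check the socket buffer hypothesis, the following small sketch reads the operating system's default receive buffer size for a new TCP socket using the POSIX `getsockopt` call with `SO_RCVBUF`. The function name `default_tcp_rcvbuf` is made up for this example. It only inspects the OS default on a POSIX system such as Linux and says nothing about the sockets that roscpp itself opens.

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Returns the receive buffer size in bytes that the kernel assigns to a
// freshly created TCP socket, or -1 if the socket or the query fails.
int default_tcp_rcvbuf() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    int size = 0;
    socklen_t len = sizeof(size);
    int rc = getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len);
    close(fd);  // the socket was only needed for the query
    return rc == 0 ? size : -1;
}
```

Printing the return value on the affected machine gives a baseline to compare against before experimenting with larger buffers.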
Well in our experience, there are some aspects that are affecting this behavior
My colleague @vagvaz has had more hands-on experience on this so I am inviting him in the conversation. |
@CodeFinder2 can you share your code if it is not too complex? I have a similar issue, but with our complete robotic stack, so it is a bit hard to isolate. If you have a smaller example that leads to the problem, I will check whether I can reproduce it on my hardware. |
@adegroote Here's the code for testing:
If it prints something like "Some error during the i-th action processing [...]", that indicates an error we are interested in (although it may 'just' be feedback messages, which shouldn't get lost either). For whatever reason, I am not able to find the other code snippets (also showing dropped goals). Sorry for the late response! |
Additionally, here's a This must be a serious issue in the (actionlib?) implementation ... |
@CodeFinder2 I can't seem to reproduce this issue using your code. I've set up my environment as follows:
but (un)fortunately the client always succeeds. @progtologist @vagvaz were you able to reproduce this issue with @CodeFinder2's code? In what kind of environment have you seen this issue show up? Do you happen to have a snippet that shows this behavior? Thanks in advance. |
@CodeFinder2 It seems we have experienced a similar issue. Hope this helps. |
Related to #18. I'm using the underlying `ActionClient` and `ActionServer` classes to implement a multi-threaded motion planning server and I am running into issues with unacknowledged goals and handles waiting for results.
Changing the size of all publish/subscribe queues in both classes seems to alleviate the issue. Is there a standard SOP or advice for using the action interface with hundreds or thousands of goals at a given time?
Thanks