Autobahn WebSocket server #426
CI pending ros/rosdistro#21999
Since Autobahn also seems to support CBOR compression and the like, do we still need my pull request that adds compression support for service calls?
This should not affect anything that happens in rosbridge_library, including protocol-level payload compression/encoding.
Good work! Left mostly minor comments.
Is it too much work to add some tests that produce significant load, so that we see how the Autobahn server handles closer-to-real-world scenarios? For example, publishing bigger point clouds and/or messages at faster rates. There might be some subtle differences in the shift from Tornado to this Twisted-based WS server, so it would be good to make sure this doesn't introduce regressions.
@@ -15,9 +15,7 @@
<maintainer email="jihoonlee.in@gmail.com">Jihoon Lee</maintainer>

<buildtool_depend>catkin</buildtool_depend>
<exec_depend>python-backports.ssl-match-hostname</exec_depend>
<exec_depend>python-tornado</exec_depend>
<exec_depend>python-twisted-core</exec_depend>
Twisted is still directly used, I think it should stay as a non-transitive dependency.
Good catch, fixed.
try:
    reactor.stop()
except ReactorNotRunning:
    pass
Perhaps some informative log message could go here.
Done.
application.listen(port, address)
rospy.loginfo("Rosbridge WebSocket server started on port %d", port)
listenWS(factory, context_factory)
rospy.loginfo('listening at {}'.format(uri))
Can we revert to the original log format (to make clear it's the rosbridge server listening)?
This was done.
        self._valve.set()

    def stopProducing(self):
        rospy.loginfo('stop')
More descriptive logging needed here as well
Oops, this was left here accidentally.
class RosbridgeWebSocket(WebSocketHandler):
@implementer(interfaces.IPushProducer)
class OutgoingValve:
    """Allows the Autobahn transport to pause outgoing messages from rosbridge."""
Does this detect back-pressure and pause publishing internally?
Yes, added some more description for this.
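A minimal sketch of the valve idea discussed here, assuming a `threading.Event` gate in front of the transport write. The method names mirror Twisted's `IPushProducer` interface, but the class name `OutgoingValveSketch`, the `relay` method, and its `timeout` parameter are hypothetical and this is not the code from the PR; Twisted itself is not imported so the example runs on the standard library alone.

```python
import threading


class OutgoingValveSketch:
    """Blocks writers while the transport has asked the producer to pause.

    Hypothetical illustration only: it models pause/resume/stop behaviour
    without registering with a real Twisted transport.
    """

    def __init__(self, write):
        self._write = write              # callable that hands a message to the transport
        self._valve = threading.Event()
        self._finished = False
        self._valve.set()                # start open: writes pass straight through

    def relay(self, message, timeout=None):
        """Called from a publishing thread; waits while the valve is closed."""
        opened = self._valve.wait(timeout)
        if opened and not self._finished:
            self._write(message)
            return True
        return False                     # timed out or producer was stopped

    def pauseProducing(self):
        self._valve.clear()              # transport buffer is full: block writers

    def resumeProducing(self):
        self._valve.set()                # transport drained: let writers through

    def stopProducing(self):
        self._finished = True
        self._valve.set()                # release any writers that are still blocked


sent = []
valve = OutgoingValveSketch(sent.append)
valve.relay("a")                             # valve open: delivered
valve.pauseProducing()
delivered = valve.relay("b", timeout=0.05)   # valve closed: gives up after the timeout
valve.resumeProducing()
valve.relay("c")                             # valve open again: delivered
```

Because `relay` blocks the calling thread while the valve is closed, a writer that shares a thread with other clients stalls them too, which is the concern raised in the next comment.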
Is it a good idea to block a publish because of back-pressure by default? If I have a misbehaving client, I may optionally want to drop messages and not make all clients suffer. The caller of this code is blocking in nature, AFAIK.
It's a good point that there is potential for one slow client to block writing to others. I consider it a flaw in the rosbridge protocol that topics can be subscribed with (and by default) no protocol-level per-client queue, which would effectively decouple clients from the rospy.Subscriber thread. I do think connecting the backpressure by default is good, because it prevents one slow client from running the server out of memory, which in the context of a robotics framework is terrifying.
Until the protocol is fixed, there are two options:
- Slow client runs server out of memory.
- Slow client blocks other clients subscribed to the same topics.
A third option (really just a mitigation) is to always set queue_length > 0 when subscribing to topics, which decouples clients from each other. In this case you have to trust the client not to use the default queue_length = 0, but it may solve most of your problems as long as you control the application.
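As a rough illustration of that mitigation, a client could build its subscribe request with an explicit `queue_length`. The field names below follow the rosbridge v2 `subscribe` operation; the helper name `make_subscribe`, the topic `/points`, and the choice to reject `queue_length < 1` are assumptions made for this sketch.

```python
import json


def make_subscribe(topic, msg_type, queue_length=1, throttle_rate=0):
    """Build a rosbridge subscribe request with an explicit queue_length.

    Hypothetical helper: it refuses the protocol default of 0 so that the
    client always gets its own queue on the server side.
    """
    if queue_length < 1:
        raise ValueError("queue_length must be > 0 to decouple this client")
    return json.dumps({
        "op": "subscribe",
        "topic": topic,
        "type": msg_type,
        "throttle_rate": throttle_rate,   # minimum milliseconds between messages
        "queue_length": queue_length,     # per-client queue size
    })


# Example request for a (hypothetical) point cloud topic.
request = make_subscribe("/points", "sensor_msgs/PointCloud2", queue_length=1)
```

The string in `request` is what the client would send over the WebSocket after connecting.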
Still used directly by the UDP handler.
This was the default in previous versions, but invalid input to Autobahn.
CI now pending ros/rosdistro#22032 because I forgot to add xenial. Melodic tests are green. I've done my own anecdotal testing on a couple of different use cases (point clouds and a very busy graph of small messages across many clients) which led to some fixes. I may add some smoke tests for the server as a first step in this PR, mostly to make sure ssl works as configured.
We have access to this information in onOpen.
This seems like a huge ABI change, which makes me concerned about backward compatibility. Tornado has been in use too long to remove immediately. Instead of replacing it, how about adding the Autobahn WebSocket server alongside Tornado? Then I would add a deprecation notice to the Tornado server that guides users to Autobahn instead.
This reverts commit 4224671. This create_url method doesn't exist yet in Ubuntu kinetic.
Avoid network collisions and unnecessary configuration while testing.
I've made the Autobahn server the default and only script entrypoint. The old Tornado handler is still there if anybody was using that interface. The Autobahn server script supports the same features as the Tornado server (and a few more).
+1 to this PR. @jihoonl @mvollrath, there are some more issues with the Tornado-based implementation that took quite some time to track down. Certain clients, like Android phones with aggressive power saving enabled, leave the WebSocket in a suspended state that then blocks messages from the topic thread for all other clients.
@dhananjaysathe thanks for your feedback, good to know this is already fixing things for people. 👍
Send 100 large messages in each direction to saturate buffers.
and thanks @mvollrath for the nice updates.
@jihoonl thanks for the heads-up. We'll test it in the coming days.
It would be great if this patch were also ported to the
Thanks, this is great. The Tornado version used by rosbridge was getting old and incompatible with the newer Tornado versions used in Jupyter, and websockets were silently failing 👍
It would have been great if this breaking change had waited for Noetic, or if the default had gone the other way, so as not to introduce the breaking change that was warned about above. This breaks compatibility with https://github.com/rctoris/jrosbridge. Adding another proxy between rosbridge and the previously working code, one that adds the protocol to the Origin header, gets around the issue.
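The comment does not say how that proxy was built. As a sketch of just the header rewrite it describes, assuming the client sends an Origin value without a scheme, the fix-up could look like the following; the function names and the default scheme `http` are hypothetical.

```python
def ensure_origin_scheme(origin, default_scheme="http"):
    """Return the Origin value with a scheme prefix if it lacks one."""
    if "://" in origin or origin == "null":
        return origin                    # already well-formed, or the literal "null" origin
    return "{}://{}".format(default_scheme, origin)


def rewrite_headers(headers, default_scheme="http"):
    """Copy a header dict, fixing up Origin if present (name matched case-insensitively)."""
    fixed = {}
    for name, value in headers.items():
        if name.lower() == "origin":
            value = ensure_origin_scheme(value, default_scheme)
        fixed[name] = value
    return fixed


# Example handshake headers from a client that omits the scheme.
handshake = rewrite_headers({"Host": "localhost:9090", "Origin": "localhost:9090"})
```

A proxy would apply `rewrite_headers` to the WebSocket handshake request before forwarding it to the rosbridge server.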
Instead of using obscure threading hackery to add flow control to Tornado, use a grown-up WebSocket server with its own flow control features.