-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
High latency (~40 ms) when sending successive messages in a row - missing TCP_NODELAY #433
Comments
It turned out that it is far less mysterious than I thought: The delay can be fixed by activating the TCP option After I added this to the code the delays were gone:
Would it maybe be an option for mosquitto to make this socket option configurable? This is how emqttd does it: |
I'm also getting 40ms in simple cases where 0ms would be expected. |
@ralight I'm collecting benchmark results from various MQTT brokers and Mosquitto performs good for large problems with many publishes but for small problems (the most common ones) I get 40ms for every case no matter how small I make it (25 subs, 1 pub takes 40ms which is the worst result I have seen for this small problem). I want to give Mosquitto a fair chance here and since it performs the best at larger problems I think this is something you should look at. If it is like @verybadsoldier says, no_delay then that should probably be fixed so that I can get accurate results and others can benefit from a faster broker. |
I'm quite conflicted here. The operation as it is at the moment is exactly the correct behaviour. Nagle exists for a reason, we don't want to generate lots of small packets. Disabling Nagle can give you a reduction in latency, but will likely also introduce a reduction in throughput. I personally think that throughput is more important for the broker than latency, but that is of course entirely dependent on your workload and model. I'll see about adding the option, with appropriate guidance. |
Well NO_DELAY exists for a reason, too. All other brokers I have tested have a 0ms latency at small broadcasts, dxcept for HiveMQ which has a 37ms delay. |
I'm facing this exact issue and totally support the suggestion of making it possible to control |
@ralight What kind of guidance do you think of in order to add this option? |
Would still be awesome having an option for NO_DELAY. Anything we can do to help with? |
I have mosquitto installed via package on ubuntu 17.10. I start a tmux session and open two panes, one above the other. In the top pane, I run "mqtt sub -h ha '#' -v In the bottom pane, I run "mqtt pub -t 'test/test' -m 'test'" The mqtt command takes about 1 second before I get a new prompt. In my application using mosquitto as the broker, it takes a LONG LONG time to receive a couple dozen messages on a similar number of topics. LONG LONG time is pushing a minute, or maybe 30 seconds. I didn't bother to time it. My application is an IoT one. There are sporadic very short messages - like "light switch turned on". Topics are < 32 characters in length, messages similar. Is the solution to find a different broker? No offense, but this ticket is 8 months old at this point. Nagle is awful for network applications other than TTY kinds of things. There should be an option in the config file to set TCP_NODELAY at least. Or you can use TCP_CORK and similar strategies when you know you have multiple messages queued up. The fix seems simple enough. Parse the config for disable_nagle (or whatever you choose) and add the one line of code to setsockopt() turn it off if found in the config. This would allow people to tune their broker for their purposes. Thanks for the software, it is feature rich. And thanks for reading. |
Are you sure you don't have problems from your "mqtt pub" and "mqtt sub" commands? mosquitto_pub /mosquitto_sub don't show the behaviour you're describing at all. (yes, tcp_nodelay is somethign to talk about, but your description sounds way different) |
My network is gigabit ethernet. I have sub 1ms ping times between machines. MQTT is running on a Celeron class machine, but I don't see that it should make that much of a performance penalty, latency-wise. The mqtt command seems to be verifying/waiting for the broker to receive the message. It's not startup time for the command as far as I can see/tell. The latency issue is quite clear in my app. For the weather tile, there's maybe 3 MQTT messages. For the thermostat, similar. For the pool and spa tiles, about 5 each. For the TV, 5. For the fans, lights, etc, 2 each. Devices over wifi take many seconds to receive all those messages. |
Note: those messages are mostly persisted ones, not new ones sent. Some might be new, but that would be a race condition of sorts, where new state messages are sent at the same time the client is subscribing to those topics. |
I would recommend emqtt: http://emqtt.io/ it scores very good in latency and overall throughput in my benchmarks. |
@mschwartz It's basically the latest official |
Thanks for the tips. If I wanted to switch to another broker, I'd look at Mosca first. I looked at emqtt, but the Erlang bits seem like a bunch of bloat I don't want. Maybe as a docker container, though! I could add the tcp_nodelay myself. I'm quite good at C/C++. My preference is to see it done right so it propagates via package maintenance processes. |
zZZzzZz... just do the damn benchmarking and keep those nonsense gut feeling rants to yourself @mschwartz Mosca is the slowest, by far, broker available. Jeeez! |
My application is multiple micro services written for node.js. Mosca would at least be consistent with the tools I'm using. The issue is not throughput (e.g. how many messages/second), but latency. It matters not if the code is C or interpreted Basic if there's a long delay caused by Nagle. |
Could you try out the code in the develop branch please? You'll need to set the set_tcp_nodelay option. |
Sure. I might be able to get to it this weekend. I was considering moving the broker to a much beefier machine. That might make the persistence faster (very fast m.2 drive, fast I7 CPU, etc.), at least. |
I like the code in your last commit! |
I still maintain that with the delays you claim to be seeing, nagle wasn't the issue :) With the delays you claim to be seeing, no-one would ever have even started using mosquitto. |
Thank you @ralight, works for me! Same test as in the initial post:
Now I am seeing a latency between broker receiving the message from the sender and broker sending the message to the client of <100 us, nice! Thanks! |
|
@alexhultman
|
Indeed I was, I'm sorry if that wasn't clear. |
Latency I saw was 1 second, not one minute. For one message to be sent localhost and received/printed by a 2nd process on the host. Some of that might be the mqtt command to load from "disk" and start running, but visually, the latency is ~1 second. 100 ms latency seems high to me, still. I see times of 6 ms just to fetch the HTML for my application - that's from a node.js / express..js server. I don't think people are attracted to Socket.io for performance reasons. What's impressive is that you can get 10's of thousands of simultaneous connections on one server, with decent performance. It also transparently resorts to methods other than WebSocket to emulate WebSocket, where needed. Methods like long polling, for example. |
I really think you should test it if disabling nagle does fix it for you or not. |
I plan to this weekend. I have a day job to focus on until then. |
I haven't looked at the code much, but it might be sped up by deferring the writing of persisted data to backing store in a separate thread. |
You were writing messages to store in your js on message handler? and you're blaming nagle? Try a demo like this: you might see nagle, but you won't be seeing anything like the order of seconds: https://gist.github.com/karlp/a2d50d0bd49313d617c9343b41caad73 |
I am not writing messages to store in javascript. I was talking about mosquitto. It has to write to its backing store. |
@alexhultman sent me a very rude email. Unsolicited. What a classy guy. |
Um, GitHub sent you a notification because I wrote in this thread. Now, stop tagging me @. |
GitHub will send you (and everyone in this thread) the same mail. Everyone here got that mail already. Don't see why you are acting like such a pussy there is a reason I removed it because it wasn't adding anything to this thread even though I 100% feel that you indeed are a true moron. |
And you are a child. I am done here. You fucked things up for the guy who wrote this software. |
Pleasure is all mine. |
I stopped mosquitto, installed mosca in its place and the latency is near 0ms. The console in dev tools prints a couple dozen lines within 1 second. The pleasure is all mine. Good luck @ralight. |
You're welcome. |
I am experiencing quite a high latency when sending two messages in a row (~40 ms). The client receives the first message near instantly but the second message is delayed in the broker. A single message is fine. This behavior is 100% reproducible for me.
I tried several things to figure where the delay happens. I hope I am not barking up the wrong tree here. I did the same test with emqttd and there is no delay at all.
Setup:
192.168.2.33
on Ubuntu 16.04.2 LTS in a VM hosted on Windows 10 on an Intel i5.192.168.2.230
MQTT.fx 1.3.1 on Windows 10192.168.2.36
Ubuntu using mosquitto_publatency
, Client 2 publishes messages to topiclatency
This is how the network traffic does look in ngrep when publishing two successive messages to one topic:
So the message
latency1
published by client 192.168.2.36 arrives at the broker at17:20:34.695801
and is published to client 192.168.2.23017:20:34.695843
. So only 42 us later.Then message
latency2
comes in at17:20:34.699322
but this time the broker publishes it at17:20:34.736158
to the client. So ~35 ms later. I would expect the same low latency as for the first message.I think that message2 is held back within the broker machine until the TCP-ACK packet for message1 is received, which is quite strange IMO.
I am reproducing this as follows:
mosquitto_pub -h 192.168.2.33 -t latency -m 1; mosquitto_pub -h 192.168.2.33 -t latency -m 2
ngrep latency -q -t
I can see using
strace
with commandsudo strace -p 80684 -e read,write -r
that the process mosquitto is writing the network message to the socket as soon as it read the message from the incoming socket:The lines in question:
and
Only a few microseconds between reading and writing the data which seems great.
But when using Wireshark to analyze whats happening on the TCP stack we can see that message1 arrives at the broker and gets published to client2 very quickly. Then message2 arrives but it is not published to client2 immediately. It only gets published right after the ACK packet from client2 for message1 arrives at the broker:
data:image/s3,"s3://crabby-images/347a2/347a2983ec3ea2b53518c4429c3d0046c287898c" alt="image"
So is this an expected behavior with mosquitto? For me it seems like Linux TCP stack is holding the second packet back which would not be a mosquitto issue then, but as I said when I try with emqttd then everything is fine.
The text was updated successfully, but these errors were encountered: