-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CPU Usage 100% with MQTTClient async, depending on mosquitto version. #781
Comments
I use this paho.mqtt.c commit: |
Different versions of Mosquitto can also have bugs and behave differently! Then again we shouldn't get 100% CPU being used. We'll need more information. Such as a trace of the client lib when it first happens (see the readme for how). Or a test program plus the version number of Mosquitto which causes a problem for you. These are the sort of questions that I'm wondering about:
|
OK, I understand. Ill do my best. -- 0.15 (this causes the problem) Subscriptions: done one by one in a loop. I need to check, if even the first subscription causes the problem already. All topics subscribed to have retained values. Which my application gets in a callback after subscription (which is what is expected). QoS is always 0. I expect messages sent with 1/minute to 1/second. However, I have tested, that I can also have 10/second or even more, which does not change the bahavior. (Wich is great!) Using mosquitto_sub everything seems OK, CPU-Usage is near 0. |
I have put together the subscriber.c from your examples to produce 100% CPU-Load when connected to the problematic broker. |
And this is the output directly after start and now it waits for the next message... `Subscribing to topic HOME/SOLAR/AGM_VOLT Press Q to quit Message arrived |
Dont be confused: This example obviously uses QoS1, I forgot to change this, but with QoS0 it is the same... (just confirmed). Press Q to quit Message arrived |
Tracing: I have set:
And now the output looks like this: (do we have block-quote here?) `/c/mqtt>./subscriber ========================================================= ========================================================= Press Q to quit 20200101 111738.873 4 ExampleClientSub -> SUBSCRIBE msgid: 1 (0) |
I have now commented the subscriptions, and still CPU-Load is 100%. So the problem starts after connecting already. I now see only the PINGs in the trace.
|
I don't get a 100% CPU usage with the program you attached and Mosquitto 0.15. That Mosquitto version is really old (2012) - I had to download it to try it out. A full trace might give me something more to go on. However, one difference between that version of Mosquitto and the recent versions is that it supports MQTT 3.1 only, not 3.1.1 the standardized version which didn't exist then. The Paho client library tries to connect with 3.1.1 and if that fails, falls back to 3.1. In the trace, that is shown by: 20200101 111738.840 waitfor unexpectedly is NULL for client ExampleClientSub, packet_type 2, timeout 29862 When connecting to a recent version, the first connect will succeed, using 3.1.1. It would be best not to use a broker which doesn't support 3.1.1. Another thing you can try is to use the MQTTVersion connect option. Set it to MQTTVERSION_3_1 to only use MQTT 3.1 and MQTTVERSION_3_1_1 to only use 3.1.1 with no fallback. |
OK, maybe I can try. BTW. The 100% cpu load shown by top or xload is mainly SYS load (following the xosview classification USR/SYS/WIO/IDLE), but I dont know what that means. Maybe the load is produced in some sys function handling the socket. Maybe of importance: The troubling broker is remote and the client is connected via WLAN to the network. But I doubt that it matters, because as said, a connection to another Raspberry Pi running a more recent broker works fine. |
Success: makes a difference. It now works, not showing "waitfor unexpectedly is NULL", and now the CPU-Load is down/normal. It looks like that fixes it. I have found no issues using this old protocol even together with more recent broker version. But will there be disadvantages in doing so? |
MQTT 3.1.1 has been published for >5 years now so it is the most prevalent version by far. There is no real significant difference between 3.1.1 and 3.1 so the most likely effect is that you try to connect to a broker which doesn't support 3.1 any more. If you can take a trace (normal level) as described in the readme of the connect situation when you get 100% CPU usage that might help me work it out. |
OK, I have made now a full trace. Interestingly, I get connection losses with the debug output on, which I do not get with quiet debug. It runs for 5 seconds at 100%cpu, then connection loss and it tries to reconnect. But however, here is a huge log. |
Maybe since this was a slightly different example code used for the log above |
So the title of this issue says 'async client' whereas the example you give uses the synchronous client library (MQTTClient_). Now I assume we are talking about MQTTClient not MQTTAsync? |
Hm, if you are referring the library used, yes. (Looks like I have not understood the difference.) I thought, because I use the callbacks, that was asyncronous. I see, that the concept doesnt allow for polling. I am sorry for that confusion. |
I don't get this with the latest code, with paho_cs_sub or your example. |
Describe the bug
The CPU usage of a paho async-client (subscribing to multiple topics on a remote mosquitto broker) is 100% (from beginning on, after connecting), but only on certain mosquitto broker versions. If the connection is lost, the CPU-load goes down to about 0. After reconnection, it is up again.
So, the same client code works fine when connecting to Machine A with mosquitto broker A, but connecting to another Machine B with another mosquitto broker B, the identical client takes 100% CPU (for one core), but it works.
To Reproduce
I am devellopping "MQTT-Hyperdash" (see https://github.com/kollokollo/MQTT-Hyperdash ), running this on linux machines using different mosquitto brokers on multiple raspberry pis but maybe of different Raspian versions and so different mosquitto versions.
But the problem is on the client side, always reproducible, and this should not be the case. Paho should work identical with any mosquitto version.
Expected behavior
The CPU usage of the async client should be negligible, connectiong to whatever version of the mosquitto broker is used on whatever remote machine.
** Environment (please complete the following information):**
** Context
Machines A B and C running on the same local network. A and B are Raspberry Pi/Raspbian, Machine C is Ubuntu Linux. mosquitto broker running on all machines, The client (with the problem) is running on machine C. The problem occurs when the client connects to A but not when it connects to B. When starting two clients the cpu usage can go up to 200% (2 cores) and so on.
The text was updated successfully, but these errors were encountered: