mosquitto_loop_stop sometimes does not join the spawned thread #3207
Labels
Component: libmosquitto
Status: Accepted
Hello,
EDIT: I've just noticed that I overlooked #2905 and #2377, which report the same case as one of the cases below, where `mosquitto_loop_stop` can sometimes return `MOSQ_ERR_INVAL`.
In an application that uses mosquitto's threaded client interface (Linux with pthreads), I've noticed that `mosquitto_loop_stop` can sometimes return `MOSQ_ERR_INVAL` after calling `mosquitto_disconnect` and `mosquitto_loop_stop` in order. In that case the spawned thread is not joined, and in an application that issues many `mosquitto_loop_start` and `mosquitto_loop_stop` calls in its lifetime, this ends up as memory leaks accumulating over time.

Not being familiar with the internals, I took a brief look at it (with some additional insight from a colleague who is more familiar with mosquitto), and noticed that
`mosquitto_loop_stop` is probably prone to racy behavior, and the change in 0d1837e that addressed #2242 is relevant.

From what I figured, the thread exit is initiated by `mosquitto_disconnect`, and while exiting in `mosquitto__thread_main` the spawned thread sets the handle's `threaded` to `mosq_ts_none`. Meanwhile, in the other thread, `mosquitto_loop_stop` checks for `mosq->threaded != mosq_ts_self`. From what I can tell, if the spawned thread exits before this check, `mosquitto_loop_stop` will always return `MOSQ_ERR_INVAL` and not join the thread, which will always result in memory leaks. This is triggered by the race between calling `disconnect` and then `stop`.

While looking into this in a test application, I ran into another case where
`mosquitto_loop_stop` returns `MOSQ_ERR_INVAL`. The test application (added below) does the following in a loop:

1. `mosquitto_loop_start`
2. `mosquitto_connect_async`
3. `mosquitto_disconnect`
4. `mosquitto_loop_stop`

In rare cases `mosquitto_disconnect` can return `MOSQ_ERR_NO_CONN`, where the thread has already exited, and `mosquitto_loop_stop` once again returns `MOSQ_ERR_INVAL`, so the thread is not joined and the resources are leaked. I couldn't figure out why `mosquitto_disconnect` returns `MOSQ_ERR_NO_CONN` in this case.

I've managed to reproduce both cases on both the current master c85313d and the latest tagged release v2.0.20 on Ubuntu 22.04 (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0).
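To make the suspected race easier to discuss, here is a minimal sketch of the pattern in plain pthreads. This is only a model of what I described above, not libmosquitto code: `struct client`, `TS_NONE`/`TS_SELF`, `stop_racy` and `stop_always_join` are all hypothetical names, and `stop_always_join` is just one possible way to avoid the leak, not a proposed patch. The demo forces the losing interleaving by waiting until the worker has cleared the state before calling stop.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of the pattern; none of these names exist in libmosquitto. */
enum thread_state { TS_NONE, TS_SELF };

struct client {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    enum thread_state threaded; /* models the handle's `threaded` field */
    bool joined;                /* only touched by the controlling thread */
};

/* The worker clears the state on its way out, as described for the thread main. */
static void *worker_main(void *arg)
{
    struct client *c = arg;
    pthread_mutex_lock(&c->lock);
    c->threaded = TS_NONE;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

static int client_start(struct client *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->threaded = TS_SELF; /* set before the worker exists, so no race here */
    c->joined = false;
    return pthread_create(&c->thread, NULL, worker_main, c);
}

/* Force the losing interleaving: block until the worker has cleared the state. */
static void wait_for_worker_exit(struct client *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->threaded != TS_NONE)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Racy stop: once the worker has cleared the state, it refuses to join. */
static int stop_racy(struct client *c)
{
    pthread_mutex_lock(&c->lock);
    bool running = (c->threaded == TS_SELF);
    pthread_mutex_unlock(&c->lock);
    if (!running)
        return EINVAL; /* the thread is never joined, so its resources leak */
    pthread_join(c->thread, NULL);
    c->joined = true;
    return 0;
}

/* Alternative: decide on "already joined?" instead of "still running?". */
static int stop_always_join(struct client *c)
{
    if (c->joined)
        return EINVAL;
    pthread_join(c->thread, NULL);
    c->joined = true;
    return 0;
}

static void client_cleanup(struct client *c)
{
    if (!c->joined) { /* join here so the demo itself does not leak */
        int rc = pthread_join(c->thread, NULL);
        assert(rc == 0);
        (void)rc;
    }
    pthread_mutex_destroy(&c->lock);
    pthread_cond_destroy(&c->cond);
}

/* Returns what stop_racy() reports when the worker exits first. */
int demo_racy_stop(void)
{
    struct client c;
    if (client_start(&c) != 0)
        return -1;
    wait_for_worker_exit(&c);
    int rc = stop_racy(&c);
    client_cleanup(&c);
    return rc;
}

/* Same interleaving, but with the stop variant that joins regardless. */
int demo_always_join_stop(void)
{
    struct client c;
    if (client_start(&c) != 0)
        return -1;
    wait_for_worker_exit(&c);
    int rc = stop_always_join(&c);
    client_cleanup(&c);
    return rc;
}
```

Calling `demo_racy_stop()` and `demo_always_join_stop()` from a small `main` (built with `-pthread`) shows the difference: in this model the first returns `EINVAL` without joining, while the second joins and returns 0 even though the worker has already finished.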
I built mosquitto with the cmake options:
After which I start the broker with pretty much the default config
With the broker running in the background, running the below test application with `stdout` redirected to null via `> /dev/null` can, in error cases, print either `Failed to stop loop case 1, disc: The client is not currently connected., stop: Invalid arguments provided.` or `Failed to stop loop case 2, stop: Invalid arguments provided.` on `stderr`. Case 1, where disconnect returns NO_CONN, is a lot rarer, but it does seem to occur in my environment.

I'm sorry if this was a bit too wordy. Thank you for the project!