
Unsafe concurrent writes #196

Closed
thedevop opened this issue Mar 24, 2023 · 3 comments
thedevop commented Mar 24, 2023

Resurfacing this issue.

Currently, both the client read and write goroutines may call cl.WritePacket:
https://github.com/mochi-co/mqtt/blob/7bd7bd5087c40f96015a61d01ce7ae4e6951c807/clients.go#L481

An example of when this can occur (illustrated in the sketch below):

  1. The broker starts publishing messages to the client from the WriteLoop
  2. The read goroutine concurrently acks a PingReq (or a QoS 2 packet)
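
For illustration, here is a minimal standalone sketch of why this is unsafe (none of it is the broker's actual code; the connection setup and packet bytes are invented): a packet may take more than one Write call, so without a shared lock the two goroutines can interleave their bytes on the same connection, even though each individual Write is safe on its own.

```go
package main

import (
	"io"
	"net"
	"sync"
)

func main() {
	server, client := net.Pipe()

	// Drain whatever arrives so the writers don't block forever.
	go func() {
		io.Copy(io.Discard, server)
	}()

	// Each "packet" is written with two Write calls, as an encoder might do,
	// so with no per-client lock the two goroutines below can interleave
	// their bytes and corrupt the MQTT stream.
	writeUnsafely := func(conn net.Conn, packet []byte, wg *sync.WaitGroup) {
		defer wg.Done()
		conn.Write(packet[:1]) // fixed header
		conn.Write(packet[1:]) // rest of the packet
	}

	var wg sync.WaitGroup
	wg.Add(2)
	go writeUnsafely(client, []byte{0x30, 0x02, 0x00, 0x00}, &wg) // pretend PUBLISH from the WriteLoop
	go writeUnsafely(client, []byte{0xD0, 0x00}, &wg)             // pretend PINGRESP from the read goroutine
	wg.Wait()
	client.Close()
}
```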

Possible solutions include:

  1. Add a per-client lock before nb.WriteTo(cl.Net.Conn) (see the sketch after this list):
    https://github.com/mochi-co/mqtt/blob/7bd7bd5087c40f96015a61d01ce7ae4e6951c807/clients.go#L554
  2. Route all writes through the WriteLoop, but without a refactor the caller will no longer receive write errors
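
A rough sketch of option 1, assuming a per-client sync.Mutex; the struct layout and the field and helper names (wmu, writeToConn) are invented here and may not match the actual clients.go:

```go
package mqtt

import (
	"net"
	"sync"
)

// Sketch of option 1 only: names and layout are assumed for illustration.
type Client struct {
	Net struct {
		Conn net.Conn
	}
	wmu sync.Mutex // guards the final write to Net.Conn
}

// writeToConn flushes an already-encoded packet under the per-client lock, so
// the read and write goroutines can no longer interleave bytes on the wire.
// Contention is limited to the moment the packet is flushed, and only ever
// occurs between a single client's own two goroutines.
func (cl *Client) writeToConn(nb net.Buffers) (int64, error) {
	cl.wmu.Lock()
	defer cl.wmu.Unlock()
	return nb.WriteTo(cl.Net.Conn)
}
```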
@mochi-co mochi-co added the bug Something isn't working label Apr 17, 2023
mochi-co commented

Thanks for identifying this @thedevop! Apologies again for the delay.

With 1. we potentially introduce lock contention, but with 2. we lose the ability to prioritise control messages (if memory serves, the spec indicates that certain messages should not be delayed based on the order of transmission of publish packets).

  1. is probably the most straightforward solution. If I recall, you are running at a pretty good scale - have you performed any benchmarks on adding a lock here to see if it makes any significant difference?
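
For context, a rough sketch of option 2, with an invented outbound channel and struct (none of these names come from mochi-mqtt): every packet is queued to a single writer goroutine, which naturally serialises writes, but the original caller only learns whether the enqueue succeeded, not whether the network write did.

```go
package mqtt

import "net"

// Sketch of option 2 only: the channel, struct, and method names are invented
// for illustration and do not reflect the actual mochi-mqtt internals.
type Client struct {
	conn net.Conn
	out  chan net.Buffers // consumed exclusively by WriteLoop
}

// QueuePacket replaces a direct WritePacket call. Any network write error now
// surfaces inside WriteLoop rather than here, which is the refactor cost
// mentioned above.
func (cl *Client) QueuePacket(nb net.Buffers) bool {
	select {
	case cl.out <- nb:
		return true
	default:
		return false // outbound buffer full; the caller still never sees a write error
	}
}

// WriteLoop is the only goroutine allowed to write to cl.conn, so packets can
// never interleave. Prioritising control packets would require, for example,
// a second higher-priority channel consumed here.
func (cl *Client) WriteLoop() {
	for nb := range cl.out {
		if _, err := nb.WriteTo(cl.conn); err != nil {
			return // handle or log the error here; the original caller has moved on
		}
	}
}
```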


thedevop commented Apr 23, 2023

@mochi-co ,

The use case I have is a bit different: currently it is mainly one-way communication from client -> broker (this will change in the future), and messages from broker to client are mostly control messages. The environment handles a very large number of concurrent clients, but a limited number of messages per client. The bottlenecks I see so far are (in order):

  1. AWS security group connection tracking limit. This limits how many concurrent clients a node can handle.
  2. Memory - using a C instance type (M is on the edge) would bump this up to 1.
  3. CPU - with the exception of a connection thundering herd, there is excess CPU capacity.

Logically, since each client only has two goroutines (read and write), and the client also has a buffered channel to handle publishing, there will not be any cross-client lock contention. The only time lock contention occurs is between a client's own read and write goroutines, as intended.

mochi-co commented May 4, 2023

I opted for solution 1 as it was simpler - this should be fixed in v2.2.8 👍🏻
