-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Socket monitor on TCP sockets uses internal Pair socket from multiple threads #2158
Comments
Actually, by default zeromq has one IO Thread. The problem is that monitor events comes from Reaper thread (thread responsible for killing threads), IO thread and user thread. @jens-auer Do you want to make a PR that create an option to use radio-dish instead of pair? |
@somdoron I probably have time next week to do a PR, but not this week. I would also like to make zmq_socket_monitor incompatible in this case such that old code would not compile anymore as it would pass the wrong socket type. I would propose to change it to remove the events parameter and use the msg_group instead as it is used by the radio-dish. This would change signature and thus break compilation. |
@somdoron I also have a patch for 4.1.5 which adds a mutex around the monitor socket. Do you want a PR for this? |
Send the PR for master, we can backport it later. On Oct 11, 2016 15:20, "Jens Auer" notifications@github.com wrote:
|
Please don't break existing API. We have a process for this, and first of all it must be declared as deprecated, and then much later we can remove it. Instead of changing the existing function, add a new one with a different name. This way users have time to migrate their code. Also for the mutex, is that around any socket operation? I'm concerned about the performance implications if that's the case |
@bluca The performance penalty is only when using monitor. It shouldnt be big one as monitored events do not happen alot. Anyway I prefer to use new socket type option, however we can only use it for 4.2 |
@bluca I'm ok with a new function. Any idea for a name? As Doron said the mutex is only around access to the monitor socket and the integer storing the monitored event mask. An alternative design for 4.1 could use PUB/SUB sockets. Each engine creates a PUB socket which sends events to a SUB. The SUB could be in the socket_base and then it can forward the events to the PAIR which would be fully compatible with the existing API. It is also possible to change this more aggressively and let the SUB socket be created by the application code and just pass the address, similar to the existing mechanism with the pair. This would work on 4.1 and 4.2, but then use PUB/SUB on 4.2 which if I understand correctly should be replaced with Radio/Dish in the long run. |
Ah if the mutex is only for the monitor events then no worries :-) For the function name, perhaps "zmq_socket_monitor_group" ? For backporting, we are preparing to release 4.2, so if it becomes too complicated I would say just don't bother with 4.1 and 4.0. If someone faces the issue there they can upgrade. Given we are not breaking API (and ABI still not certain) it will be at most a rebuild. |
@bluca I have a PR (#2159) for 4.2, and I can submit one for 4.1 if wanted because I am going to patch 4.1 locally. Just take a look. Changing the socket type to Radio/Dish or Pub/Sub will make this change obsolete. I probably have time to do it next week and it shouldn't be a big change. I am a bit unsure which socket to use. Radio/Dish seems to be the long-term solution, but Pub/Sub can be backported to 4.1. I think I would only implement this for 4.2 since 4.1 is fixed with the mutex and the new socket type only adds new behavior which can replace the old implementation. And then Radio/Dish because it is the long-term solution? |
This was fixed by #2159, so I am closing this. If there are still issues, feel free to reopen. |
When running a zmq_socket_monitor on TCP connections events are sent from multiple IO threads concurrently on the ZMQ_PAIR socket. This violates the single-thread usage assumption of ZeroMQ sockets and can crash the application (or worse continue with corrupted monitor data or corrupted socket states).
Each stream_engine instance calls socket_base_t::monitor_event to send monitor events on the ZMQ_PAIR socket. This happens in the thread context of the stream_engine and there is no protection around the Pair socket, so it can be accessed from multiple threads.
The text was updated successfully, but these errors were encountered: