Assertion failed: _input_stopped (stream_engine.cpp:467) #3937
Comments
Possibly related to #3596
Strangely, the issue stopped happening with no changes in code. I remember that around that time our administrators reconfigured/updated our company VPN. My guess is that some bug in OpenVPN was the culprit here.
The issue reappeared on our production environment (Linux). The test environment, which has slightly less load and fewer connections to other zmq endpoints, is not affected. The crash happens once or twice per day, so it's pretty rare.
I can reproduce it reliably on current master. Steps:
My initial investigation indicates that when the heartbeat timeout fires
The bug is definitely related to #3596.
Edit:
In the C program, the part after (and 5 seconds later) is never printed. It may be that the Erlang virtual machine does something nasty that breaks zmq assumptions. Any ideas how I can debug it further, @bluca @brettviren?
Is there any update or plan to fix this issue?
I'm also experiencing this on 4.3.5 with heartbeats on both sides of pub/sub.
Pub/sub sockets are unusable with heartbeats (in our case with the Erlang bindings). Since I opened this issue 4 years ago, we have disabled heartbeats and instead implemented a custom heartbeat protocol over ordinary pub messages. The pub server emits heartbeat messages every few seconds. The clients listen for heartbeats, and if they don't receive one in time, they close the socket and reconnect. This workaround made the connections much more stable, and zmq asserts no longer crash our production servers.
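For anyone who wants to try the same workaround, here is a rough sketch of the client-side timeout logic only. It is not the code from the comment above (that one uses the Erlang bindings). It makes no zmq calls at all, and `HeartbeatMonitor`, `HEARTBEAT`, and the 5-second timeout are hypothetical names and values picked for illustration. The assumption is that the subscriber loop passes every received message to `on_message` and checks `expired()` after each poll.

```python
import time

# Hypothetical payload the publisher would emit every few seconds (assumption).
HEARTBEAT = b"HEARTBEAT"


class HeartbeatMonitor:
    """Tracks the last application-level heartbeat and says when to reconnect."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = clock()    # treat "just connected" as a fresh heartbeat
        self.reconnects = 0

    def on_message(self, payload):
        """Return True if payload is a heartbeat, False if it is ordinary data."""
        if payload == HEARTBEAT:
            self.last_seen = self.clock()
            return True
        return False

    def expired(self):
        """True when no heartbeat has arrived within the timeout."""
        return self.clock() - self.last_seen > self.timeout_s

    def reset(self):
        """Call after the socket was closed and reconnected."""
        self.last_seen = self.clock()
        self.reconnects += 1


if __name__ == "__main__":
    # Simulated time instead of a real socket loop.
    now = [0.0]
    monitor = HeartbeatMonitor(timeout_s=5.0, clock=lambda: now[0])
    for t, msg in [(2.0, HEARTBEAT), (4.0, b"data"), (8.0, None)]:
        now[0] = t
        if msg is not None:
            monitor.on_message(msg)
        if monitor.expired():
            # A real client would close the SUB socket and reconnect here.
            monitor.reset()
    print("reconnects:", monitor.reconnects)
```

In this sketch only heartbeat messages refresh the timer, which matches the description above. Counting any received message as proof of liveness would be a reasonable variation.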
Issue description
Assertion failed: _input_stopped (stream_engine.cpp:467)
Environment
Minimal test code / Steps to reproduce the issue
The issue happens nondeterministically, so there are no direct reproduction steps. I have a process with several threads, each running one dedicated DEALER socket. Every socket is in the same zmq context, is connected to the same endpoint (ROUTER) over TCP transport, and uses an asynchronous request-response pattern. Each socket thread calls zmq_poll (on that one socket) and alternates zmq_send/zmq_recv with ZMQ_DONTWAIT. The endpoint is generally slow at consuming messages and sending responses, and the responses may be produced in a different order or not arrive at all. Each socket is created and used on only one thread.
What's the actual result? (include assertion message & call stack if applicable)
Crashing thread stack trace
All threads
What's the expected result?
No crash