Hello,
We are using ZMQ for our server (reactphp/zmq) and our client (zeromq/jeromq). We run multiple instances of the server, and each server has several clients connecting to it. We use several other sockets as well, but we are facing an issue with the TCP communication between the PUB socket on the server and the SUB sockets on the clients.
The issue is odd and hard to reproduce: from time to time a client stops receiving messages from the server. On all of our existing server instances and their clients this happens once every 2-3 months, but we have a new server instance that reproduces the issue in around 20-30 minutes. Our servers are "identical", so this points us to a network difference.
We made a simple app that just opens several threads, each with a SUB socket listening on the same topic. They all fail eventually, but we observed that they do not all fail at the same time. Also, restarting a client solves the issue, so we think the issue is most likely on the client side.
The server logs indicate that send is still being called on the PUB socket, and the listening thread on the client is still running, calling recvStr on the SUB socket once every several seconds. This points us to a ZMQ issue, but we don't know that for sure.
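To pin down exactly when each SUB thread goes quiet, a small tracker along these lines could be wrapped around the receive loop in the test app. This is only a minimal sketch and not our actual code: `SilenceTracker`, `recordMessage` and `silentSubscribers` are hypothetical names, and it uses only the JDK, so it assumes nothing about the jeromq API beyond the receive call returning null when no message arrives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of jeromq): remembers when each SUB thread
// last received a message so that silent subscribers can be identified.
final class SilenceTracker {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long thresholdMillis;

    SilenceTracker(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Call once when a subscriber thread starts, and again every time its
    // receive call returns a non-null message.
    void recordMessage(String subscriberId, long nowMillis) {
        lastSeenMillis.put(subscriberId, nowMillis);
    }

    // Returns the ids (sorted) of subscribers whose last recorded message
    // is older than the threshold.
    List<String> silentSubscribers(long nowMillis) {
        List<String> silent = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeenMillis.entrySet()) {
            if (nowMillis - entry.getValue() > thresholdMillis) {
                silent.add(entry.getKey());
            }
        }
        Collections.sort(silent);
        return silent;
    }
}
```

Each thread would call `recordMessage(threadName, System.currentTimeMillis())` after a successful receive, and a monitor thread would log `silentSubscribers(System.currentTimeMillis())` periodically. Comparing those timestamps with the server's send log should show whether the subscribers drop out one by one or all at once.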
In any case, right now we are looking into the jeromq implementation but we are wondering if anyone else experienced something like this and maybe has some pointers.
Thank you.