Messages are not sent if socket is closed immediately #1264

Closed
kyllingstad opened this issue Nov 25, 2014 · 27 comments
Comments

@kyllingstad

It appears that messages do not get sent if zmq_close() is called immediately after zmq_send(), even when the linger period is left at the default of "infinite". This seems contrary to the documentation, which says:

After interrupting all blocking calls, zmq_ctx_term() shall block until [...] all messages sent by the application with zmq_send() have either been physically transferred to a network peer, or the socket's linger period set with the ZMQ_LINGER socket option has expired.
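For completeness, the linger period mentioned in the documentation can also be set explicitly with zmq_setsockopt() before closing the socket; the following is only an illustrative sketch and not part of the original report, and the 5000 ms value is arbitrary (the report itself relies on the default of -1, i.e. infinite):

// Illustrative only: give the socket a finite linger period of 5 seconds.
// The issue described here occurs even with the default linger of -1.
int linger_ms = 5000;
int rc = zmq_setsockopt(sck, ZMQ_LINGER, &linger_ms, sizeof(linger_ms));
assert(rc == 0);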

Here is a program which demonstrates the issue:

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include "zmq.h"

int main(int argc, char** argv)
{
    void* ctx = zmq_ctx_new();
    assert(ctx);
    if (argc > 1) {
        // Send message
        void* sck = zmq_socket(ctx, ZMQ_DEALER);
        assert(sck);
        int rc = zmq_connect(sck, "tcp://localhost:59999");
        assert(rc == 0);
        int len = zmq_send(sck, argv[1], strlen(argv[1]), 0);
        assert(len >= 0);
        zmq_close(sck);
    } else {
        // Receive messages
        void* sck = zmq_socket(ctx, ZMQ_DEALER);
        assert(sck);
        int rc = zmq_bind(sck, "tcp://*:59999");
        assert(rc == 0);
        char buf[256];
        do {
            int len = zmq_recv(sck, buf, 256, 0);
            assert(len >= 0 && len < 256);
            buf[len] = '\0';
            printf("%s\n", buf);
        } while (buf[0] != 'q');
        zmq_close(sck);
    }
    zmq_ctx_term(ctx);
    return 0;
}

Run the program without any arguments in one terminal, and with a single argument in another, and observe that nothing is printed on the receiving end. However, if a wait (either for a fixed time period or for a user keypress) is inserted after zmq_send() and before the subsequent zmq_close() call, it works as expected (i.e., the messages get printed on the receiving end).
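A minimal sketch of that workaround on the sending side, pausing between zmq_send() and zmq_close() (the 100 ms delay is an arbitrary illustrative value; usleep() comes from <unistd.h>, and on Windows one would use Sleep(100) from <windows.h> instead):

// Sender tail with the workaround: give the I/O thread time to flush
// the queued message before the socket is closed. Not a proper fix.
int len = zmq_send(sck, argv[1], strlen(argv[1]), 0);
assert(len >= 0);
usleep(100000);   /* illustrative 100 ms pause */
zmq_close(sck);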

Tested with both ØMQ 4.0.4 and libzmq master branch (2015-06-09), using Visual Studio 2010 and 2013, on Windows 7.

@xaqq
Member

xaqq commented Nov 25, 2014

Hello,

I have not managed to reproduce this (on Linux, with libzmq master).

@kyllingstad
Author

I tried compiling the program above against libzmq master too (still on Win7 with VS2010, though), but then the message doesn't get sent even if I insert a pause. The program should work, though, right?

@lglayal

lglayal commented Jan 21, 2015

Hello, same problem here using ZeroMQ 3.2.4 on Linux CentOS 6.6. Changing the LINGER value (to -1, 0, 1, 100) has no effect. Setting ZMQ_SNDHWM to 1 before sending the last message has no effect either. The only 'workaround' I found to work is to add a usleep(100000) between zmq_send() and zmq_close().
I modified your test example to send 4 messages, and without usleep() all messages are lost.
A tcpdump shows that no message is sent to the peer; all messages seem to be stuck on the client side.
Your example uses tcp endpoints; we have more complex code using inproc, ipc and tcp endpoints with the DEALER/ROUTER pattern, but with the same problem: messages are lost on zmq_close().

@pauceano

Any news about this issue? I have the same problem in 4.0.5.

@lglayal

lglayal commented Apr 22, 2015

Hello, for information (someone may find it useful): we had a problem this morning, still with messages lost, but with a probability that increases depending on the CPU topology of VM guests on VMware.

  • one guest running 8 vCPUs, using 1 socket with 8 cores
  • one guest running 8 vCPUs, using 8 sockets with one core each: this configuration loses more messages than the previous one.

Communication occurs between 2 threads using inproc. The OS is Linux x86_64, CentOS 6.6, using the CFS kernel scheduler with default settings.

@kyllingstad
Author

The bug exists for Visual Studio 2013 too. (I just tested with the current libzmq master.)

@sorenisanerd
Contributor

I believe I've solved this with PR #1511. Can you please check with current master?

@kyllingstad
Author

@sorenh:

I believe I've solved this with PR #1511. Can you please check with current master?

I checked just now, and unfortunately, the problem persists.

@sorenisanerd
Contributor

@kyllingstad which platform did you test it on? Are you using the script from the bug description to test it?

@kyllingstad
Author

Visual Studio 2013 on Windows 7. I am using the same code, yes.

@thehesiod

Check the fix in #919; however, it seems the fix was not complete.

@hitstergtd
Member

@kyllingstad, @thehesiod - I believe I have reproduced the issue using the test case in this PR, though I merged it into a single program. It still seems to happen.

@thehesiod

See #1922 for my test case as well. It's Python, but I'm sure it's a core bug. It's kind of strange that OS X doesn't seem to be affected; perhaps it's due to a different default for SNDBUF.

hitstergtd added a commit to hitstergtd/libzmq that referenced this issue May 4, 2016
Solution:
Add it to help narrow down the problem and for making it a permanent part of
the repository, once the issue is solved.
@thehesiod

Any way we can vote this up? The workarounds are really bad, thanks!

@thehesiod

Given that this may be related to SNDBUF, I'm linking bug #1922, which did a lot of digging into the default sizes across various platforms.
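If differing SNDBUF defaults are indeed the suspect, one way to compare platforms is to read back the value a socket actually reports; a minimal sketch, assuming a libzmq 4.x socket named sck (illustrative, not taken from the linked bug):

// Illustrative check of the effective send-buffer option on a socket.
int sndbuf = 0;
size_t optlen = sizeof(sndbuf);
int rc = zmq_getsockopt(sck, ZMQ_SNDBUF, &sndbuf, &optlen);
assert(rc == 0);
printf("ZMQ_SNDBUF = %d\n", sndbuf);   /* 0 or -1 usually means the OS default, depending on the libzmq version */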

@omaralvarez

Same issue here on Debian 8 with 4.0.5: large messages are not sent if the socket is closed quickly. I have double-checked that the LINGER value is -1; it is just being ignored.

@garnier-quentin

I have the same issue on CentOS 6.9 with ZMQ 3.1.19. That's really annoying. I have to add a sleep before the close; otherwise, my other process cannot get the data. It's really simple; I have:

  • one router (the process stays alive)
  • some dealers (each process sends one message (300 KB) and then exits).

I have set the LINGER to 30 seconds, but it seems it doesn't work.

@bluca
Member

bluca commented May 31, 2017

Those versions are quite old; could you please try with 4.2 and see if you still have issues?

@garnier-quentin

I would like to upgrade, but I can't find a maintained Perl binding for zmq4.

@bluca
Member

bluca commented May 31, 2017

I use https://github.com/lestrrat/p5-ZMQ and it's just fine.
The API level is still 3, so it's compatible.

@bluca
Member

bluca commented May 31, 2017

There's also this one but I haven't tried it: https://packages.debian.org/stretch/libzmq-ffi-perl

@garnier-quentin

Thanks. I'll try the binding.

@garnier-quentin

I have tested with zmq 4.2.1 and it's OK.

@bluca
Member

bluca commented May 31, 2017

Ok, great, thanks for confirming

@omaralvarez please let us know if 4.2 solves it for you as well. FYI, you can get Debian 8 packages from our OBS project: http://download.opensuse.org/repositories/network:/messaging:/zeromq:/release-stable/Debian_8.0/

@sigiesec
Member

Since there was no activity in the last year, I assume this is resolved now.
Feel free to reopen if the problem persists.

@thehesiod

probably a dup of this anyways: #1922

@bajaj689

bajaj689 commented Jun 24, 2019

Guys, this might help you. I faced the same issue in version 4.x.x. ZeroMQ has a socket option named ZMQ_SNDHWM, whose default value is 1000. I was testing with a load of around 500-900 publisher requests against a ZeroMQ subscriber daemon. The publisher socket drops messages once the threshold (default 1000) is reached. After I changed the value to 2000 or above, no messages were dropped.
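For reference, raising the send high-water mark as described above would look roughly like the sketch below; the value 2000 mirrors the comment, the socket name sck is illustrative, and the option must be set before zmq_connect()/zmq_bind() to take effect:

// Illustrative: raise the send HWM from the default of 1000 to 2000.
int hwm = 2000;
int rc = zmq_setsockopt(sck, ZMQ_SNDHWM, &hwm, sizeof(hwm));
assert(rc == 0);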
