Assertion failed in signaler again #3756
Comments
Thanks for reporting this. This is a bit more difficult though. See #2360 and #2362. Probably the loop should be put back in without the immediate assertion. However, this doesn't really match your report. You are saying that |
I did some debugging and the following came up:
|
Ah, sorry I missed that |
@vojtech-frodl I hope the new code fixes the regression. I just wonder if you have a test case that reproduces this near-deterministically. Interestingly, I have never seen this happen in the existing tests. While the change may fix the assertion, it might be that there is some underlying performance issue in some use case, which causes the socket to behave like this. It would be great to be able to investigate this further. |
Yes, I can confirm that the regression is fixed now. I have quite a solid repro routine but it's far from a minimal reproducer. I can try to minimize it at the weekend. |
Here's a reproducer.
|
I'm very late to this, but: Before I discovered that this is already fixed in master, I spent a few hours reproducing this issue in 4.3.2. Maybe my work can do some good, if you're still looking for a more reliable reproduction. I have a test case that trips this assertion within 20 seconds or so with 4.3.2 on my system, see https://github.com/barometz/zmq-parallel-nbytes. Built with MSVC 14.2 (VS2019), x64 target. May have to twiddle the tasks/cycles a bit. Update: if necessary, I have a sequence of less horribly parallel tests that fails frequently on my own computer and consistently in CI - I can probably turn that into a better test case for this issue, but it'll take some time. Let me know if that's necessary. |
Issue description
I encountered the same issue as described in #2360, which was fixed some time ago by commit bcf7577. That fix was reverted by commit 4f77cfa with the commit message "Removed unreachable code paths". Out of curiosity, I added the while loop back and observed that the issue disappeared. Can you please confirm that commit 4f77cfa does what was intended and doesn't cause a regression?
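For readers unfamiliar with the pattern, here is a minimal sketch of what "retry the send in a while loop instead of asserting immediately" can look like. It is not the code from commit bcf7577 or from libzmq's signaler: it assumes POSIX sockets rather than Winsock, and the names send_wakeup_byte and demo_roundtrip are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not libzmq code): write a single wake-up byte,
// retrying on transient errors instead of asserting on the first failure.
// Returns true once the byte is written, false on a non-retryable error.
bool send_wakeup_byte(int fd)
{
    const unsigned char dummy = 0;
    while (true) {
        const ssize_t nbytes = ::send(fd, &dummy, sizeof dummy, 0);
        if (nbytes == -1) {
            // Assumption: EINTR and EAGAIN are treated as transient.
            // A real implementation would likely wait or back off
            // rather than spin on EAGAIN.
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return false;
        }
        // Only check the byte count after a successful send.
        assert(static_cast<std::size_t>(nbytes) == sizeof dummy);
        return true;
    }
}

// Small usage demo: send the byte over a local socket pair and read it back.
bool demo_roundtrip()
{
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return false;

    bool ok = send_wakeup_byte(fds[1]);

    unsigned char received = 1;
    ok = ok && ::recv(fds[0], &received, sizeof received, 0) == 1
            && received == 0;

    ::close(fds[0]);
    ::close(fds[1]);
    return ok;
}
```

The point of the sketch is the ordering: the size check runs only after a successful send, so a transient failure leads to a retry rather than a failed assertion. Which error codes are actually safe to retry on Windows is a separate question that this sketch does not answer.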
Environment
Minimal test code / Steps to reproduce the issue
The assertion fails in proxy() with sockets A (router) and B (dealer). Socket B connects to a remote location. Socket A is bound to an inproc address, and multiple threads connect to it and disconnect from it dynamically. All sockets are created in the thread that uses them.
The chances of the crash increase significantly if you first run a loop in a separate thread that creates a new req socket, connects it to A, destroys the req socket, and repeats. Let that thread run for a few seconds, then request some data from the remote location through another req socket connected to A, and the crash is virtually guaranteed.
What's the actual result?
::send() returns SOCKET_ERROR causing the following assertion to fail:
zmq_assert (nbytes == sizeof (dummy));
Callstack: