-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REQ/REP message ordering looses messages #176
Comments
what operating system? is it the same for btw I think there are two things I would do to improve this test..
still curious what system ur reporting from and what happens in that arrangement w/ TCP? |
This happens on all modes of transport. Re 1: The first message reaches the rep server successfully. So it seems to me that sockets are set up. I was expecting to maybe loose the first message due to setup, but once messages get through, no thing should be lost. What happens here (loosing message 2) is a bug. Re 2: My expectation from the exposed interface ('data' event) is that for every req send I'll get a rep 'data' event. Usually these two will not be in one process as is done here. My setup has two processes and the results are similar. Are you saying that I should wait for 'rep' to receive my send? How can that be done in a two process setup? Usually, when using nanomsg in blocking mode, you'd do a send and then a recv. So nanomsg tracks what send you did last and will only give you the response to that request. I guess what happens here is that sends happen in sequence - and because of the async nature of the node/binding code no recvs are issued for the sends. So nanomsg will not even yield answers that might or might not arrive on the network to the binding code - since another request has been issued in the meantime! This also means that if we go and add delays to the test, we're cheating ourselves out of the problem. I see a solution using sendmsg and recvmsg, and passing the control header around. This is the only thing I see that can work in connection with node's thread model. That would mean that the binding code would pass 'buffer' and 'control' into the 'data' event and that the event itself would have to call sendmsg, passing control back in. Otherwise I fail to see how nanomsg can even work in this context. It's not so much a bug of the underlying lib, it's that we don't provide context for proper operation, IMHO. Does the above make sense to you? (System is OS X and Linux) |
thanks for embellishing on a couple points i skipped in your first message, namely:
based on your test output, to me it looks like recv/REP is only getting called twice, because after the first req.send, the rep has not replied yet; any and all subsequent premature calls to req.send are bound to be dropped or ignored. one way to fix that would be to check for a library error code. If one was set by the C library then we could check for it and emit a javascript stream error. So if such an error was set, it might be associated with send() at that point. Another way to detect this type of program error, if none was set by the library, would be to compare the size of the buffer passed in from JavaScript land against the return value of you're suggesting something else:
this sounds interesting because we still haven't exposed the control headers from libnanomsg and it would be a nice thing to expose. Although I am unconvinced your test case establishes a msg ordering problem in REQ/REP, I'm totally open to adding options to req/rep or more likely to other socket types by way of control msg headers, and think that's a great idea! now answering more points from your last reply:
Yes. 100% you don't have a choice here except to do so.
Wait for the msg from REP! Only then, can you call req.send again. Not before hearing back from REP!
fair enough, forget i said anything about delays in the test, timeouts are dirty escape hatches :) The rest of your comments are drawn based on an expectation of behavior that was not written into the REQ/REP protocol as we use it now. We only expose the simple nn_send/nn_recv calls and there is still no support for send_msg and recv_msg! hint hint: please send us a PR exposing those calls to JavaScript :) Besides that ^^ would still be step 1 in your proposal to modify REQ/REP behavior, which is not a bad idea, despite my small disagreement with the expected behavior you want from REQ/REP in it's current form. |
Fair enough; I see your position and I feel you see mine. I'll try to come up with some actual code to support my position, then we can discuss the details. |
I am trying to send multiple messages through req/rep. This works for simple examples where I just send a single message - but as soon as there is more than one req, I only see some of the reps.
I expect this test to succeed:
Typical output looks like this:
In a nutshell, the 'rep' server only sees messages 1 and 3, answers to both, but since 3 has already been read before sending happens, only that last answer is received by the req client.
I am guessing what happens is that
recv
is called for more than one msg at a time, which makes nanomsg expect answers for only the lastrecv
. The only way of fixing this that I see is to change the interface of the 'data' event to include some origin information (nanomsg control headers) and to be able to then pair up answers to control headers through something likerep.answer(control, data)
.The text was updated successfully, but these errors were encountered: