Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Readable readiness notified only once when counterpart sends two messages in a row (on Windows) #365

Closed
blabaere opened this issue Feb 25, 2016 · 10 comments
Labels
bug windows Related to the Windows OS.
Milestone

Comments

@blabaere
Copy link

Hello,

I'm currently trying to write a rust version of nanomsg. While the behavior of mio on Windows required only a small modification, I'm left with one problem in the following TCP client/server scenario.

Client socket (no problems here):

  • the socket sends 11 bytes (8 bytes of prefix to tell the size, and 3 bytes of payload).
  • the socket sends 11 bytes again.

Server socket:

  • the event loop notifies the acceptor is readable, a stream is accepted.
  • the accepted stream is registered with events:all, poll:edge.
  • handshake are exchanged between the server and the client (hopefully not relevant to the problem).
  • the stream is reregistered with events:all, poll:edge.
  • the client sends its two messages.
  • the event loop notifies the stream is readable
  • the stream token is added to the list of readable streams.
  • a few millis later
  • the test tells the event loop via its channel that it should read a message
  • stream::try_recv tells to try again. I understand this is how Windows IOCP is used by mio, so it's OK.
  • the event loop notifies the stream is readable
  • stream::try_recv reads 8 bytes of prefix
  • stream::try_recv reads 3 bytes of payload
  • the test is happy with its received message
  • the stream is reregistered with events:all, poll:edge.
  • the event loop notifies the stream is writable. *This is where the difference with nix is a problem.
  • the stream is removed from the list of readable streams.
  • a few millis later
  • the test tells the event loop via its channel that it should read another message.
  • Since there is nobody in the list of readable streams, an internal timeout is reached, and the test fails.

My uneducated guess is that the two messages end up in one single buffer (OS TCP layer or mio Windows layer), raising one notification. That buffer is then discarded, since it has been presented to the user with the opportunity to read of all it.

In case this description is not clear enough, I'm currently writing a piece of code small enough to illustrate the scenario, I'll let you know when it is complete.

Regards,
Benoît

@rrichardson
Copy link
Contributor

Sounds like you're getting hit by Nagle's algorithm. What happens when you send 1000 bytes from the client?

(My hunch is that the client tcp send is buffering the data until it achives a maximum, or a period of time passes)

@blabaere
Copy link
Author

  • The test passes on Linux.
  • Sending a 4K message instead of 11 bytes does not fix the problem.
  • Using TcpStream::set_nodelay(true) does not fix the problem.

It seems the Nagle's algorithm is not guilty of this one.

@rrichardson
Copy link
Contributor

Great. Thanks for checking. :)

@blabaere
Copy link
Author

I have setup a repo with a unit test that can reproduce the problem, a travis build that works fine and an appveyor build that is stuck, waiting for a readable notification so it can get its 11 bytes .

I guess the logs are showing what happens.

@blabaere blabaere changed the title Readable readiness notified only once when counterpart sends two messages in a row Readable readiness notified only once when counterpart sends two messages in a row (on Windows) Mar 1, 2016
@carllerche
Copy link
Member

Thanks for the report. I'll take a look

@carllerche carllerche added bug windows Related to the Windows OS. labels Mar 1, 2016
@carllerche carllerche added this to the v1.0 milestone Mar 1, 2016
@blabaere
Copy link
Author

blabaere commented Mar 1, 2016

Breaking news !!!

I committed some doc comments in scaproust a couple of hours ago, and for some yet unknown reason, the windows build has passed.
So out of curiosity I forced a build of the project I had created to reproduce the problem and it passed too.
Just to be sure, I modified the dedicated test to send three messages in a row instead of just two.
And it passed, just a few minutes ago.

From the cargo output:
the failing build was using https://github.com/carllerche/mio#abd5dd35
the successful build was using https://github.com/carllerche/mio#0a808942

@carllerche
Copy link
Member

@blabaere Is it possible that the original build was before this commit? 0a80894

I fixed many windows bugs, though I am not 100% sure when the fix was merged... looks like ~4 days ago, so it is possible that I have already fixed the bug :)

@blabaere
Copy link
Author

blabaere commented Mar 1, 2016

The failed build happened 5 days ago, the merge commit for #356 occurred 4 days ago, so I guess the merge fixed many windows bugs, including this one. Thanks a lot !!!

@carllerche
Copy link
Member

Great! I am closing this issue then. Feel free to open new ones for any further bugs discovered. Thanks.

@blabaere
Copy link
Author

Hello,

Any chance this one could make it into v0.6 ?
I've been using the master with this fix for several weeks and no other problem showed up since.
From the opened issues, I've got the feeling that v1.0 will not be ready in the next few days, and from #360 I'm pretty sure the upgrade will require quite some work.

Regards,
Benoît

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug windows Related to the Windows OS.
Projects
None yet
Development

No branches or pull requests

3 participants