Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite Windows backend to use IOCP exclusively #1269

Merged
merged 30 commits into from
Oct 31, 2019

Conversation

njsmith
Copy link
Member

@njsmith njsmith commented Oct 25, 2019

This dramatically simplifies the code (no more janky background
thread!), and probably makes it substantially faster (though I haven't
measured), since we're not using select anymore or passing events
back and forth between threads.

The way we accomplish this is basically to reimplement select
ourselves, using the undocumented IOCTL_AFD_POLL mechanism to talk
to the kernel directly. Other projects that use a similar approach are
libuv, wepoll, and mio. See #52 for (lots) more discussion.

Since this is undocumented, it's hard to be 100% certain we've
understood everything, but let's see what the tests say :-). I also
added some tests that install "layered service providers", which are
this weird system on Windows that lets random libraries monkeypatch
how sockets work; I think we're doing all the stuff we need to to be
compatible with these things, but again, we need to check that
empirically.

This also unblocks the idea of allowing Trio to run "over" a foreign
event loop, as discussed in #399 (comment)

Also:

- Increase test timeout, because for some reason the "PCTools" LSP is
  sometimes incredibly slow to install.
- Factor out common curl options in ci.sh, for more consistency/reliability
- Run cleanup/codecov upload even if tests failed
As of this commit, you can call
trio._core._generated_io_windows.afd_poll and it seems to basically
work.
This is a nice solid implementation *except* that it turns out I had a
fundamental misunderstanding of how AFD_POLL works, so it all falls
apart as soon as you have multiple tasks waiting on the same socket
simultaneously.
@njsmith njsmith requested a review from oremanj October 25, 2019 09:28
@codecov
Copy link

codecov bot commented Oct 25, 2019

Codecov Report

Merging #1269 into master will increase coverage by 0.11%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1269      +/-   ##
==========================================
+ Coverage   99.57%   99.69%   +0.11%     
==========================================
  Files         106      105       -1     
  Lines       12852    12945      +93     
  Branches      987      997      +10     
==========================================
+ Hits        12797    12905     +108     
+ Misses         35       23      -12     
+ Partials       20       17       -3
Impacted Files Coverage Δ
trio/_core/tests/test_windows.py 100% <ø> (ø) ⬆️
trio/_socket.py 99.58% <100%> (ø) ⬆️
trio/_windows_pipes.py 100% <100%> (ø) ⬆️
trio/_core/tests/test_io.py 100% <100%> (ø) ⬆️
trio/_core/_io_windows.py 98.47% <100%> (+7.39%) ⬆️
trio/_core/_windows_cffi.py 100% <100%> (ø) ⬆️

@njsmith njsmith mentioned this pull request Oct 25, 2019
11 tasks
Copy link
Member

@oremanj oremanj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Mostly comments on comments, a few questions about extra checks here and there

trio/_core/_io_windows.py Outdated Show resolved Hide resolved
trio/_core/_io_windows.py Outdated Show resolved Hide resolved
trio/_core/_io_windows.py Show resolved Hide resolved
trio/_core/_io_windows.py Show resolved Hide resolved
trio/_core/_io_windows.py Show resolved Hide resolved
trio/_core/_io_windows.py Show resolved Hide resolved
trio/_core/_io_windows.py Show resolved Hide resolved
trio/_core/_io_windows.py Show resolved Hide resolved
trio/_core/tests/test_io.py Show resolved Hide resolved
trio/_windows_pipes.py Show resolved Hide resolved
@njsmith
Copy link
Member Author

njsmith commented Oct 27, 2019

OK, I think I've addressed all the comments and all the other cleanups I noticed, so this is ready for review again.

Copy link
Member

@oremanj oremanj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two remaining documentation suggestions. Feel free to merge once you've thought about them! 🎉

newsfragments/52.feature.rst Outdated Show resolved Hide resolved
# - number of bytes transferred (named "InternalHigh"); redundant with
# dwNumberOfBytesTransferred *if* this is a regular I/O event.
#
# The general model is that you call some function like ReadFile or WriteFile
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for updating this comment, it's super helpful!

@@ -0,0 +1,173 @@
# A little script to experiment with AFD polling.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for the IOCP-based I/O manager implemented in this PR to run into the bug elicited by this script? If not, maybe comment why not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this bug is why we have all this machinery to avoid ever issuing more than one IOCTL_AFD_POLL operation on the same socket at the same time :-). Updated comments in both afd-lab.py and _io_windows.py.

@njsmith
Copy link
Member Author

njsmith commented Oct 30, 2019

Test failure is codecov flakiness (#1231)

@njsmith njsmith closed this Oct 30, 2019
@njsmith njsmith reopened this Oct 30, 2019
@njsmith
Copy link
Member Author

njsmith commented Oct 31, 2019

There was some discussion of scaling problems with CancelIoEx, starting here: #52 (comment)

It looks like the conclusion for now is that this PR is still an improvement over what we had before, but we probably want a follow-up issue to consider splitting up our AFD_POLL operations across multiple AFD helper handles, to improve scalability even more.

I did add the script that demonstrates this to notes-to-self/ in this PR, though.

@njsmith
Copy link
Member Author

njsmith commented Oct 31, 2019

OK, let's see how this goes! Thanks everyone for your comments and feedback so far!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants