-
-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite Windows backend to use IOCP exclusively #1269
Merged
Merged
Changes from all commits
Commits
Show all changes
30 commits
Select commit
Hold shift + click to select a range
849055b
On Windows, run tests with LSPs installed
njsmith 22b07e5
Improve notify_closing docs
njsmith c8a63de
Checkpoint: basic AFD support working
njsmith f7ac5aa
Checkpoint: a beautiful but doomed approach
njsmith d17431b
Rewritten and working (?) IOCP-only windows backend
njsmith e877b4b
Merge branch 'master' of github.com:python-trio/trio into deselect
njsmith d73b3cf
Remove debug prints
njsmith c8efc39
yapf
njsmith 3fdfbe9
Remove a bit more debug code
njsmith d6d8fac
Move LSP tests up to the top of the azure pipelines order
njsmith 1a12bcd
Don't use enum.IntFlag on python 3.5
njsmith 8c549f3
Re-run gen_exports.py
njsmith 1a8ecbe
remove more debug code
njsmith 45bd65c
remove stale comment
njsmith 86eb6ef
Add test for how notify_closing handles bad input
njsmith 29b9d30
Add some pragma: no cover to errors we think can't happen
njsmith 630910c
Rename Windows backend statistics attributes to match epoll backend
njsmith 2f1519f
Convert epoll statistics test into generic IO statistics test
njsmith c558574
Add notes-to-self/ to document the weird simultaneous-poll bug
njsmith 0320a8b
Make our set of poll flags more complete
njsmith a28dfa0
Better comments
njsmith 9140391
Let's be paranoid and double-check for weird broken network configs
njsmith c604716
Minor cleanups to test_io_manager_statistics
njsmith e4da787
Add explicit test that wait_* error out properly on invalid values
njsmith b2b84d0
Apparently you get LOCAL_CLOSE notifications whether you want them or…
njsmith 2ce3bcc
Rewrite newsfragment to explain the change better
njsmith f9c3b54
Wording tweak
njsmith 7818f58
Update comments to clarify the impact of the AFD_IOCTL_POLL bug
njsmith ef2d637
Add script to check how wait_readable scales with the number of sockets
njsmith 0b9af3b
Tweak newsfragment again
njsmith File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
On Windows, the `IOCP subsystem | ||
<https://docs.microsoft.com/en-us/windows/win32/fileio/i-o-completion-ports>`__ | ||
is generally the best way to implement async I/O operations – but it's | ||
historically been weak at providing ``select``\-style readiness | ||
notifications, like `trio.hazmat.wait_readable` and | ||
`~trio.hazmat.wait_writable`. We aren't willing to give those up, so | ||
previously Trio's Windows backend used a hybrid of ``select`` + IOCP. | ||
This was complex, slow, and had `limited scalability | ||
<https://github.com/python-trio/trio/issues/3>`__. | ||
|
||
Fortunately, we found a way to implement ``wait_*`` with IOCP, so | ||
Trio's Windows backend has been completely rewritten, and now uses | ||
IOCP exclusively. As a user, the only difference you should notice is | ||
that Trio should now be faster on Windows, and can handle many more | ||
sockets. This also simplified the code internally, which should allow | ||
for more improvements in the future. | ||
|
||
However, this is somewhat experimental, so if you use Windows then | ||
please keep an eye out and let us know if you run into any problems! |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,176 @@ | ||
# A little script to experiment with AFD polling. | ||
# | ||
# This cheats and uses a bunch of internal APIs. Don't follow its example. The | ||
# point is just to experiment with random junk that probably won't work, so we | ||
# can figure out what we actually do want to do internally. | ||
|
||
# Currently this demonstrates what seems to be a weird bug in the Windows | ||
# kernel. If you: | ||
# | ||
# 0. Set up a socket so that it's not writable. | ||
# 1. Submit a SEND poll operation. | ||
# 2. Submit a RECEIVE poll operation. | ||
# 3. Send some data through the socket, to trigger the RECEIVE. | ||
# | ||
# ...then the SEND poll operation completes with the RECEIVE flag set. | ||
# | ||
# (This bug is why our Windows backend jumps through hoops to avoid ever | ||
# issuing multiple polls simultaneously for the same socket.) | ||
# | ||
# This script's output on my machine: | ||
# | ||
# -- Iteration start -- | ||
# Starting a poll for <AFDPollFlags.AFD_POLL_SEND: 4> | ||
# Starting a poll for <AFDPollFlags.AFD_POLL_RECEIVE: 1> | ||
# Sending another byte | ||
# Poll for <AFDPollFlags.AFD_POLL_SEND: 4>: got <AFDPollFlags.AFD_POLL_RECEIVE: 1> | ||
# Poll for <AFDPollFlags.AFD_POLL_RECEIVE: 1>: Cancelled() | ||
# -- Iteration start -- | ||
# Starting a poll for <AFDPollFlags.AFD_POLL_SEND: 4> | ||
# Starting a poll for <AFDPollFlags.AFD_POLL_RECEIVE: 1> | ||
# Poll for <AFDPollFlags.AFD_POLL_RECEIVE: 1>: got <AFDPollFlags.AFD_POLL_RECEIVE: 1> Sending another byte | ||
# Poll for <AFDPollFlags.AFD_POLL_SEND: 4>: got <AFDPollFlags.AFD_POLL_RECEIVE: 1> | ||
# | ||
# So what we're seeing is: | ||
# | ||
# On the first iteration, where there's initially no data in the socket, the | ||
# SEND completes with the RECEIVE flag set, and the RECEIVE operation doesn't | ||
# return at all, until we cancel it. | ||
# | ||
# On the second iteration, there's already data sitting in the socket from the | ||
# last loop. This time, the RECEIVE returns immediately with the RECEIVE flag | ||
# set, which makes sense -- when starting a RECEIVE poll, it does an immediate | ||
# check to see if there's data already, and if so it does an early exit. But | ||
# the bizarre thing is, when we then send *another* byte of data, the SEND | ||
# operation wakes up with the RECEIVE flag set. | ||
# | ||
# Why is this bizarre? Let me count the ways: | ||
# | ||
# - The SEND operation should never return RECEIVE. | ||
# | ||
# - If it does insist on returning RECEIVE, it should do it immediately, since | ||
# there is already data to receive. But it doesn't. | ||
# | ||
# - And then when we send data into a socket that already has data in it, that | ||
# shouldn't have any effect at all! But instead it wakes up the SEND. | ||
# | ||
# - Also, the RECEIVE call did an early check for data and exited out | ||
# immediately, without going through the whole "register a callback to | ||
# be notified when data arrives" dance. So even if you do have some bug | ||
# in tracking which operations should be woken on which state transitions, | ||
# there's no reason this operation would even touch that tracking data. Yet, | ||
# if we take out the brief RECEIVE, then the SEND *doesn't* wake up. | ||
# | ||
# - Also, if I move the send() call up above the loop, so that there's already | ||
# data in the socket when we start our first iteration, then you would think | ||
# that would just make the first iteration act like it was the second | ||
# iteration. But it doesn't. Instead it makes all the weird behavior | ||
# disappear entirely. | ||
# | ||
# "What do we know … of the world and the universe about us? Our means of | ||
# receiving impressions are absurdly few, and our notions of surrounding | ||
# objects infinitely narrow. We see things only as we are constructed to see | ||
# them, and can gain no idea of their absolute nature. With five feeble senses | ||
# we pretend to comprehend the boundlessly complex cosmos, yet other beings | ||
# with wider, stronger, or different range of senses might not only see very | ||
# differently the things we see, but might see and study whole worlds of | ||
# matter, energy, and life which lie close at hand yet can never be detected | ||
# with the senses we have." | ||
|
||
import sys | ||
import os.path | ||
sys.path.insert(0, os.path.abspath(os.path.dirname(__file__) + r"\..")) | ||
|
||
import trio | ||
print(trio.__file__) | ||
import trio.testing | ||
import socket | ||
|
||
from trio._core._windows_cffi import ( | ||
ffi, kernel32, AFDPollFlags, IoControlCodes, ErrorCodes | ||
) | ||
from trio._core._io_windows import ( | ||
_get_base_socket, _afd_helper_handle, _check | ||
) | ||
|
||
class AFDLab: | ||
def __init__(self): | ||
self._afd = _afd_helper_handle() | ||
trio.hazmat.register_with_iocp(self._afd) | ||
|
||
async def afd_poll(self, sock, flags, *, exclusive=0): | ||
print(f"Starting a poll for {flags!r}") | ||
lpOverlapped = ffi.new("LPOVERLAPPED") | ||
poll_info = ffi.new("AFD_POLL_INFO *") | ||
poll_info.Timeout = 2**63 - 1 # INT64_MAX | ||
poll_info.NumberOfHandles = 1 | ||
poll_info.Exclusive = exclusive | ||
poll_info.Handles[0].Handle = _get_base_socket(sock) | ||
poll_info.Handles[0].Status = 0 | ||
poll_info.Handles[0].Events = flags | ||
|
||
try: | ||
_check( | ||
kernel32.DeviceIoControl( | ||
self._afd, | ||
IoControlCodes.IOCTL_AFD_POLL, | ||
poll_info, | ||
ffi.sizeof("AFD_POLL_INFO"), | ||
poll_info, | ||
ffi.sizeof("AFD_POLL_INFO"), | ||
ffi.NULL, | ||
lpOverlapped, | ||
) | ||
) | ||
except OSError as exc: | ||
if exc.winerror != ErrorCodes.ERROR_IO_PENDING: # pragma: no cover | ||
raise | ||
|
||
try: | ||
await trio.hazmat.wait_overlapped(self._afd, lpOverlapped) | ||
except: | ||
print(f"Poll for {flags!r}: {sys.exc_info()[1]!r}") | ||
raise | ||
out_flags = AFDPollFlags(poll_info.Handles[0].Events) | ||
print(f"Poll for {flags!r}: got {out_flags!r}") | ||
return out_flags | ||
|
||
|
||
def fill_socket(sock): | ||
try: | ||
while True: | ||
sock.send(b"x" * 65536) | ||
except BlockingIOError: | ||
pass | ||
|
||
|
||
async def main(): | ||
afdlab = AFDLab() | ||
|
||
a, b = socket.socketpair() | ||
a.setblocking(False) | ||
b.setblocking(False) | ||
|
||
fill_socket(a) | ||
|
||
while True: | ||
print("-- Iteration start --") | ||
async with trio.open_nursery() as nursery: | ||
nursery.start_soon( | ||
afdlab.afd_poll, | ||
a, | ||
AFDPollFlags.AFD_POLL_SEND, | ||
) | ||
await trio.sleep(2) | ||
nursery.start_soon( | ||
afdlab.afd_poll, | ||
a, | ||
AFDPollFlags.AFD_POLL_RECEIVE, | ||
) | ||
await trio.sleep(2) | ||
print("Sending another byte") | ||
b.send(b"x") | ||
await trio.sleep(2) | ||
nursery.cancel_scope.cancel() | ||
|
||
trio.run(main) |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for the IOCP-based I/O manager implemented in this PR to run into the bug elicited by this script? If not, maybe comment why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this bug is why we have all this machinery to avoid ever issuing more than one IOCTL_AFD_POLL operation on the same socket at the same time :-). Updated comments in both
afd-lab.py
and_io_windows.py
.