Work around Windows kernel bad asymptotic behavior when there are too many operations outstanding on the same handle #1280
Comments
I just realized also: it's possible that poll operations on different handles don't interact with each other, in which case we might be able to simplify the code a lot by using separate handles for read polls and write polls. Should check this.
...so I just modified my little test script in `notes-to-self/afd-lab.py` to use separate helper handles, and that alone didn't help.

According to @piscisaureus in #52 (comment), this doesn't happen if you also use different completion ports for the two handles (what the heck, Windows, why does that matter, none of this makes any sense). But that's no use to us, because a single event loop has to funnel everything through a single completion port (unless we want to use threads, but that's worse than what we're doing now).
For reference, here's how I modified the script to use separate helper handles (and also double-check I wasn't missing any errors):

modified   notes-to-self/afd-lab.py
@@ -132,7 +132,7 @@ class AFDLab:
             print(f"Poll for {flags!r}: {sys.exc_info()[1]!r}")
             raise
         out_flags = AFDPollFlags(poll_info.Handles[0].Events)
-        print(f"Poll for {flags!r}: got {out_flags!r}")
+        print(f"Poll for {flags!r}: got {out_flags!r} ({lpOverlapped.Internal})")
         return out_flags
@@ -145,7 +145,8 @@ def fill_socket(sock):
 async def main():
-    afdlab = AFDLab()
+    afdlab1 = AFDLab()
+    afdlab2 = AFDLab()
     a, b = socket.socketpair()
     a.setblocking(False)
@@ -157,13 +158,13 @@ async def main():
         print("-- Iteration start --")
         async with trio.open_nursery() as nursery:
             nursery.start_soon(
-                afdlab.afd_poll,
+                afdlab1.afd_poll,
                 a,
                 AFDPollFlags.AFD_POLL_SEND,
             )
             await trio.sleep(2)
             nursery.start_soon(
-                afdlab.afd_poll,
+                afdlab2.afd_poll,
                 a,
                 AFDPollFlags.AFD_POLL_RECEIVE,
             )
I also just double-checked, and confirmed that if you use separate IOCP completion ports, then everything works fine, which matches Bert's experience. Man, I do not understand Windows at all.
Yeah, I have tried that too at some point. Just like yourself, I found that it didn't work.
So as discussed over here, apparently the `CancelIoEx` operation on Windows becomes super-slow when you have a lot of operations outstanding on the same handle. And #1269 makes it so that every `wait_readable`/`wait_writable` call issues an operation on the same handle. So cancelling a bunch of these calls becomes like O(n^3) or something.

And we also issue `CancelIoEx` whenever we transition from 1->2 or 2->1 waiters on the same socket, so this can also be hit during normal usage, without Trio-level cancellation. (I think a typical case would be when you have a socket where one task is blocked for a long time in `wait_readable`, while another task is sending as fast as it can, so it keeps going in and out of `wait_writable`.)
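To make that pattern concrete, here's a toy sketch (not taken from the issue; the helper names, chunk size, and iteration count are arbitrary assumptions) of one task parked in `wait_readable` while another keeps bouncing in and out of `wait_writable` on the same socket:

```python
# Toy illustration of the pattern described above: one task parked in
# wait_readable on socket `a`, while another task sends as fast as it can and
# so repeatedly enters/leaves wait_writable on the same socket. Each time the
# number of waiters on `a` flips between 1 and 2, the backend re-issues its
# poll operation, which (per the description above) involves a CancelIoEx.
import socket

import trio  # trio.lowlevel was called trio.hazmat in older releases


async def reader(sock):
    # Parked here "for a long time" -- nothing ever arrives on this end.
    await trio.lowlevel.wait_readable(sock)


async def drainer(sock):
    # Peer side: keep reading so the writer's send buffer keeps emptying.
    while True:
        await trio.lowlevel.wait_readable(sock)
        try:
            while True:
                sock.recv(65536)
        except BlockingIOError:
            pass


async def writer(sock, iterations=1000):
    for _ in range(iterations):
        try:
            sock.send(b"x" * 65536)
        except BlockingIOError:
            # Send buffer full: briefly become the *second* waiter on `sock`,
            # then drop back to one waiter when wait_writable returns.
            await trio.lowlevel.wait_writable(sock)


async def main():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    async with trio.open_nursery() as nursery:
        nursery.start_soon(reader, a)
        nursery.start_soon(drainer, b)
        await writer(a)
        nursery.cancel_scope.cancel()  # tear down the parked tasks


if __name__ == "__main__":
    trio.run(main)
```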
The overhead isn't really noticeable up to like 1000 sockets or so, but eventually it dominates: at 10k sockets, cancelling a `wait_readable` is like 10x slower, and at 30k sockets, it's like 100x slower, getting into multiple ms per call.

@piscisaureus is the one who pointed this out, and he says that he's found that the cost is based on the number of AFD_POLL operations issued per AFD helper handle. So as a workaround, we could open multiple AFD helper handles, and spread our AFD_POLL operations around over them. This will involve some annoying bookkeeping, but it shouldn't be too bad – basically we'll want to keep a pool of AFD handles, and for each one set a limit on how many total simultaneous operations we'll allow. We'll keep a pool of handles that are below their limit, and pull from there if we can; and if we run out, we'll allocate a new handle.
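As a rough sketch of that bookkeeping (the names `AfdHandlePool` and `MAX_OPS_PER_HANDLE`, the cap value, and the `open_helper_handle` callback are all made up for illustration, not Trio's actual code):

```python
# Hypothetical sketch of the proposed bookkeeping: a pool of AFD helper
# handles, each capped at some number of simultaneous AFD_POLL operations.
# Names and the cap are assumptions, not Trio's actual implementation.
MAX_OPS_PER_HANDLE = 32  # arbitrary; the real limit would need tuning


class AfdHandlePool:
    def __init__(self, open_helper_handle):
        # open_helper_handle() is assumed to return a fresh AFD helper handle
        # that is already associated with the event loop's completion port.
        self._open_helper_handle = open_helper_handle
        self._ops_outstanding = {}   # handle -> number of AFD_POLLs in flight
        self._below_limit = set()    # handles that still have spare capacity

    def acquire(self):
        """Pick the handle on which to issue the next AFD_POLL."""
        if not self._below_limit:
            # Every handle is at its cap (or the pool is empty): open a new one.
            handle = self._open_helper_handle()
            self._ops_outstanding[handle] = 0
            self._below_limit.add(handle)
        handle = next(iter(self._below_limit))
        self._ops_outstanding[handle] += 1
        if self._ops_outstanding[handle] >= MAX_OPS_PER_HANDLE:
            self._below_limit.discard(handle)
        return handle

    def release(self, handle):
        """Call when an AFD_POLL on `handle` completes or is cancelled."""
        self._ops_outstanding[handle] -= 1
        self._below_limit.add(handle)  # now guaranteed to be below the cap
```

Each `wait_readable`/`wait_writable` would then `acquire()` a handle before issuing its AFD_POLL and `release()` it when the poll completes or is cancelled, so no single helper handle ever accumulates more than the cap's worth of outstanding operations.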
There's a script in `notes-to-self/socket-scaling.py` to measure the problem, and that can be used to validate our fix.
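That script isn't reproduced here, but for a rough sense of the kind of measurement involved, here is a hypothetical sketch (explicitly not the actual `notes-to-self/socket-scaling.py`): park a `wait_readable` on N sockets, then time how long cancelling all of them takes.

```python
# Hypothetical measurement sketch -- NOT the actual notes-to-self/
# socket-scaling.py -- showing one way to watch per-cancel cost grow with the
# number of outstanding waits. Needs a high enough file-descriptor / handle
# limit for the larger socket counts.
import socket
import time

import trio


async def measure(num_sockets):
    pairs = [socket.socketpair() for _ in range(num_sockets)]
    for a, _b in pairs:
        a.setblocking(False)

    async with trio.open_nursery() as nursery:
        for a, _b in pairs:
            nursery.start_soon(trio.lowlevel.wait_readable, a)
        await trio.sleep(1)  # give every wait_readable a chance to be issued
        start = time.perf_counter()
        nursery.cancel_scope.cancel()
    elapsed = time.perf_counter() - start
    print(f"{num_sockets:>6} sockets: cancelling took {elapsed:.3f}s "
          f"({elapsed / num_sockets * 1e6:.1f} us per wait)")

    for a, b in pairs:
        a.close()
        b.close()


if __name__ == "__main__":
    for n in (1_000, 10_000, 30_000):
        trio.run(measure, n)
```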