Design: do we need batched accept / accept_nowait? #14
Here's another case where this comes up: draining a listening socket means handling all currently pending connection attempts when you know that no new ones will be arriving. That's a pretty natural fit for accept_nowait. This is pretty obscure, but if you need it, you really need it. And who knows, maybe someday some OS will make it possible to do something similar for AF_INET{,6} listening sockets, which would be really handy.
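For concreteness, here's a minimal sketch of that drain pattern, assuming a hypothetical `accept_nowait()` that returns a queued connection or raises `trio.WouldBlock` when nothing is pending (not an existing Trio API):

```python
import trio

async def drain_listener(listener, handle, nursery):
    # We know no new connection attempts will arrive, so just handle
    # whatever is already queued on the listening socket, then stop.
    while True:
        try:
            conn = listener.accept_nowait()  # hypothetical non-blocking accept
        except trio.WouldBlock:
            break  # queue is empty -- draining is done
        nursery.start_soon(handle, conn)
```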
Ah-hah, I thought of a case where batched accept makes a difference: say we have a mixed workload, involving a CPU-bound "background task" that regularly hogs the CPU for long stretches (yielding only every 20 ms or so), plus light connection-handling tasks that each need only ~0.1 ms of work.

I think this situation is kind of problematic all around... it would be difficult to service a connection in 0.1 ms in Python, and only yielding every 20 ms is not so great, and if the goal is that the heavy background task shouldn't interfere with the light connection tasks, then wouldn't the Right solution be to make a better scheduling algorithm (#32) or just assign a low static priority to the background task? But I'm sure this kind of thing happens, at least at times.

I guess that from this point of view, the point of a batched accept is that normally an accept loop is a very light task (it does very little work between calls to accept), so standard round-robin scheduling will tend to implicitly assign it a very low priority (each of its individual time slices ends up being very small compared to those of other tasks, and round-robin says they all get the same number of time slices). So it's a heuristic way of compensating for a particular failure mode of the not-very-clever round-robin scheduler.
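To illustrate the heuristic, here's a rough sketch of a batched accept loop, again using a hypothetical `accept_nowait()`: the accept task blocks for the first connection, then drains whatever else is already queued before yielding back to the scheduler, so each of its time slices does a full batch of work instead of a single accept.

```python
import trio

async def serve(listener, handle):
    async with trio.open_nursery() as nursery:
        while True:
            conn = await listener.accept()  # block until at least one connection
            nursery.start_soon(handle, conn)
            while True:
                try:
                    conn = listener.accept_nowait()  # hypothetical: drain the backlog
                except trio.WouldBlock:
                    break
                nursery.start_soon(handle, conn)
```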
It looks like #636 will solve the API parts of this issue, by moving the
Something I just realized: at some point we may make
Note that in the year 2022 we can punt batched accept up into the kernel level with io_uring and `IORING_ACCEPT_MULTISHOT`, which continuously pushes accepted connections onto the completion queue without needing to do `accept()`.
https://bugs.python.org/issue27906 suggests that it's important to be able to accept multiple connections in a single event loop tick.
It's trivial to implement, too -- just `accept_nowait`.
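Something along these lines, say -- a sketch of how an `accept_nowait()` could be built on a plain non-blocking stdlib socket (the class and method names here are illustrative, not an existing Trio API):

```python
import socket
import trio

class Listener:
    def __init__(self, sock: socket.socket):
        sock.setblocking(False)  # so accept() never blocks the event loop
        self._sock = sock

    def accept_nowait(self):
        try:
            return self._sock.accept()  # (conn, address) if one is queued
        except BlockingIOError:
            raise trio.WouldBlock from None  # nothing pending right now
```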
But... I am... not currently able to understand how/why this can help. Consider a simple model where, after accepting each connection, we do a fixed bolus of CPU-bound work taking T seconds and then immediately send the response, so each connection is handled in a single loop step. If we accept only 1 connection per loop step, then each loop step takes `1 * T` seconds and we handle 1 connection per T seconds on average. If we accept 100 connections per loop step, then each loop step takes `100 * T` seconds and we still handle 1 connection per T seconds on average.

Did the folks in the bug report above really just need an increased backlog parameter to absorb bursts? Batching `accept()` certainly increases the effective listen queue depth (basically making it unlimited), but "make your queues unbounded" is not a generically helpful strategy.

The above analysis is simplified in that (a) it ignores other work going on in the system and (b) it assumes each connection triggers a fixed amount of synchronous work. If the conclusion is wrong, it's probably because one of these factors matters somehow. The "other work" part obviously could matter if that work is throttlable at the event loop level, in the sense that when loop steps take longer it actually gets less done. Whether that holds here is not clear (the whole idea of batching accept is to make it not have this property, so if this is how we're designing all our components then it doesn't work...).
I guess one obvious piece of "other work" that scales with the number of passes through the loop is just, loop overhead. One would hope this is not too high, but it is not nothing.
If we do want this, then #13 will want to use it.
The same question probably applies to `recvfrom`/`sendto` -- right now a task can only send or receive 1 UDP packet per event loop tick.
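As a sketch of what the UDP analogue might look like, assuming a hypothetical `recvfrom_nowait()` mirroring `accept_nowait()` (not a real API), a task could pull every already-queued datagram in one tick:

```python
import trio

async def recv_batch(udp_sock, max_packets=100, bufsize=65536):
    # Block for the first datagram, then opportunistically drain the rest.
    packets = [await udp_sock.recvfrom(bufsize)]
    while len(packets) < max_packets:
        try:
            packets.append(udp_sock.recvfrom_nowait(bufsize))  # hypothetical
        except trio.WouldBlock:
            break
    return packets
```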