Steal anyio's create_tcp_server #982

Closed
smurfix opened this issue Mar 20, 2019 · 3 comments

smurfix commented Mar 20, 2019

anyio's TCP server wrapper is used like this:

    async with await anyio.create_tcp_server(**args) as server:
        async for client in server.accept_connections():
            await do_whatever_with(client)

IMHO this is a more trio-ish interface than trio.serve_tcp: instead of passing what amounts to a callback to something that never returns, you have a clear flow of control and can start a new task (or not) on your own, which is a good idea when you need to limit the server's connection rate.

anyio's implementation isn't as nice as we'd like because it only uses one socket instead of binding to every address the parameters tell it to, but that should be fixable.

njsmith commented Mar 21, 2019

That's basically trio's Listener API. Currently, serve_tcp is just a shorthand for doing open_tcp_listeners (which is equivalent to anyio's create_tcp_server), and then serve_listeners, which has the accept loop logic. If you really want to write an accept loop by hand, you can certainly do that.

That said, I think this design is actually a mistake, and we should refactor the Listener interface so that it does encapsulate the accept loop. (See #636.) So even though your critique doesn't really apply to trio currently, it will apply if we make that change, so we should think it through :-).

> anyio's implementation isn't as nice as we'd like because it only uses one socket instead of binding to every address the parameters tell it to, but that should be fixable.

This is harder to fix than you might think! It's non-trivial to define the operation "accept a connection from exactly one of several listening sockets, whichever is available first". It requires an internal buffer holding connections that have been accepted but not yet passed to the user, or making accept an atomic operation that's selectable in the Concurrent ML sense, or something. (Note that if you're using IOCP accept on Windows then the latter option is impossible.)

This is also why trio has this awkward convention of passing around lists-of-Listeners. With trio's current Listener interface, it's not really possible to compose multiple Listener objects together into a single multi-Listener.

The other reasons why I'm not too excited about telling people to write their own accept loops are:

  • Doing it by hand every time is boring. Maybe it has pedagogical value the first time, but if we don't provide a standard implementation then everyone's going to end up with their own versions in utility libraries. AFAICT it's exactly the same code every time.

  • Writing an accept loop properly is trickier than you might think. For example, serve_tcp wraps the handler in a wrapper function that ensures aclose_forcefully is called on the connection after the handler exits, so you can't accidentally leak file descriptors.

  • There are interesting features that can't be implemented in trio Listeners currently, but could be if we changed the interface to encapsulate the accept loop. See #242 ("Discussion: what's the best strategy for 'nowait' and 'select' operations?") for details.

I guess the counter-argument would be if you need to customize the actual accept loop somehow, e.g.

> you need to limit the server's connection rate

My impression is that in practice, if you want to limit incoming connections then it's better to do it (a) based on some kind of global measure of server load, not just what one socket's accept loop can see, and (b) by accepting connections and then immediately closing them (because otherwise the clients just keep hammering you trying to get in, and the connections you do accept are always stale). But, I haven't actually implemented this or even thought about it that much; I definitely don't know what the code would look like :-). Is this something you do? Can you say more about how you do it?

smurfix commented Mar 21, 2019

If somebody floods the current accept loop, which unconditionally starts each connection in a separate task, I'd need a special handler_nursery that limits the rate of task creation. That's a lot more complicated than modifying an async for loop to sleep for 0.05 seconds. Also, when I do get flooded I might want to start limiting connections to one-per-peer-IP, or (if it gets even worse) to local-networks-only, without incurring the overhead of creating a new task and doing task scheduling for them all.

Another point to consider is that traditional sync code basically has no choice but to do a while-accept-fork loop. The fact that opening a server socket on Trio suddenly gives you a list of listenable sockets instead of just one throws a wrench into the "sprinkle async and await onto your code liberally" method of adapting something to work with Trio.

Maybe the handler solution ends up being fine, the aclose_forcefully thing is a good point, and we just need to add a hook that controls whether to further process the just-accepted socket before starting a new task with the handler in it. I could probably live with that.

njsmith commented Mar 27, 2019

> Another point to consider is that traditional sync code basically has no choice but to do a while-accept-fork loop. The fact that opening a server socket on Trio suddenly gives you a list of listenable sockets instead of just one throws a wrench into the "sprinkle async and await onto your code liberally" method of adapting something to work with Trio.

trio.socket isn't going anywhere. If you want to take advantage of the higher-level abstractions, then you'll have to adjust your code some. Obviously we shouldn't do this gratuitously, but it's really unavoidable – if a higher-level abstraction doesn't change how you write your code then it's not actually a higher-level abstraction :-).

> If somebody floods the current accept loop, which unconditionally starts each connection in a separate task, I'd need a special handler_nursery that limits the rate of task creation. That's a lot more complicated than modifying an async for loop to sleep for 0.05 seconds. Also, when I do get flooded I might want to start limiting connections to one-per-peer-IP, or (if it gets even worse) to local-networks-only, without incurring the overhead of creating a new task and doing task scheduling for them all.

I have trouble figuring out how seriously Trio should take this. DoS is a real thing and yeah, DoS mitigation is all about rejecting as early and as cheaply as possible. But like... if you have a non-trivial need for DoS mitigation, then you need to go way way way beyond worrying about Trio task spawning overhead.

Anyway, if it turns out to be important, then we can certainly discuss how to add some kind of rate limiting logic to our TCP listener. And if someone is in desperate need and can't wait for a Trio release, then as a temporary measure they can always copy/paste our TCP listener implementation and hack it – it doesn't have any special integration with Trio's internals. And IMO doing rate limiting in a way that's actually useful is probably going to be a pretty complicated topic, so we shouldn't try to guess at how to do it until we have some real-world experience to look at.

So I think this can be closed?

@oremanj oremanj closed this as completed May 2, 2019