Client/server networking abstractions #10308
Thanks! @davidfowl and I discussed this. We think if we go the abort route, we'll probably need something similar to |
Great stuff!
Perhaps it makes sense to add a CancellationToken to these methods.
Maybe this can be moved to the server implementation. It simplifies the IConnectionListener implementation because it no longer needs to track the ConnectionContexts. |
Sounds promising (the goal; I haven't studied the specific API in depth), and it'll be a pleasure to be able to mark large chunks of Pipelines.Sockets.Unofficial as |
AsyncSocketPower .NET ? |
TL;DR, this design basically introduces a transport abstraction and makes it a bit more decoupled so it can be used standalone (without Kestrel or the pipeline). Kestrel and SignalR become consumers of this new API (like a pluggable TcpListener and TcpClient). I see a couple of options around naming:
Depends on how bad people feel about the current naming (I know how important it is), but the big picture here is that we want to create an ecosystem around these things: middleware, servers, parsers, etc. I don't think the abstractions belong in corefx because this is a tiny layer above that with more opinions. If I had to draw a comparison, I would say this is like ASP.NET Core but for the communication layer below, not tied to a specific transport. cc @DamianEdwards if he has any ideas |
maybe it is neither aspnet nor corefx? corefx <=== (some comms naming thing) <=== aspnet but: I appreciate that this may be impossible at this point. It is just... a bit messy. But coupling too tightly to aspnet also has impact re transitive dependency graph explosion, etc. |
Updated the proposal based on the prototype:
Either way, this suggests that … Currently … EDIT: Thinking from the Orleans perspective, we would need some internal tracking for both incoming & outgoing connections anyway, for when a node is evicted (declared dead) and we must forcibly terminate any connections with it. |
It's possible. It is probably simpler to keep them around, and then use them again when needed. |
I do want the LibuvThreads to get cleaned up before IConnectionListener.DisposeAsync/StopAsync completes. I think we can do this through a combination of reference counting the active connections and closing the listen socket. |
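For illustration, a minimal sketch of the kind of reference counting described above; this is a sketch with invented class and member names, not the actual Libuv transport code:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Illustration only: count active connections so listener teardown can wait
// for zero before releasing transport resources (Libuv threads, memory pool).
internal sealed class ConnectionCounter
{
    private readonly TaskCompletionSource<object> _drained =
        new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
    private int _count = 1; // the listen socket itself holds one reference

    public void OnConnectionAccepted() => Interlocked.Increment(ref _count);

    public void OnConnectionClosed() => Release();

    // Called once the listen socket is closed; completes when no connections remain.
    public Task OnListenSocketClosedAsync()
    {
        Release();
        return _drained.Task;
    }

    private void Release()
    {
        if (Interlocked.Decrement(ref _count) == 0)
        {
            _drained.TrySetResult(null);
        }
    }
}
```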
I played with shutdown for a little and it seems like the simplest thing to do would be to implement graceful shutdown in the transports.

```csharp
public interface IConnectionListener : IAsyncDisposable
{
    EndPoint EndPoint { get; }
    ValueTask<ConnectionContext> AcceptAsync();
    ValueTask StopAsync(CancellationToken cancellationToken = default);
}
```

The token is the signal for how long StopAsync waits before tearing all of the connections and state down. This makes it easy to do graceful shutdown without a server like Kestrel in the middle, but you need to call StopAsync with a token to do graceful shutdown with a specified timeout. DisposeAsync could pick a default timeout as well (based on transport specifics). |
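For illustration, a rough sketch of how a caller might drive this shape. It codes against the interface in the comment above; the convention that AcceptAsync returns null once the listener has stopped, and the 30-second budget, are assumptions of this sketch, not part of the proposal:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ListenerShutdownExample
{
    // Accept until the listener stops handing out connections. The null-return
    // convention is an assumption for this sketch.
    public static async Task AcceptLoopAsync(IConnectionListener listener, Func<ConnectionContext, Task> onConnection)
    {
        while (true)
        {
            ConnectionContext connection = await listener.AcceptAsync();
            if (connection == null)
            {
                break; // listener has been stopped
            }

            _ = onConnection(connection); // the transport tracks the connection for shutdown
        }
    }

    public static async Task ShutdownAsync(IConnectionListener listener)
    {
        // Give the transport up to 30 seconds to drain gracefully; once the
        // token fires, StopAsync tears down the remaining connections and state.
        using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30)))
        {
            await listener.StopAsync(cts.Token);
        }

        await listener.DisposeAsync();
    }
}
```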
An alternative design is to pass a CancellationToken to AcceptAsync; this would be the signal to stop accepting new connections, and graceful shutdown would be up to the caller. I think we should try to push that logic into the transports since it's so common. |
Multiple listeners should be able to use the same LibuvThreads.
The complexity is the same whether it is implemented in the transport or the server: track the connections and dispose them.
My feeling is this is making the transport more complex. It needs to track connections per listener. If this is in the server, it can be common for different transports. |
There’ll be an order of magnitude fewer transport implementers than consumers of the API. Ideally it should be possible to use the transport APIs directly and still implement graceful shutdown somewhat trivially. That means gracefully waiting for connections to close (for some threshold), then aborting them if that fails. Do we want that to be in everyone’s logic instead of providing helpers that transports can use to implement it? |
Wanting to provide this as a re-usable piece of logic doesn't mean it has to be part of the listener. It seems like better separation of responsibility: the listener stays close to a listen socket, while tracking and disposing connections is the responsibility of the component calling Accept. There is no functional reason the listener needs to take this responsibility. |
While that's completely reasonable, it makes the listener extremely hard to use standalone. In order to shut down properly, the owner of the accept loop has to know when all of the accepted connections have completed before calling DisposeAsync on the listener. Remember, our transports currently manage their own memory, and it would be nice if the caller didn't have to do that to avoid ugly unexpected errors (or worse, use-after-free bugs!). I'd like this to be possible: … The alternative is to expose a connection manager API that lives somewhere and needs to be used in conjunction with a listener in order to use it properly. |
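As an illustration of the kind of standalone usage being described (a sketch under assumptions, not the original snippet): it assumes the listener owns connection tracking and the memory pool, that ConnectionContext supports DisposeAsync as proposed, and that the types live in the namespace noted in the comment:

```csharp
using System.Net;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Connections; // assumed home of the abstractions

public static class StandaloneListenerExample
{
    public static async Task RunAsync(IConnectionListenerFactory factory)
    {
        var listener = await factory.BindAsync(new IPEndPoint(IPAddress.Loopback, 8080));

        for (int i = 0; i < 100; i++)        // accept a bounded number of connections for the sketch
        {
            var connection = await listener.AcceptAsync();
            _ = EchoAsync(connection);        // fire and forget; no caller-side tracking
        }

        // The caller can dispose without knowing whether handlers are still running,
        // because (in this model) the listener waits for, or aborts, its accepted
        // connections before releasing the memory pool.
        await listener.DisposeAsync();
    }

    private static async Task EchoAsync(ConnectionContext connection)
    {
        await using (connection)              // DisposeAsync on ConnectionContext, per the proposal
        {
            while (true)
            {
                var result = await connection.Transport.Input.ReadAsync();
                var buffer = result.Buffer;

                foreach (var segment in buffer)
                {
                    await connection.Transport.Output.WriteAsync(segment);
                }

                connection.Transport.Input.AdvanceTo(buffer.End);

                if (result.IsCompleted)
                {
                    break;
                }
            }
        }
    }
}
```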
In systems containing both clients and servers (or nodes which are both, eg Orleans silos), there's going to be some kind of connection tracking anyway. Eg, Orleans actively closes connections to nodes which are declared dead (voted out by the cluster because they are not functioning). In that case the listener continues operating but some subset of connections are terminated. Clients and servers both do this, so both need some form of connection tracking. The differences between incoming & outgoing connections largely disappear once the connection is established, so they should be treated similarly (if not identically.) Generally it's better for the process to terminate abruptly, ungracefully killing open connections, rather than to hang indefinitely because some connection remains open because the developer messed up their connection tracking. This is one of the reasons that I prefer the listener does not block until connections all close. Maybe this potential pitfall is minor in comparison to the potential simplicity gained by having additional connection tracking in the listener's contract (and presumably the connection factory's contract?) |
There should be a convenient way to close the listener in a way that tells it to stop accepting new connections and at least tries to wait for all the already-accepted connections to be closed gracefully (perhaps with some timeout). This wouldn't only mean waiting for the sockets to close; it would also mean waiting for DisposeAsync to be called on all the accepted ConnectionContexts as a means to ensure the app is done with all the pipes and therefore also the memory pool. Otherwise, like @davidfowl mentions, we're opening app developers up to potentially very subtle use-after-free bugs.

I think it would also be nice to have a way to tell the listener to immediately abort all of its open connections/sockets. I think even this API should attempt to wait a small period of time for all the app code to react to the aborted connections and dispose all the accepted ConnectionContexts, once again in order to try to avoid use-after-free issues. This API should throw if the timeout is exceeded to indicate that the memory pool isn't safe to use.

Technically, if the app developer tracks all the ConnectionContexts themselves, they could individually abort each connection. So this second API isn't strictly necessary, but it is nice to have. |
Do you want to make the lifetime of the memory pool match the lifetime of the listener? |
Yes.
Not sure what you mean here. Listeners don't have outbound connections. You might be referring to the PR where I currently implement the … In theory, you can ref count connections to avoid disposing the memory pool too early (but then why not implement the entire connection tracking and simplify the logic?). I think we want to be in a place where this logic doesn't need to be done by consumers of the API, since that makes it really hard to use. |
IConnectionFactory connections are not tied to a listener lifetime.
Another possibility is that the memory pool stays around (and maybe reduces size based on memory pressure). |
See what I posted: the listener and IConnectionFactory are decoupled. They happened to be implemented in the same PR, but it's a draft so that's completely unrelated. Pretend they are different objects; the IConnectionFactory could
Yeah, but that's complicated and we don't need to complicate things here. Reference counting would be a much easier solution to implement (track connection ConnectAsync/DisposeAsync). Though, we didn't make the |
After considering this more, I do not think making the listener's dispose method wait for all its accepted connections to be disposed first is really necessary to avoid use-after-free bugs. After all, whoever rents memory from the pool has to explicitly return or dereference the memory for the memory to go back to the pool. However, I do think this would help developers avoid ODEs that could arise from disposing the pool immediately after disposing the listener. If developers see that the listener's dispose isn't completing, that should be a sign that they haven't cleaned everything up.

Having the listener's dispose throw an exception after a certain timeout could make it even clearer to developers what's going wrong, but I'm concerned that such a timeout wouldn't work well if you're trying to implement IServer.StopAsync using an IConnectionListener. The problem is that the CancellationToken passed into StopAsync can stay uncanceled for arbitrarily long periods of time, and until that token is canceled we don't want to start "ungraceful" shutdown and dispose accepted connections. Having IConnectionListener.DisposeAsync() throw before the token is canceled wouldn't be good.

Ultimately, I think we might need two different methods for gracefully vs. ungracefully closing the listener. DisposeAsync() could be the ungraceful method, and we could add a new graceful method we'll call StopAcceptingNewConnections() for now. Developers who want to support graceful shutdown could call StopAcceptingNewConnections() and wait until they close all of their already-accepted connections before calling DisposeAsync().

Why do we want a StopAcceptingNewConnections() method when you can simply stop calling AcceptAsync()? The main reason is that we want to close the listen socket ASAP so other servers can start listening on the endpoint and the listen backlog doesn't fill up with connections the server will never accept. Alternatively, instead of adding a StopAcceptingNewConnections() method, we might choose to add a StopAsync(CancellationToken) method like the one on IServer.

Shrinking the memory pool is an idea we've been considering for years now, but it's tough to come up with a shrinking strategy that's ideal for all use cases. At a certain point, you might as well use unpooled arrays and let the GC shrink the heap for you. |
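To make the two-phase idea concrete, a rough sketch of how an IServer.StopAsync-style method could sit on top of such a split. StopAcceptingNewConnectionsAsync is the placeholder name from the comment above (made awaitable here), and the connection-tracking dictionary, the Abort call, and the overall shape are assumptions of this sketch rather than a shipped API:

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class ConnectionDispatcher
{
    // The accept loop (not shown) adds an entry per accepted connection and
    // removes it when the per-connection task completes.
    private readonly ConcurrentDictionary<ConnectionContext, Task> _connections =
        new ConcurrentDictionary<ConnectionContext, Task>();
    private readonly IConnectionListener _listener;

    public ConnectionDispatcher(IConnectionListener listener) => _listener = listener;

    public async Task StopAsync(CancellationToken cancellationToken)
    {
        // 1. Close the listen socket ASAP so the endpoint frees up and the
        //    backlog stops filling with connections we'll never accept.
        await _listener.StopAcceptingNewConnectionsAsync();

        // 2. Graceful phase: wait until the app has finished with every accepted
        //    connection, or until the host signals that time is up.
        var allDone = Task.WhenAll(_connections.Values);
        var winner = await Task.WhenAny(allDone, Task.Delay(Timeout.Infinite, cancellationToken));

        // 3. Ungraceful phase: abort whatever is still open.
        if (winner != allDone)
        {
            foreach (var connection in _connections.Keys)
            {
                connection.Abort();
            }
        }

        // 4. Only now is it reasonably safe to tear down the listener (and its memory pool).
        await _listener.DisposeAsync();
    }
}
```

The graceful phase keeps the listen socket closed while in-flight connections finish, which also matches the alternative of adding a StopAsync(CancellationToken) like the one on IServer.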
Thank you for clarifying. The main thing I realized now is how much the example is influenced by the scoping of resources (memory pool/Libuv threads) to the IConnectionListener. |
It would require reference counting. We could look at making the factory disposable as well in order to give more flexibility to implementations.
There’s no feature to do that, with or without listeners. |
There is a feature request: https://github.com/dotnet/corefx/issues/29022 😄 |
Is Multicast being considered in the design for these abstractions? If not, is there a plan for improving multicast support? |
No, this is purely for point-to-point, connection-oriented protocols. Nothing is being done to address connectionless scenarios yet. |
…#10321) This is a massive set of changes to Kestrel to remove the existing pubternal transport layer and implement a public facing API for listeners and clients; see the details here: #10308. This change only has the server side pieces of the story as I don't want to add the client APIs without having ported SignalR to use them. Here are the highlights:
- Transport.Abstractions is empty (will be removed in a separate PR as it requires removing it from a ton of places)
- TransportConnection has been moved to Connection.Abstractions (we can decide if we need to consolidate with DefaultConnectionContext in a later PR)
- Added FileHandleEndPoint which allows binding to a file handle (could be a pipe or tcp handle)
- ListenOptions has been gutted of most pubternal API and returns various types of binding information. The source of truth is the EndPoint instance.
- Cleaned up a bunch of libuv tests, decoupling them from Kestrel.Core

## Breaking changes
- Removing pubternal API is itself a breaking change, but one that we already planned to do.
- We've removed the ability to set the scheduling mode on Kestrel
- DisposeAsync was added to ConnectionContext
- NoDelay was removed from ListenOptions. This has been moved to each of the transports. One major difference though is that it's no longer localized per endpoint but is global. We'd need a derived EndPoint type (or maybe extend IPEndPoint) to store both the socket options and the binding information.
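As a sketch of the derived EndPoint idea mentioned at the end of that list (the type and property names are hypothetical, for illustration only):

```csharp
using System.Net;

// Hypothetical: a derived EndPoint that carries per-endpoint socket options
// alongside the binding information, so NoDelay could stay per-endpoint.
public class SocketOptionsEndPoint : IPEndPoint
{
    public SocketOptionsEndPoint(IPAddress address, int port)
        : base(address, port)
    {
    }

    // Socket options the transport can read at bind time.
    public bool NoDelay { get; set; } = true;
}
```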
Keeping this open because only the server side has been implemented. |
How should we handle the client side implementations? SignalR & Orleans are both good candidates. I will update Orleans to use the latest version of the server bits & make similar changes to the client bits. A common-sounding namespace may help adoption (eg Microsoft.Extensions.Connections or System.Net) |
I planned to do SignalR next. You can totally do Orleans with the current APIs.
That would be good!
Agreed but that would be a mega breaking change. Not impossible but we need to discuss it as it's very late in the cycle. |
Pretty sure I've brought that up every cycle :) But: if you need all-hands, let me know and I can play with things like SE-Redis (which has mostly client, but also a toy server). |
The naming is a super hard break, we can’t just do the client, the server needs to be broken as well. This means we need to break IFeatureCollection as well (or just deal with that dependency?) |
@davidfowl Is the server work for this done? Is something separate tracking the SignalR work? If so, I'd suggest closing this since there's no more actionable work for servers here. |
Yes the server work is done. Can we instead turn this into an EPIC issue and link to other bugs? |
I guess? Seems overkill. |
I'll create a new epic. The GitHub events will look weird if I edit @ReubenBond 's text to start linking in other issues. |
Fine to edit my issue as you see fit - either way 🙂 |
Closing this as it's tracking the server work. The only remaining work here might be a rename. |
Very late feedback:
Epic #10869
The goal of this issue is to specify abstractions which the .NET community can use to build robust, correct, high-performance client & server applications. This is a follow-up to #4772 (Project Bedrock) and is largely a summary of a discussion with @davidfowl, @anurse, @halter73, @shirhatti, & @jkotalik.
Initiating & accepting connections
Clients initiate connections to an endpoint using `IConnectionFactory`. Servers bind to an endpoint using `IConnectionListenerFactory` and then accept incoming connections using the `IConnectionListener` returned at binding time.

A framework might want to support multiple different transports, such as file + socket. In that case, it can use multiple `IConnectionListenerFactory` implementations.

The shared abstractions should not dictate how these factories are specified or configured: frameworks should continue to follow a framework-specific builder pattern. Eg, `KestrelServerOptions` has `List<ListenOptions>` where each entry is a different endpoint to listen on + middleware to handle connections to that endpoint.

Both `IConnectionListenerFactory.BindAsync` & `IConnectionFactory.ConnectAsync` accept a `System.Net.EndPoint` to bind/connect to. The BCL has `EndPoint` implementations for `IPEndPoint`, `DnsEndPoint`, and `UnixDomainSocketEndPoint`. Users can add new transports which accept custom sub-classes of `EndPoint`, eg `MyOverlayNetworkEndPoint`.

Servers bind to endpoints using `IConnectionListenerFactory`, resulting in an `IConnectionListener`. The listener's `AcceptAsync()` is then called in a loop, each time resulting in a `ConnectionContext` instance which represents a newly accepted connection. When the server wants to stop accepting new connections, the listener is disposed by calling `listener.DisposeAsync()`. Connections which were previously accepted continue to operate until each is individually terminated.

For implementations of `IConnectionListener` which are backed by a callback system (eg, libuv), decoupling the lifetime of `IConnectionListener` and the (many) resulting `ConnectionContext`s poses a challenge. We need to consider whether or not this minimal interface allows graceful shutdown (including draining connections) to still be cleanly implemented in those cases.

Handling connections
The client/server decides how it handles the `ConnectionContext`. Typically it will be decorated (eg, via `ConnectionDelegate`) and pumped until the connection is terminated. Connections can be terminated by calling `DisposeAsync()` (from `IAsyncDisposable`) but can also be terminated out-of-band (eg, because the remote endpoint terminated it). In either case, clients/servers can subscribe to the `ConnectionContext`'s `IConnectionLifetimeFeature` feature to learn about termination and take necessary action (eg, cleanup).

Connection metadata
Currently, `IHttpConnectionFeature` is used to expose local/remote endpoint information. We will add a new feature, `IConnectionEndPointFeature`, as a uniform way to access this information.
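To tie the pieces above together, a minimal end-to-end sketch of the proposed usage; the namespace, the port numbers, and the convention that `AcceptAsync` returns null after the listener is disposed are assumptions of this sketch rather than part of the proposal:

```csharp
using System.Net;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Connections; // assumed home of the abstractions

public static class ConnectionAbstractionsExample
{
    // Server side: bind once, then accept connections in a loop.
    public static async Task RunServerAsync(IConnectionListenerFactory listenerFactory)
    {
        await using IConnectionListener listener =
            await listenerFactory.BindAsync(new IPEndPoint(IPAddress.Loopback, 5000));

        while (true)
        {
            ConnectionContext connection = await listener.AcceptAsync();
            if (connection == null)
            {
                break; // convention assumed here: null once the listener is disposed
            }

            _ = HandleAsync(connection);
        }
    }

    // Client side: connect to an endpoint via a connection factory.
    public static async Task<ConnectionContext> ConnectAsync(IConnectionFactory connectionFactory)
    {
        return await connectionFactory.ConnectAsync(new DnsEndPoint("localhost", 5000));
    }

    private static Task HandleAsync(ConnectionContext connection)
    {
        // Decorate/pump the connection here (eg, via a ConnectionDelegate pipeline),
        // then dispose it when the app is done with it.
        return Task.CompletedTask;
    }
}
```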