
Rate Limit for Kestrel - Design mechanism to apply back pressure to accepting connections #13295

Open
mconnew opened this issue Aug 20, 2019 · 22 comments
Labels
affected-very-few This issue impacts very few customers area-middleware Includes: URL rewrite, redirect, response cache/compression, session, and other general middlesware area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions 🥌 Bedrock blocked The work on this issue is blocked due to some dependency enhancement This issue represents an ask for new feature or an enhancement to an existing one feature-rate-limit Work related to use of rate limit primitives feature-yarp This issue is related to work on yarp Needs: Design This issue requires design work before implementating. Priority:1 Work that is critical for the release, but we could probably ship without severity-nice-to-have This label is used by an internal tool
Comments

@mconnew
Member

mconnew commented Aug 20, 2019

Is your feature request related to a problem? Please describe.

To match functionality with WCF on .NET Framework, CoreWCF needs to be able to control the number of pending accepts. When a client connects to WCF using the NetTcp transport, there is a handshake that needs to be completed which includes authentication and security upgrade negotiation. After a connection has been established and while the handshake is ongoing, the connection is considered a pending connection. WCF caps the number of connections in the pending state by never allowing more pending accepts than the number of additional pending connections still permitted. For example, if the maximum number of pending connections is configured to be 100 and there are currently 95 pending connections, we limit the number of pending accepts to no more than 5.

Describe the solution you'd like

I believe there should be a generic throttling interface that can be added by DI for use at any layer within an app, and a derived interface (which adds no extra methods) which is used at the transport layer. The two interfaces would be defined as:

public interface IThrottle
{
    ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default(CancellationToken));
    void Release();
}

public interface ITransportThrottle : IThrottle { }

When calling AcquireAsync, one of four things can happen.

  1. The returned ValueTask is completed with a value of true, meaning the throttle was acquired.
  2. The returned ValueTask is not completed and the call goes async. When a throttle can be acquired, the ValueTask is completed with a value of true, meaning the throttle was acquired.
  3. The returned ValueTask is completed with a value of false, meaning the throttle cannot be acquired immediately. This would be used by a throttle implementation that wants to fail immediately, such as returning a 503 throttled HTTP response.
  4. The returned ValueTask is not completed and the call goes async. If the cancellation token is cancelled, the ValueTask is completed with a value of false, meaning the throttle was not acquired. This would be used for scenarios where waiting for a throttle can time out, or when shutting down the application and you need to complete the pending AcquireAsync call for cleanup.
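To make the four outcomes concrete, here is a minimal sketch of an IThrottle implementation backed by SemaphoreSlim. The class name and the SemaphoreSlim-based approach are illustrative assumptions, not part of the proposal; outcome 3 (fail fast) would instead use a non-waiting acquire such as Wait(0).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// The proposed interface from this issue.
public interface IThrottle
{
    ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default);
    void Release();
}

// Hypothetical sketch: a SemaphoreSlim-backed throttle covering outcomes 1, 2 and 4.
public sealed class SemaphoreThrottle : IThrottle
{
    private readonly SemaphoreSlim _semaphore;

    public SemaphoreThrottle(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default)
    {
        try
        {
            // Outcome 1 when a slot is free; outcome 2 when the call goes async.
            await _semaphore.WaitAsync(cancellationToken).ConfigureAwait(false);
            return true;
        }
        catch (OperationCanceledException)
        {
            return false; // outcome 4: cancelled while waiting, completes with false
        }
    }

    public void Release() => _semaphore.Release();
}
```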

In the code which calls Accept on the listening socket, if an ITransportThrottle exists in DI, it acquires the throttle in a loop and adds another pending accept until ITransportThrottle.AcquireAsync returns a non-completed ValueTask. It would be up to the implementation of the ITransportThrottle to ensure there's a sensible number of pending accepts. Basically, the number of pending accepts is completely controlled by the ITransportThrottle if it exists.
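The accept loop described above could be sketched roughly as follows. Only the IThrottle and ITransportThrottle interfaces come from the proposal; the delegate standing in for Socket.AcceptAsync, the SemaphoreTransportThrottle class, and every other name are hypothetical, used only so the shape is self-contained and testable:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public interface IThrottle
{
    ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default);
    void Release();
}

public interface ITransportThrottle : IThrottle { }

// Simple semaphore-backed throttle; completes with false when cancelled.
public sealed class SemaphoreTransportThrottle : ITransportThrottle
{
    private readonly SemaphoreSlim _slots;

    public SemaphoreTransportThrottle(int maxPending) =>
        _slots = new SemaphoreSlim(maxPending, maxPending);

    public async ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default)
    {
        try { await _slots.WaitAsync(cancellationToken).ConfigureAwait(false); return true; }
        catch (OperationCanceledException) { return false; }
    }

    public void Release() => _slots.Release();
}

public static class AcceptLoop
{
    // acceptAsync stands in for Socket.AcceptAsync so the sketch needs no real socket.
    public static async Task RunAsync(
        Func<CancellationToken, ValueTask<int>> acceptAsync,
        Func<int, ITransportThrottle, Task> handleConnection,
        ITransportThrottle throttle,
        CancellationToken stopping)
    {
        while (true)
        {
            // Only post another pending accept once the throttle grants a slot, so the
            // throttle alone controls how many accepts are outstanding at any time.
            if (!await throttle.AcquireAsync(stopping))
                break; // false: cancelled during shutdown, or a fail-fast denial

            int connection = await acceptAsync(stopping);
            _ = handleConnection(connection, throttle); // handler calls Release() when its handshake finishes
        }
    }
}
```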

Edit: Changed IServiceThrottle to ITransportThrottle

@tmds
Member

tmds commented Sep 2, 2019

👍

Note that backpressure exists at the OS level. This is the parameter that is passed to the Socket.Listen call.
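For illustration, a minimal sketch of that OS-level knob; the port and backlog values here are arbitrary:

```csharp
using System.Net;
using System.Net.Sockets;

// The OS-level back pressure mentioned above: the backlog argument to Listen
// bounds how many completed connections the kernel will queue before the
// application accepts them; connections beyond that are refused or retried.
var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
listener.Bind(new IPEndPoint(IPAddress.Loopback, 0)); // port 0: let the OS pick a free port
listener.Listen(backlog: 16);
```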

I wonder if it makes sense to make IThrottle an explicit part of the bedrock abstractions. For example, as a parameter on IConnectionListener.Bind.

CC @davidfowl @ReubenBond @halter73

@davidfowl
Member

I see the value in this existing but I don't think it belongs on any of the existing APIs.

@tmds
Member

tmds commented Sep 3, 2019

Is the IConnectionListener meant to call AcquireAsync, or is it some other component?

@davidfowl
Member

I can't see how that can be enforced at this point. I would assume it's the caller.

@mconnew
Member Author

mconnew commented Sep 3, 2019

@tmds, the backlog argument to Socket.Listen isn't back pressure. That's a buffer to accommodate bursts of incoming connections that the app isn't calling Accept fast enough to handle. If you have a listen backlog of 0 configured, the OS will deny any incoming connection if there isn't an available pending Accept. That isn't what I need solved.

When you are CPU contended, the OS gives each thread a quantum of time to execute before preemptively switching to another thread. During the time a thread has the CPU, it can issue multiple asynchronous Accept calls on a listening socket before it yields. Those accepts complete on IO threads, which have a different scheduling priority, and the preemptive behavior in the OS is very different for them. So basically you can have a CPU pegged at 100% and "worker" threads will still be preempted by IO threads which handle the completed Accept and (hopefully) dispatch the handler onto a worker thread. You should only be calling Accept on an IO thread (Windows cancels pending IO if the initiating thread exits, so calling Accept on a worker thread causes the CLR to keep more threads alive in the thread pool just to avoid cancelling pending IO).

So unless you have a mechanism to prevent new pending Accept calls from being made, you can get into a situation where the OS scheduling of IO threads keeps adding more work to your worker thread pool: performance drops because of the extra context switches, any one request takes longer, clients time out and resubmit work, and you can get into a death spiral.

@davidfowl, I'm not suggesting adding this to an existing API. I'm proposing a new interface API for throttling and suggesting a way that the connection listener (I presume the IConnectionListener that @tmds mentioned) can use that API to allow an application to prevent the death spiral that you can get into without it. This is a very real scenario and not just academic, as I have worked with customers who have had this happen when they set their WCF transport limits too liberally.

@davidfowl
Member

@davidfowl, I'm not suggesting adding this to an existing API. I'm proposing a new interface API for throttling and suggesting a way how the connection listener (I presume the IConnectionListener that @tmds mentioned) can use that API to allow an application to prevent the death spiral that you can get into without it.

Sounds like an implementation detail.

@tmds
Member

tmds commented Sep 3, 2019

@tmds, the backlog argument to Socket.Listen isn't back pressure. That's a buffer to accommodate bursts of incoming connections that the app isn't calling Accept fast enough to handle.

In both cases it's implementing a limited queue. I'm not sure when it fits the name backpressure and when not.

For example, if the maximum number of pending connections is configured to be 100 and there are currently 95 pending connections, we limit the number of pending accepts to be no more than 5.

How do you prevent this being used for an attack of the form: make a number of connections that perform the handshake in the slowest way possible? This stops real clients from making connections.

@mconnew
Member Author

mconnew commented Sep 3, 2019

We have a configurable timeout which limits how long a connection is allowed to take to perform the handshake. net.tcp isn't an internet protocol, it's a LAN or WAN protocol so you are running inside a trusted network. If you have a malicious host in your LAN, it's already game over as it can do things like MAC address spoofing, mess with BGP routing, mess with DHCP to do MITM attacks, perform DNS poisoning, passively snoop on the network to pick up any unencrypted data and 1001 other ways to screw with your network.

If you must expose a web service to a hostile environment, you generally run it behind a LB or firewall which does things like limit the number of concurrent requests from a single IP and other DoS attack mechanisms. At the end of the day, if a web service is exposed to malicious clients, the only way to prevent DoS attacks is to run at scale and/or have a smart front-end like a LB which is specialized in detecting attack patterns and preventing malicious clients from getting through to the server. But these generally aren't the problem domain of WCF's net.tcp protocol.

WCF server in .NET Framework has customers with significantly more than 150,000 clients connected to a single server. When that WCF service needs to be restarted, e.g. for a software update, if you accept connections as fast as you can, you would have so many connection handshakes in flight at once that each handshake would exceed its timeout. When the service sends some bytes as part of the handshake and then calls ReceiveAsync, the amount of time it would take for the await continuation to execute, due to the amount of contention, would exceed your timeout.

Even if CPU/thread contention weren't a limitation (say you are running on a 128-core system), the security process in Windows (lsass) is single threaded and limited by the speed of a single core. So the security handshake bottlenecks on the single core which queues the authorization requests, and when running at 100% CPU this will result in exceeding the connection timeout. So the connection is killed, most likely by the client first (as firing the timeout also requires a thread to run on), and the client attempts to reconnect. Now the original connection attempt is still in the lsass queue, abandoned by the client but with no way to cancel a pending authorization, and you throw another one into the queue. Rinse and repeat.

If you are CPU/thread contended due to too many concurrent connection handshakes, then for any clients which have managed to connect, their actual service requests are competing for CPU time with the connecting clients, so those requests time out. That causes the channel to be faulted, which means the connection needs to be re-established, and that client joins the party.

If you have control of the number of pending accepts, you can limit the number of concurrent handshakes, which means you can guarantee (by tuning your service) that you are not overly contended on completing those handshakes. Sure, some clients will be denied a connection to the server socket, but nobody in the listen backlog will time out: you configure things high enough that you can accept new requests at a fast, sustainable rate, and you size your listen backlog so that no connection stays in it for more than a few seconds. But if you don't have a way to control how many outstanding accepts you have, enough clients connecting at once can put you in a death spiral that makes itself worse.

I know I could asynchronously wait after a socket has been accepted and not start the handshake, but then you are allocating resources for each incoming connection just to have it wait there. With the allocation rate from a large number of incoming connections, you would be running a Gen0 GC quite regularly, so all those waiting connections are likely to have their allocations reach at least Gen1 before the connection times out. So now you are churning the GC heavily by having them wait there. You also have a lot of timers registered for expiring the wait before starting the handshake, and you run the risk of a connection reaching the handshake with very little time left on its timer and being aborted partway through, because the timer starts when you accept the connection. It's better to distribute the timers to the clients with a retry backoff algorithm than to have timer contention on the server.

Overall, without this capability, there are some existing large scale WCF workloads that an AspNetCore based WCF solution just wouldn't be able to handle without Accept throttling. WCF is used a lot in enterprises so we can have the start of the workday connection spike where everyone sits down at their desk at about the same time and logs into an app all at once.

@tmds
Member

tmds commented Sep 4, 2019

@mconnew thanks for the interesting and detailed response. So this is to maintain performance with a high number of concurrent accepts, and not for security.

@mconnew
Member Author

mconnew commented Sep 5, 2019

@tmds, correct, this is about scalability. I can't think of a scenario where this would affect regular HTTP but it could affect WebSockets or gRPC. Especially if using something like token based authentication.

@davidfowl
Member

I'm going to look into a design for this. It comes up because the ConnectionDelegate is a push model and the server is accepting connections and dispatching.

@tmds
Member

tmds commented Oct 3, 2019

@davidfowl at what point do you consider a connection handled enough to allow a new accept?

@davidfowl
Member

That’s a policy that can be implemented.

@tmds
Member

tmds commented Oct 3, 2019

From discussing with @mconnew I think policies will be app specific, or would there be some that come with AspNetCore?

@analogrelay analogrelay added cost: M Needs: Design This issue requires design work before implementating. and removed cost: M labels Mar 18, 2020
@analogrelay analogrelay assigned halter73 and unassigned davidfowl Mar 18, 2020
@davidfowl
Member

davidfowl commented Mar 26, 2020

We might have a clean design for this and it looks like an IConnectionListener implementation that wraps the transport is the way to go here. It would look like any other connection middleware:

options.ListenAnyIP(5000, listenOptions =>
{
    listenOptions.UseConnectionThrottling(); // This is the throttling middleware
    listenOptions.UseHttps(); // This is the existing HTTPS middleware
});

This works by exposing a new method on ListenOptions:

public class ListenOptions
{
    public ListenOptions UseListenerFilter(Func<IConnectionListener, IConnectionListener> middleware);
}

This would allow you to wrap the previous IConnectionListener in order to intercept the calls to accept. That's where this logic can be plugged in.
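A hypothetical sketch of what such a wrapping listener might look like. The IConnectionListener here is a trimmed-down stand-in for Microsoft.AspNetCore.Connections.IConnectionListener (which also exposes EndPoint, UnbindAsync, and DisposeAsync), and all class names are assumptions made so the sketch is self-contained:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Trimmed-down stand-in for the real IConnectionListener abstraction.
public interface IConnectionListener<T>
{
    ValueTask<T> AcceptAsync(CancellationToken cancellationToken = default);
}

// Adapter so a plain delegate can serve as the inner listener in examples/tests.
public sealed class FuncListener<T> : IConnectionListener<T>
{
    private readonly Func<ValueTask<T>> _accept;
    public FuncListener(Func<ValueTask<T>> accept) => _accept = accept;
    public ValueTask<T> AcceptAsync(CancellationToken cancellationToken = default) => _accept();
}

// Hypothetical throttling decorator in the shape UseListenerFilter would enable:
// it wraps the inner listener and gates each accept on a semaphore.
public sealed class ThrottlingListener<T> : IConnectionListener<T>
{
    private readonly IConnectionListener<T> _inner;
    private readonly SemaphoreSlim _slots;

    public ThrottlingListener(IConnectionListener<T> inner, int maxPending)
    {
        _inner = inner;
        _slots = new SemaphoreSlim(maxPending, maxPending);
    }

    public async ValueTask<T> AcceptAsync(CancellationToken cancellationToken = default)
    {
        // No new accept is issued until a slot frees up: this is the back
        // pressure point. A real implementation would release the slot when
        // the accepted connection closes; here a callback stands in for that.
        await _slots.WaitAsync(cancellationToken).ConfigureAwait(false);
        return await _inner.AcceptAsync(cancellationToken).ConfigureAwait(false);
    }

    public void OnConnectionClosed() => _slots.Release();
}
```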

@halter73 spiked something very similar today (for different reasons) but I think this approach has legs and doesn't introduce new concepts.

@BrennanConroy BrennanConroy added the blocked The work on this issue is blocked due to some dependency label Jan 15, 2021
@halter73 halter73 removed their assignment Jul 14, 2021
@ghost

ghost commented Jul 14, 2021

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to have better expectation regarding different types of issues you can read our Triage Process.

@rafikiassumani-msft rafikiassumani-msft changed the title Support System.Net.Connections.ConnectionListenerFactory-based transports and middleware (mechanism to apply back pressure to accepting connections) Rate Limit for Kestrel - Design mechanism to apply back pressure to accepting connections Jan 6, 2022
@rafikiassumani-msft rafikiassumani-msft added the feature-rate-limit Work related to use of rate limit primitives label Jan 6, 2022
@rafikiassumani-msft rafikiassumani-msft modified the milestones: Backlog, .NET 7.0 Jan 6, 2022
@Tratcher Tratcher self-assigned this Jan 7, 2022
@rafikiassumani-msft rafikiassumani-msft added the Priority:1 Work that is critical for the release, but we could probably ship without label Jan 13, 2022
@rafikiassumani-msft rafikiassumani-msft modified the milestones: .NET 7.0, .NET 7 Planning Jan 25, 2022
@ghost

ghost commented Jan 25, 2022

Thanks for contacting us.

We're moving this issue to the .NET 7 Planning milestone for future evaluation and consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue during our next planning meeting(s).
If we later determine that the issue has no community involvement, or that it's a very rare and low-impact issue, we will close it, so that the team can focus on more important and high-impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@Tratcher Tratcher removed their assignment Oct 18, 2022
@amcasey amcasey added area-middleware Includes: URL rewrite, redirect, response cache/compression, session, and other general middlesware area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 2, 2023
@amcasey amcasey modified the milestones: .NET 8 Planning, Backlog Feb 14, 2024