
HostPerformanceManager: increase max Connections (Host thresholds exceeded: [Connections]) #2085

Closed
brettsam opened this issue Nov 1, 2017 · 9 comments

Comments

brettsam (Member) commented Nov 1, 2017

We set the maximum connection limit (for things like HttpClient) to 250 in the Dynamic SKU -- https://github.com/Azure/azure-webjobs-sdk-script/blob/48112e186c999e05e10d21892b2b702108487a11/src/WebJobs.Script/ScriptConstants.cs#L73

But the HostPerformanceManager will shut down the host if connections reach 80% of the 300 threshold, and 250/300 = 83%.

I'd recommend we raise this limit -- to 99%? That still provides the helpful messaging that makes it obvious what's happening, while allowing as many connections as possible.

brettsam (Member, Author) commented Nov 2, 2017

@mathewc / @fabiocav / @paulbatum

We need to make a change here -- either to the DynamicSkuConnectionLimit or to the HostPerfManager limit.

To recap: this 250 limit was dropped from Int32.MaxValue (the default) a while ago to stop highly concurrent functions from hitting the connection limit. The setting is applied to ServicePointManager.DefaultConnectionLimit, which caps the number of concurrent connections to any single host.

Using HttpClient as an example: if you create a single HttpClient and issue 100 concurrent requests to the same host (but not necessarily the same full URL), you'll end up with 100 connections, all created at the same time. If you lower DefaultConnectionLimit to 20, you'll end up with 20 connections, and any request made while all connections are in use is queued until one becomes available. This slows throughput by limiting the number of connections.

To build on that example, if you use a single HttpClient and make 100 concurrent requests to two different hosts (so 200 requests total), you'll end up with 200 connections. If you set DefaultConnectionLimit to 20, you'll end up with 40 -- because the limit applies per host.
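The per-host behavior described above can be sketched as follows (a hypothetical illustration -- the host names and the request count are made up, and this applies to .NET Framework, where ServicePointManager governs HttpClient's connections):

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class ConnectionLimitDemo
{
    static async Task Main()
    {
        // The limit applies per remote host, not per process.
        ServicePointManager.DefaultConnectionLimit = 20;

        var client = new HttpClient();

        // 100 concurrent requests to each of two hosts (200 total).
        // At most 20 connections are opened to each host (40 total);
        // the remaining requests queue until a connection frees up.
        var tasks = Enumerable.Range(0, 100)
            .SelectMany(_ => new[]
            {
                client.GetAsync("https://example.com/api"),
                client.GetAsync("https://example.org/api")
            })
            .ToArray();

        await Task.WhenAll(tasks);
    }
}
```

Note that on .NET Core, SocketsHttpHandler ignores ServicePointManager; there the per-host cap is HttpClientHandler.MaxConnectionsPerServer instead.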

So by lowering the limit to 250, we allowed one class of function -- those with many connections to a single host -- to run fine and maintain decent throughput. But another class -- those with many connections to many hosts -- will hit our connection limit. That's still a better place than before the limit existed, when both classes were in trouble.

All that being said -- the new HostPerformanceManager effectively lowers the ceiling for apps hitting this 250 limit. @paulbatum and I just chatted about this, and he's convinced me that lowering this connection limit even further may be our best bet here. It would serve both classes of functions, and even if throughput is slowed down, we should be able to detect that and scale out to another machine anyway.

The questions: do you agree? And if so, where do we put that number? Paul threw out 50 as a starting point.

@brettsam brettsam changed the title HostPerformanceManager -- increase max Connections HostPerformanceManager: increase max Connections (Host thresholds exceeded: [Connections]) Nov 2, 2017
brettsam (Member, Author) commented Nov 2, 2017

Note that a workaround is to disable the host performance monitor in host.json with "hostHealthMonitorEnabled": false
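For example, a minimal host.json using that setting (a sketch, assuming the setting name exactly as given in this comment):

```json
{
  "hostHealthMonitorEnabled": false
}
```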

mathewc (Member) commented Nov 6, 2017

Agree with lowering this to 50 in dynamic.

WonderPanda commented Nov 8, 2017

@brettsam @mathewc Can you elaborate a little more on how we're supposed to handle this connection issue when designing our function apps? I've just recently started hitting the Host thresholds exceeded: [Connections] message (I assume since these new changes went live). When it happens, my function app crashes and returns 503 Service Unavailable for a while until everything catches up and the host is presumably restarted. Is there anything I can do on my end to ensure there are no service interruptions for my users?

Disabling the Host Health Monitor seems to fix the issue for me for now. It would be nice to get some feedback on this, though, because I'm concerned I'm missing out on other protections by having it turned off.

@christopheranderson christopheranderson added this to the Triaged milestone Jan 16, 2018
christopheranderson (Contributor) commented

@brettsam - what's the status on this?

brettsam (Member, Author) commented

It's been merged with #2106. Closing.

georgiosd commented

Should this change be reflected here? https://github.com/Azure/azure-functions-host/wiki/host.json

I presume it's the http:maxConcurrentRequests setting

brettsam (Member, Author) commented

This is for the .NET Framework's ServicePointManager.DefaultConnectionLimit, which controls outgoing HTTP connections for you -- that is, connections your function makes to other REST APIs, storage, etc. We automatically set this to 50 in Consumption plans (in App Service plans it keeps the default, Int32.MaxValue), and there's no host.json configuration for it. It's a pretty confusing API, so we're handling this setting for you. I wrote up a little more detail in the WebJobs wiki here: https://github.com/Azure/azure-webjobs-sdk/wiki/ServicePointManager-settings-for-WebJobs
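For reference, on an App Service plan (where, per the comment above, the default remains Int32.MaxValue) an app could lower the limit itself at startup. A sketch -- .NET Framework only, and the value 50 here simply mirrors the Consumption default mentioned above, not a recommendation:

```csharp
using System.Net;

static class ConnectionLimitStartup
{
    // Mirrors what the Functions host does automatically on Consumption:
    // cap concurrent outgoing connections to any single remote host at 50.
    public static void Configure()
    {
        ServicePointManager.DefaultConnectionLimit = 50;
    }
}
```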

georgiosd commented

Thank you

@ghost ghost locked as resolved and limited conversation to collaborators Jan 1, 2020