Host Health Monitor

Mathew Charles edited this page Jan 17, 2018 · 17 revisions

The Host Health Monitor feature of the Functions Runtime monitors various performance counters imposed by the VM sandbox. Its goal is to temporarily stop the host from taking on more work when any counter is about to exceed its threshold. This lets the host avoid hitting hard sandbox limits, which could cause a hard shutdown, and lets it gracefully complete in-progress work while waiting for the counters to return to normal levels. The performance counters currently monitored are:

  • Connections: Number of outbound connections (limit is 300).
  • Threads: Number of threads (limit is 512).
  • Processes: Number of child processes (limit is 32).
  • NamedPipes: Number of named pipes (limit is 128).
  • Sections: Number of sections (limit is 256).

Note that the limits above are the hard limits enforced by the sandbox. The actual thresholds used by the monitor are a fraction of these maximums, set by counterThreshold (default is 0.80). When one or more counters exceed their thresholds, the host is stopped until the counter values return to normal. The Web App continues to run, but internally the host has been stopped and no new functions will be run. If the Function App is scaled out to multiple instances, the other instances continue to run and pick up the workload. Once the counter values return to normal, the host automatically starts processing work again. If the counter values do not recover after a while (governed by the healthCheckWindow and healthCheckThreshold settings described below), the App Domain is recycled in an attempt to recover.
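To make the arithmetic concrete, here is a minimal sketch of how an effective threshold follows from a hard limit and counterThreshold. It is an illustration only, not the runtime's code: the function names are hypothetical, and the strict "greater than" comparison is an assumption.

```python
# Hard sandbox limits, copied from the list above.
SANDBOX_LIMITS = {
    "Connections": 300,
    "Threads": 512,
    "Processes": 32,
    "NamedPipes": 128,
    "Sections": 256,
}


def effective_thresholds(counter_threshold=0.80):
    """Return the per-counter threshold: hard limit times counterThreshold."""
    return {name: limit * counter_threshold
            for name, limit in SANDBOX_LIMITS.items()}


def exceeded_counters(current, counter_threshold=0.80):
    """Return the names of counters whose current value is above its threshold.

    `current` maps counter names (keys of SANDBOX_LIMITS) to observed values.
    Assumption: a counter is unhealthy when strictly above its threshold.
    """
    thresholds = effective_thresholds(counter_threshold)
    return [name for name, value in current.items()
            if value > thresholds[name]]
```

With the default of 0.80, a host with 250 outbound connections would be over the Connections threshold (80% of 300), while 100 threads would be well within the Threads threshold (80% of 512).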

If your Function App is hitting these thresholds, you'll see errors like "Host thresholds exceeded: [Connections]" being logged, where the brackets list the counters that were exceeded. If this happens often, examine the offending function(s) to ensure that they use resources appropriately and are throttled correctly. For example, is your function code opening a large or unbounded number of outgoing connections?
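One common way to keep outgoing work bounded is to cap how many outbound calls run at once. The sketch below shows the idea with an asyncio semaphore. The names run_throttled and MAX_CONCURRENT_CALLS are hypothetical, the cap of 20 is an arbitrary example value, and limiting concurrent calls only bounds connections if each call uses at most one connection at a time.

```python
import asyncio

# Hypothetical cap on simultaneous outbound calls; tune for your workload.
MAX_CONCURRENT_CALLS = 20


async def run_throttled(items, call, max_concurrent=MAX_CONCURRENT_CALLS):
    """Run `call(item)` for every item, with at most `max_concurrent` in flight.

    `call` is any coroutine function that performs one outbound request.
    Results are returned in the same order as `items`.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        # Wait here if `max_concurrent` calls are already in progress.
        async with semaphore:
            return await call(item)

    return await asyncio.gather(*(guarded(item) for item in items))
```

Sharing a single client object across calls, rather than creating a new one per invocation, usually helps as well, since it allows the underlying connections to be reused.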

The feature is currently only active on the Consumption plan, where these sandbox limits exist. The feature is enabled by default, but can be disabled or configured via the healthMonitor section of host.json, for example:

{
    "healthMonitor": {
        "enabled": true,
        "healthCheckInterval": "00:00:10",
        "healthCheckWindow": "00:02:00",
        "healthCheckThreshold": 6,
        "counterThreshold": 0.80
    }
}

Description of settings:

  • enabled: Whether the feature is enabled. Default is true.
  • healthCheckInterval: The time interval between the periodic background health checks. Default is 10 seconds.
  • healthCheckWindow: A sliding time window used in conjunction with the healthCheckThreshold setting (see below).
  • healthCheckThreshold: Maximum number of times the health check can fail within the healthCheckWindow before a host recycle is initiated.
  • counterThreshold: The threshold at which a performance counter will be considered unhealthy. Default is 0.80.
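The interaction between healthCheckWindow and healthCheckThreshold can be sketched as a sliding window of recent failures. This is an illustrative model, not the runtime's implementation: the HealthWindow class is hypothetical, its defaults are simply the values from the host.json example above, and recycling only once the failure count is strictly greater than the threshold is an assumption about the boundary.

```python
from collections import deque
from datetime import datetime, timedelta


class HealthWindow:
    """Tracks failed health checks inside a sliding time window."""

    def __init__(self, window=timedelta(minutes=2), threshold=6):
        self.window = window        # corresponds to healthCheckWindow
        self.threshold = threshold  # corresponds to healthCheckThreshold
        self._failures = deque()    # timestamps of failures, oldest first

    def _prune(self, now):
        # Drop failures that have slid out of the window.
        cutoff = now - self.window
        while self._failures and self._failures[0] < cutoff:
            self._failures.popleft()

    def record_failure(self, now):
        """Record a failed health check observed at time `now`."""
        self._failures.append(now)
        self._prune(now)

    def should_recycle(self, now):
        """True when failures within the window exceed the threshold."""
        self._prune(now)
        return len(self._failures) > self.threshold
```

With a 10 second healthCheckInterval, a host that fails every check would accumulate seven failures after about a minute, which in this model is the point where a recycle would be triggered. Failures older than the window no longer count, so an intermittently unhealthy host is not recycled.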
