Metrics and Monitoring

Hystrix captures metrics in rolling windows using the HystrixRollingNumber and HystrixRollingPercentile classes. These windows give Hystrix a low-latency, moving view of recent activity, which it uses for circuit breaker health checks and for operations.

Direct Access

You can access metrics programmatically with the following calls:

HystrixCommandMetrics.getInstances()
HystrixThreadPoolMetrics.getInstances()

Metrics Event Stream

You can use the hystrix-metrics-event-stream module to power the dashboard, real-time alerting, and similar use cases.

Metrics Publisher

You can publish metrics by using an implementation of HystrixMetricsPublisher.

Register your HystrixMetricsPublisher implementation by calling HystrixPlugins.getInstance().registerMetricsPublisher(HystrixMetricsPublisher impl).

Hystrix includes the following implementations as hystrix-contrib modules:

Netflix Servo: hystrix-servo-metrics-publisher
Yammer Metrics: hystrix-yammer-metrics-publisher

The following sections describe the metrics that these implementations publish.

Command Metrics

Each HystrixCommand publishes metrics with the following tags:

Servo Tag: "instance", Value: HystrixCommandKey.name()
Servo Tag: "type", Value: "HystrixCommand"

Informational and Status

Boolean isCircuitBreakerOpen
Number errorPercentage
Number executionSemaphorePermitsInUse
String commandGroup
Number currentTime

Cumulative and Rolling Event Counts

Cumulative counts (Counter) represent the number of events since the start of the application.

Rolling counts (Gauge) are “point in time” counts that cover only the most recent window, for example the last 10 seconds. The window is configured by the metrics.rollingStats.* properties.

Event	Cumulative Count (Long)	Rolling Count (Number)
`BAD_REQUEST`	`countBadRequests`	`rollingCountBadRequests`
`COLLAPSED`	`countCollapsedRequests`	`rollingCountCollapsedRequests`
`EMIT`	`countEmit`	`rollingCountEmit`
`EXCEPTION_THROWN`	`countExceptionsThrown`	`rollingCountExceptionsThrown`
`FAILURE`	`countFailure`	`rollingCountFailure`
`FALLBACK_EMIT`	`countFallbackEmit`	`rollingCountFallbackEmit`
`FALLBACK_FAILURE`	`countFallbackFailure`	`rollingCountFallbackFailure`
`FALLBACK_REJECTION`	`countFallbackRejection`	`rollingCountFallbackRejection`
`FALLBACK_SUCCESS`	`countFallbackSuccess`	`rollingCountFallbackSuccess`
`RESPONSE_FROM_CACHE`	`countResponsesFromCache`	`rollingCountResponsesFromCache`
`SEMAPHORE_REJECTED`	`countSemaphoreRejected`	`rollingCountSemaphoreRejected`
`SHORT_CIRCUITED`	`countShortCircuited`	`rollingCountShortCircuited`
`SUCCESS`	`countSuccess`	`rollingCountSuccess`
`THREAD_POOL_REJECTED`	`countThreadPoolRejected`	`rollingCountThreadPoolRejected`
`TIMEOUT`	`countTimeout`	`rollingCountTimeout`
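
The difference between the two count columns can be illustrated with a small, self-contained sketch. This is not Hystrix's HystrixRollingNumber implementation; the class name RollingCountSketch, the bucket layout, and the explicit nowMs parameter are all assumptions chosen to keep the example deterministic. It tracks a single event type: the cumulative count only ever grows, while the rolling count forgets events that have aged out of the window.

```java
import java.util.Arrays;

/** Hypothetical sketch: one event type counted cumulatively and over a rolling window. */
class RollingCountSketch {
    private final long bucketSizeMs;   // width of one bucket in milliseconds
    private final long[] bucketStart;  // start time of the data currently held in each slot
    private final long[] bucketCount;  // events recorded in each slot
    private long cumulative;           // never reset: events since this object was created

    RollingCountSketch(long windowMs, int numBuckets) {
        if (numBuckets <= 0 || windowMs <= 0 || windowMs % numBuckets != 0) {
            throw new IllegalArgumentException("window must divide evenly into buckets");
        }
        this.bucketSizeMs = windowMs / numBuckets;
        this.bucketStart = new long[numBuckets];
        this.bucketCount = new long[numBuckets];
        Arrays.fill(bucketStart, Long.MIN_VALUE); // marks every slot as empty
    }

    /** Records one event at time nowMs (non-negative; passed in so runs are repeatable). */
    synchronized void increment(long nowMs) {
        long start = nowMs - (nowMs % bucketSizeMs);
        int slot = (int) ((nowMs / bucketSizeMs) % bucketStart.length);
        if (bucketStart[slot] != start) { // slot still holds an older lap: recycle it
            bucketStart[slot] = start;
            bucketCount[slot] = 0;
        }
        bucketCount[slot]++;
        cumulative++;
    }

    /** Sums only the buckets that still fall inside the window ending at nowMs. */
    synchronized long rollingCount(long nowMs) {
        long currentStart = nowMs - (nowMs % bucketSizeMs);
        long oldestStart = currentStart - (bucketStart.length - 1) * bucketSizeMs;
        long sum = 0;
        for (int i = 0; i < bucketStart.length; i++) {
            if (bucketStart[i] >= oldestStart && bucketStart[i] <= currentStart) {
                sum += bucketCount[i];
            }
        }
        return sum;
    }

    synchronized long cumulativeCount() {
        return cumulative;
    }

    public static void main(String[] args) {
        // 10-second window split into ten 1-second buckets.
        RollingCountSketch timeouts = new RollingCountSketch(10_000, 10);
        timeouts.increment(500);
        timeouts.increment(1_500);
        timeouts.increment(9_900);
        System.out.println(timeouts.rollingCount(9_900));  // 3: all events are inside the window
        System.out.println(timeouts.rollingCount(10_500)); // 2: the event at 500 ms has aged out
        System.out.println(timeouts.rollingCount(25_000)); // 0: the window holds no recent events
        System.out.println(timeouts.cumulativeCount());    // 3: the cumulative count never shrinks
    }
}
```

In this sketch the rolling count spans the current partial bucket plus the nine buckets before it, so the effective window is between nine and ten seconds long. A finer bucket size narrows that gap at the cost of more slots.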

Latency Percentiles: HystrixCommand.run() Execution (Gauge)

These metrics represent percentiles of execution times for the HystrixCommand.run() method (on the child thread if using thread isolation).

These are rolling percentiles as configured by metrics.rollingPercentile.* properties.

Number latencyExecute_mean
Number latencyExecute_percentile_5
Number latencyExecute_percentile_25
Number latencyExecute_percentile_50
Number latencyExecute_percentile_75
Number latencyExecute_percentile_90
Number latencyExecute_percentile_99
Number latencyExecute_percentile_995
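
To make the percentile names above concrete, the following sketch computes the same set of percentiles over one window of latency samples. It uses the nearest-rank method, which is an assumption made for illustration; HystrixRollingPercentile may compute its values differently. The class name LatencyPercentileSketch and the sample values are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: mean and nearest-rank percentiles over one window of latency samples. */
class LatencyPercentileSketch {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static int percentile(int[] sortedMs, double p) {
        if (sortedMs.length == 0) {
            return 0; // no samples in the window
        }
        if (p <= 0) {
            return sortedMs[0];
        }
        int rank = (int) Math.ceil(p * sortedMs.length / 100.0); // 1-based rank
        return sortedMs[Math.min(rank, sortedMs.length) - 1];
    }

    static double mean(int[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Execution times in milliseconds; two slow calls skew the mean but not the median.
        int[] window = {12, 15, 11, 14, 250, 13, 16, 12, 18, 900};
        int[] sorted = window.clone();
        Arrays.sort(sorted);

        System.out.println("mean = " + mean(window));
        for (double p : new double[] {5, 25, 50, 75, 90, 99, 99.5}) {
            System.out.println("percentile " + p + " = " + percentile(sorted, p) + " ms");
        }
    }
}
```

With these samples the median is 14 ms while the 90th percentile is 250 ms, which is why the upper percentiles are usually more informative than the mean when watching for slow dependencies.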

Latency Percentiles: End-to-End Execution (Gauge)

These metrics represent percentiles of execution times for the end-to-end execution of HystrixCommand.execute() or HystrixCommand.queue() until a response is returned (or is ready to return in case of queue()).

Comparing these values with the latencyExecute* percentiles shows the cost of thread queuing, scheduling, and execution, semaphores, circuit breaker logic, and other sources of overhead (including metrics capture itself).

These are rolling percentiles as configured by metrics.rollingPercentile.* properties.

Number latencyTotal_mean
Number latencyTotal_percentile_5
Number latencyTotal_percentile_25
Number latencyTotal_percentile_50
Number latencyTotal_percentile_75
Number latencyTotal_percentile_90
Number latencyTotal_percentile_99
Number latencyTotal_percentile_995
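
The relationship between the latencyExecute* and latencyTotal* families can be shown with plain java.util.concurrent code. This sketch does not use Hystrix at all; the class LatencyOverheadSketch and its measure method are hypothetical stand-ins that time the same piece of work from two places: inside the worker thread (analogous to run()) and from the caller (analogous to execute() until the response is available).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: worker-side time versus caller-side end-to-end time for one task. */
class LatencyOverheadSketch {

    /** Runs the work on a separate thread and returns {executeNanos, totalNanos}. */
    static long[] measure(Runnable work) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            AtomicLong executeNanos = new AtomicLong();
            long totalStart = System.nanoTime();         // caller clock starts before queuing
            Future<?> pending = pool.submit(() -> {
                long runStart = System.nanoTime();       // worker clock covers only the work
                work.run();
                executeNanos.set(System.nanoTime() - runStart);
            });
            pending.get();                               // wait until the result is available
            long totalNanos = System.nanoTime() - totalStart;
            return new long[] {executeNanos.get(), totalNanos};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] t = measure(() -> {
            try {
                Thread.sleep(20); // stands in for the dependency call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("execute  = " + t[0] / 1_000 + " us");
        System.out.println("total    = " + t[1] / 1_000 + " us");
        System.out.println("overhead = " + (t[1] - t[0]) / 1_000 + " us");
    }
}
```

The total time is never smaller than the execute time, and the gap between them is the thread start-up, queuing, and hand-off cost in this sketch. In Hystrix the equivalent gap also includes semaphore, circuit breaker, and metrics work, as described above.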

Property Values (Informational)

These informational metrics report the actual property values being used by the HystrixCommand. This enables you to see when a dynamic property takes effect and to confirm a property is set as expected.

Number propertyValue_rollingStatisticalWindowInMilliseconds
Number propertyValue_circuitBreakerRequestVolumeThreshold
Number propertyValue_circuitBreakerSleepWindowInMilliseconds
Number propertyValue_circuitBreakerErrorThresholdPercentage
Boolean propertyValue_circuitBreakerForceOpen
Boolean propertyValue_circuitBreakerForceClosed
Number propertyValue_executionIsolationThreadTimeoutInMilliseconds
String propertyValue_executionIsolationStrategy
Boolean propertyValue_metricsRollingPercentileEnabled
Boolean propertyValue_requestCacheEnabled
Boolean propertyValue_requestLogEnabled
Number propertyValue_executionIsolationSemaphoreMaxConcurrentRequests
Number propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests

ThreadPool Metrics

Each HystrixThreadPool publishes metrics with the following tags:

Servo Tag: "instance", Value: HystrixThreadPoolKey.name()
Servo Tag: "type", Value: "HystrixThreadPool"

Informational and Status

String name
Number currentTime

Rolling Counts (Gauge)

Number rollingMaxActiveThreads
Number rollingCountThreadsExecuted

Cumulative Counts (Counter)

Long countThreadsExecuted

ThreadPool State (Gauge)

Number threadActiveCount
Number completedTaskCount
Number largestPoolSize
Number totalTaskCount
Number queueSize

Property Values (Informational)

Number propertyValue_corePoolSize
Number propertyValue_keepAliveTimeInMinutes
Number propertyValue_queueSizeRejectionThreshold
Number propertyValue_maxQueueSize

A Netflix Original Production
Tech Blog | Twitter @NetflixOSS | Twitter @HystrixOSS | Jobs
