
Produce metrics for ThreadPoolExecutors #947

Status: Closed
batsatt opened this issue on Aug 23, 2019 · 5 comments
Labels: enhancement (New feature or request), MP, P3, SE

@batsatt (Contributor) commented on Aug 23, 2019:

At a minimum, create metrics for pool sizes and task counts, and possibly active threads (which is more expensive to read). Additional metrics can be created for known subtypes.

Executors should not be modified; rather, metrics should be pulled from them so that any type can be supported. This implies some form of registration (e.g. at the point of installation via a builder, etc.), as sketched below.
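
Below is a minimal sketch of that pull model (class and method names are illustrative, not Helidon API): a small sampler wraps a plain java.util.concurrent.ThreadPoolExecutor and reads its state only when asked, so the executor itself stays unmodified.

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Illustrative sampler: reads executor state on demand without modifying the executor.
    final class ExecutorPoolSampler {

        private final ThreadPoolExecutor executor;

        ExecutorPoolSampler(ThreadPoolExecutor executor) {
            this.executor = executor;
        }

        int poolSize()        { return executor.getPoolSize(); }
        int activeThreads()   { return executor.getActiveCount(); }  // the "more expensive" reading noted above
        int queueSize()       { return executor.getQueue().size(); }
        long completedTasks() { return executor.getCompletedTaskCount(); }

        public static void main(String[] args) {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    2, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
            ExecutorPoolSampler sampler = new ExecutorPoolSampler(pool);
            System.out.printf("pool=%d active=%d queued=%d completed=%d%n",
                    sampler.poolSize(), sampler.activeThreads(),
                    sampler.queueSize(), sampler.completedTasks());
            pool.shutdown();
        }
    }

At registration time, a metrics registry (MicroProfile Metrics gauges, for example) could be pointed at accessors like these, so the same approach works for any executor type that exposes the relevant counters.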

@batsatt added the enhancement (New feature or request), MP, and SE labels on Aug 23, 2019
@spericas (Member) commented:
Likely stating the obvious, but we should do this after the 2.0 work is merged.

@m0mus m0mus added P3 P4 and removed P3 labels Aug 29, 2019
@tomas-langer tomas-langer removed the P4 label Jun 24, 2021
@tomas-langer (Member) commented:
Requested recently by a customer through a different channel; let's re-triage.

@gmpatter commented:
We are seeing in our app that when load increases on an instance, requests are put on the thread pool queue. When the queue reaches full capacity, the app responds with a 503, which is all expected behaviour. But when we deploy on Kubernetes, this can result in k8s restarting the pod because it is returning 503 for the health check.
It might be useful to have a metric for the queue size so that we can alert when the queue is growing, and possibly also use it in our scaling decisions.

@tjquinno (Member) commented:

@gmpatter (and others)

Helidon 2.x already has an optional feature for enabling some additional key performance indicator metrics. For some reason the documentation for this is missing from our published doc site, but below I've pasted the details from our doc source.

Note that the KPI deferred Meter would capture queued requests.

Key Performance Indicator (KPI) Metrics

Any time you include the Helidon metrics module in your application, Helidon tracks two basic performance indicator metrics:

  • a Counter of all requests received (requests.count), and
  • a Meter of all requests received (requests.meter).

Helidon also includes additional, extended KPI metrics which are disabled by default:

  • current number of requests in-flight - a ConcurrentGauge (requests.inFlight) of requests currently being processed
  • long-running requests - a Meter (requests.longRunning) measuring the rate at which Helidon processes requests which take at least a given amount of time to complete; configurable, defaults to 10000 milliseconds (10 seconds)
  • load - a Meter (requests.load) measuring the rate at which requests are worked on (as opposed to received)
  • deferred - a Meter (requests.deferred) measuring the rate at which a request's processing is delayed after Helidon receives the request

You can enable and control these metrics using configuration:

metrics.key-performance-indicators.extended = true
metrics.key-performance-indicators.long-running.threshold-ms = 2000
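
If the application reads configuration from a YAML source (e.g. application.yaml), the same two keys would nest as shown below; this is just the dotted form above rewritten, nothing additional:

    metrics:
      key-performance-indicators:
        extended: true
        long-running:
          threshold-ms: 2000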

Further, for SE apps:

Your Helidon SE application can also control the behavior of the KPI metrics programmatically.

  • Prepare a KeyPerformanceIndicatorSettings object using its builder and pass that builder to the MetricsSupport.Builder#keyPerformanceIndicatorMetricsSettings() method, or

  • Prepare a Config object containing settings such as the following and pass it to the MetricsSupport.Builder#keyPerformanceIndicatorMetricsConfig() method:

    extended = true
    long-running.threshold-ms = 2000
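
As a rough SE sketch of the second (config-based) option, assuming Helidon 2.x with a Routing/WebServer setup and taking the builder method name from the doc text above (it may differ slightly by release):

    import io.helidon.config.Config;
    import io.helidon.metrics.MetricsSupport;
    import io.helidon.webserver.Routing;
    import io.helidon.webserver.WebServer;

    public class Main {
        public static void main(String[] args) {
            Config config = Config.create();

            // Hand the KPI config subtree to the MetricsSupport builder
            // (builder method name taken from the doc text above).
            MetricsSupport metrics = MetricsSupport.builder()
                    .keyPerformanceIndicatorMetricsConfig(
                            config.get("metrics.key-performance-indicators"))
                    .build();

            Routing routing = Routing.builder()
                    .register(metrics)  // exposes /metrics, including the KPI metrics
                    .build();

            WebServer.builder(routing).build().start();
        }
    }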
    

@tjquinno (Member) commented on Jul 8, 2021:

Possibly nearly equivalent to #2688 and #2689 (as described in the earlier comment).

While it's true that the recently-added KPI metrics do not directly report information about the executor used to queue and run requests, the executor behavior can be inferred from the KPI metrics.

Is it worthwhile to invest the work to add the nearly-equivalent executor-based metrics?

Would they add sufficient actionable information over the existing KPI metrics?
