CONVENTIONS: Update CPU query sum_irate #988

wking · 2021-12-15T20:23:32Z

Catching up with [1], which landed in OpenShift 4.9 and later via [2].

[1]: kubernetes-monitoring/kubernetes-mixin#619
[2]: https://github.com/openshift/cluster-monitoring-operator/pull/1214/files#diff-3125af8c4a74a5a372c15a821e3c53b7f5710c3ebd5af1fb05f4d7294e2f1afdL529

wking · 2021-12-15T20:44:57Z

CONVENTIONS.md

 # CPU usage of each container in the openshift-monitoring namespace
-max by (pod, container) (node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace="openshift-monitoring"})
+max by (pod, container) (node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace="openshift-monitoring"})


More broadly, trying to mix openshift-monitoring and openshift-sdn results doesn't make sense to me. Perhaps this was intended to be commented out as an example of changing namespaces and dropping over-time aggregation? I'd expect something like:

sort_desc(
  # Calculate the 90th percentile of CPU usage over the past hour and add 10% to that
  1.1 * (max by (pod, container) (
    quantile_over_time(0.9, node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=~"openshift-.*", container!="POD", container!=""}[60m]))
  )
  /
  # Calculate the maximum requested CPU per pod and container
  max by (pod, container) (kube_pod_container_resource_requests{namespace=~"openshift-.*", resource="cpu", container!="", container!="POD"})
)

Or, if folks don't want to weight for bursts, dropping to avg_over_time:

sort_desc(
  # Calculate the average CPU usage over the past hour
  (avg by (pod, container) (
    avg_over_time(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=~"openshift-.*", container!="POD", container!=""}[60m]))
  )
  /
  # Calculate the maximum requested CPU per pod and container
  max by (pod, container) (kube_pod_container_resource_requests{namespace=~"openshift-.*", resource="cpu", container!="", container!="POD"})
)
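
To make the difference between the two variants concrete, here is a rough Python sketch of the arithmetic, not of Prometheus itself. The sample values, the 0.5-core request, and the `quantile` helper are all made up for illustration; the helper uses plain linear interpolation between sorted samples, which I believe is close to what quantile_over_time does, but treat that as an assumption.

```python
import math


def quantile(q, values):
    """Linear-interpolation quantile over a list of samples (illustrative helper)."""
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical CPU usage samples (in cores) for one container over an hour:
# mostly idle, with a short burst at the end.
usage_cores = [0.1] * 8 + [0.9] * 2

# Hypothetical CPU request for the same container, in cores.
requested_cores = 0.5

# Burst-weighted variant: 90th percentile of usage plus 10%, over the request.
p90_ratio = 1.1 * quantile(0.9, usage_cores) / requested_cores

# Average variant: mean usage over the request.
avg_ratio = (sum(usage_cores) / len(usage_cores)) / requested_cores

print(f"p90-based ratio: {p90_ratio:.2f}")
print(f"avg-based ratio: {avg_ratio:.2f}")
```

With a bursty series like this one, the percentile-based ratio lands well above 1 (usage exceeds the request) while the average-based ratio stays below it, which is the trade-off between weighting for bursts and not.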

openshift-bot · 2022-01-12T23:12:30Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

dhellmann · 2022-01-14T19:40:48Z

/remove-lifecycle stale
/approve

openshift-ci · 2022-01-14T19:41:21Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dhellmann

Needs approval from an approver in each of these files:

~~OWNERS~~ [dhellmann]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

philipgough · 2022-01-14T21:15:12Z

/lgtm

openshift-ci · 2022-01-14T21:22:50Z

@wking: all tests passed!

openshift-ci bot requested review from staebler and sudhaponnaganti December 15, 2021 20:25

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 12, 2022

openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 14, 2022

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2022

openshift-ci bot assigned philipgough Jan 14, 2022

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 14, 2022

openshift-merge-robot merged commit fd603fd into openshift:master Jan 14, 2022

wking deleted the fix-rate-to-irate branch January 14, 2022 21:58
