Fix cpu resource metric type by changing to counter #89540

Merged (1 commit) on Mar 27, 2020


dashpole
Contributor

What type of PR is this?
/kind bug

What this PR does / why we need it:
CPU metrics should be counters, not gauges. When I initially created the kubelet resource metrics endpoint in #73946, I mistakenly made the CPU metrics gauges. When the endpoint was refactored to use the metrics stability framework in #86282, the CPU metrics were renamed to omit the _total suffix, which caught the attention of some SIG Instrumentation members.

Since the endpoint was newly added in 1.18, we have decided to just fix the metric without a deprecation period, and cherry-pick the fix back to the 1.18 release.
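
For readers less familiar with the distinction: a cumulative CPU-seconds value only ever grows, so consumers derive usage from the difference between two scrapes rather than reading the value directly. The following is a minimal sketch of that calculation, not kubelet code; the `sample` type and `cpuRate` helper are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// sample is one scrape of a cumulative CPU counter.
// Hypothetical type for illustration; not part of the kubelet.
type sample struct {
	timestampSeconds float64 // scrape time, in seconds
	cpuSeconds       float64 // cumulative CPU seconds consumed so far
}

// cpuRate returns the average number of cores used between two samples.
// It reports ok=false when the interval is not positive or the counter
// went backwards (for example, after a restart reset it).
func cpuRate(prev, curr sample) (cores float64, ok bool) {
	elapsed := curr.timestampSeconds - prev.timestampSeconds
	if elapsed <= 0 || curr.cpuSeconds < prev.cpuSeconds {
		return 0, false
	}
	return (curr.cpuSeconds - prev.cpuSeconds) / elapsed, true
}

func main() {
	prev := sample{timestampSeconds: 100, cpuSeconds: 40}
	curr := sample{timestampSeconds: 110, cpuSeconds: 45}
	if cores, ok := cpuRate(prev, curr); ok {
		fmt.Printf("average usage: %.2f cores\n", cores) // prints "average usage: 0.50 cores"
	}
}
```

A gauge, by contrast, is expected to move both up and down, which is why typing a cumulative value as a gauge misleads tooling that relies on the declared metric type.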

Does this PR introduce a user-facing change?:

In the kubelet resource metrics endpoint at /metrics/resource, change the names of the following metrics:
- node_cpu_usage_seconds --> node_cpu_usage_seconds_total
- container_cpu_usage_seconds --> container_cpu_usage_seconds_total
This is a partial revert of #86282, which landed in 1.18.0 and removed the _total suffix.
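
Consumers that scraped the 1.18.0 names may need to translate them. The sketch below shows one way to do that, assuming a simple lookup table; `renamedMetrics` and `currentName` are hypothetical helpers, and `some_other_metric` is a placeholder name used only to show the pass-through case.

```go
package main

import "fmt"

// renamedMetrics maps the 1.18.0 names to the names after this fix.
// The two entries come from the release note; the table itself is a
// hypothetical helper, not part of Kubernetes.
var renamedMetrics = map[string]string{
	"node_cpu_usage_seconds":      "node_cpu_usage_seconds_total",
	"container_cpu_usage_seconds": "container_cpu_usage_seconds_total",
}

// currentName returns the post-fix name for a metric.
// Names that were not renamed are returned unchanged.
func currentName(name string) string {
	if renamed, ok := renamedMetrics[name]; ok {
		return renamed
	}
	return name
}

func main() {
	names := []string{
		"node_cpu_usage_seconds",
		"container_cpu_usage_seconds",
		"some_other_metric", // placeholder: not affected by the rename
	}
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, currentName(name))
	}
}
```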

/assign @brancz @logicalhan @serathius @ehashman
/hold
to make sure we are all in agreement about this change
cc @kubernetes/sig-instrumentation-bugs

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 26, 2020
@k8s-ci-robot k8s-ci-robot added the sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. label Mar 26, 2020
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 26, 2020
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 26, 2020
@dashpole
Contributor Author

/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 26, 2020
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 26, 2020
Member

@logicalhan logicalhan left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 26, 2020
@serathius
Contributor

/lgtm
cc @x13n

Member

@ehashman ehashman left a comment

👍

@RainbowMango
Member

I'm the one who omitted the _total suffix, because per promlint, non-counter metrics should not have a "_total" suffix.
Sorry about that. @brancz, please confirm this change.
/lgtm
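
The naming rule cited above can be sketched as a small check. This is a simplified illustration of the convention, not promlint's actual implementation; `metricType` and `suffixProblem` are hypothetical names, and the messages are paraphrased.

```go
package main

import (
	"fmt"
	"strings"
)

// metricType is a hypothetical stand-in for a metric's declared type.
type metricType string

const (
	counter metricType = "counter"
	gauge   metricType = "gauge"
)

// suffixProblem returns a description of a "_total" naming problem,
// or an empty string when the name matches the declared type.
func suffixProblem(name string, t metricType) string {
	hasTotal := strings.HasSuffix(name, "_total")
	switch {
	case t != counter && hasTotal:
		return `non-counter metrics should not have "_total" suffix`
	case t == counter && !hasTotal:
		return `counter metrics should have "_total" suffix`
	default:
		return ""
	}
}

func main() {
	// Typed as a gauge, the "_total" name is flagged.
	fmt.Println(suffixProblem("node_cpu_usage_seconds_total", gauge))
	// Typed as a counter, as in this fix, the same name passes.
	fmt.Println(suffixProblem("node_cpu_usage_seconds_total", counter) == "")
}
```

Under this rule, dropping the suffix satisfied the check only because the metric was typed as a gauge; changing the type to counter makes the "_total" name the consistent one.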

@brancz
Member

brancz commented Mar 27, 2020

/lgtm
/approve

I think we have consensus now, @dashpole. Feel free to remove the hold label.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, dashpole

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@RainbowMango
Member

/hold cancel
We are all on the same page now.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 27, 2020
@ehashman
Member

/test pull-kubernetes-e2e-gce

@dashpole
Contributor Author

/test pull-kubernetes-e2e-gce

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
8 participants