Node CPU/Memory metrics differ from kubectl describe #27262

Closed

jefchien opened this issue Sep 28, 2023 · 3 comments

@jefchien (Contributor)

Component(s)

receiver/awscontainerinsight

What happened?

Description

In certain scenarios where terminated pods are not cleaned up, the `node_cpu_request` and `node_cpu_reserved_capacity` metrics do not match the output of `kubectl describe node <node_name>`. The root cause is a difference in how `node_cpu_request` is calculated (`node_cpu_reserved_capacity` is derived from it). In the receiver, the CPU request is an aggregate over all pods on the node, regardless of phase:

tmpCPUReq, _ := getResourceSettingForPod(&pod, p.nodeInfo.getCPUCapacity(), cpuKey, getRequestForContainer)
cpuRequest += tmpCPUReq
tmpMemReq, _ := getResourceSettingForPod(&pod, p.nodeInfo.getMemCapacity(), memoryKey, getRequestForContainer)
memRequest += tmpMemReq

Whereas `kubectl describe node` filters out terminated pods:

fieldSelector, err := fields.ParseSelector("spec.nodeName=" + name + ",status.phase!=" + string(corev1.PodSucceeded) + ",status.phase!=" + string(corev1.PodFailed))

https://github.com/kubernetes/kubectl/blob/302f330c8712e717ee45bbeff27e1d3008da9f00/pkg/describe/describe.go#L3624

The same discrepancy exists for the memory request metric.
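
For comparison, here is a minimal, self-contained sketch of the kubectl-style aggregation. It is not the receiver's actual code (it uses the upstream k8s.io/api types directly instead of the internal getResourceSettingForPod helper), but it illustrates how skipping pods in the Succeeded or Failed phase keeps terminated pods out of the node totals:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// sumNonTerminatedCPURequests adds up container CPU requests (in millicores)
// for all pods on a node, skipping pods in a terminal phase the same way
// kubectl describe node does.
func sumNonTerminatedCPURequests(pods []corev1.Pod) int64 {
	var cpuRequest int64
	for _, pod := range pods {
		if pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed {
			continue // terminated pods no longer hold their requested resources
		}
		for _, container := range pod.Spec.Containers {
			if req, ok := container.Resources.Requests[corev1.ResourceCPU]; ok {
				cpuRequest += req.MilliValue()
			}
		}
	}
	return cpuRequest
}

func main() {
	running := corev1.Pod{
		Spec: corev1.PodSpec{Containers: []corev1.Container{{
			Resources: corev1.ResourceRequirements{Requests: corev1.ResourceList{
				corev1.ResourceCPU: resource.MustParse("500m"),
			}},
		}}},
		Status: corev1.PodStatus{Phase: corev1.PodRunning},
	}
	completed := running // same spec, but already finished
	completed.Status.Phase = corev1.PodSucceeded

	// Only the running pod is counted: prints 500.
	fmt.Println(sumNonTerminatedCPURequests([]corev1.Pod{running, completed}))
}

The receiver, by contrast, currently sums the requests of every pod it tracks, so a pod that has Completed (or Failed) but has not been deleted keeps contributing to `node_cpu_request` indefinitely.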

Steps to Reproduce

Create a k8s cluster and run `kubectl apply -f cpu-test.yaml`, where `cpu-test.yaml` is:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-test
  namespace: default
spec:
  containers:
  - name: cpu-test
    image: progrium/stress
    resources:
      requests:
        cpu: "0.5"
      limits:
        cpu: "1"
    args: ["--cpu", "2", "--timeout", "60"]
  restartPolicy: Never

This creates a pod that requests CPU, runs for 60 seconds, and then terminates.
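
As an optional cross-check (not part of the original report), the same field selector that `kubectl describe node` builds can be applied through client-go to list only the non-terminated pods on the node. A sketch, assuming a local kubeconfig and a placeholder node name:

package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	nodeName := "ip-192-168-0-1.ec2.internal" // placeholder; substitute your node name

	// Load credentials from the default kubeconfig location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// The same selector kubectl describe node uses: pods scheduled on the
	// node whose phase is neither Succeeded nor Failed.
	selector := fmt.Sprintf("spec.nodeName=%s,status.phase!=%s,status.phase!=%s",
		nodeName, corev1.PodSucceeded, corev1.PodFailed)

	pods, err := clientset.CoreV1().Pods("").List(context.TODO(),
		metav1.ListOptions{FieldSelector: selector})
	if err != nil {
		panic(err)
	}
	for _, pod := range pods.Items {
		fmt.Printf("%s/%s\t%s\n", pod.Namespace, pod.Name, pod.Status.Phase)
	}
}

Once cpu-test has completed, it should no longer appear in this list, even though kubectl get pods still shows it as Completed until it is deleted.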

With kubectl describe node <node-name>:

While the cpu-test is Running

Non-terminated Pods:          (7 in total)
  Namespace                   Name                             CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                             ------------  ----------  ---------------  -------------  ---
  amazon-cloudwatch           cloudwatch-agent-kqdcl           200m (10%)    200m (10%)  200Mi (6%)       200Mi (6%)     3h38m
  default                     busybox                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         135m
  default                     cpu-test                         500m (25%)    1 (51%)     0 (0%)           0 (0%)         2m4s
  default                     memory-test                      0 (0%)        0 (0%)      100Mi (3%)       200Mi (6%)     11m
  kube-system                 aws-node-m58l6                   25m (1%)      0 (0%)      0 (0%)           0 (0%)         4h34m
  kube-system                 kube-proxy-c7p5v                 100m (5%)     0 (0%)      0 (0%)           0 (0%)         4h34m
  kube-system                 metrics-server-5b4fc487-25ckg    100m (5%)     0 (0%)      200Mi (6%)       0 (0%)         157m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         925m (47%)   1200m (62%)
  memory                      500Mi (15%)  400Mi (12%)

After the cpu-test has Completed

Non-terminated Pods:          (6 in total)
  Namespace                   Name                             CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                             ------------  ----------  ---------------  -------------  ---
  amazon-cloudwatch           cloudwatch-agent-kqdcl           200m (10%)    200m (10%)  200Mi (6%)       200Mi (6%)     3h39m
  default                     busybox                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         136m
  default                     memory-test                      0 (0%)        0 (0%)      100Mi (3%)       200Mi (6%)     12m
  kube-system                 aws-node-m58l6                   25m (1%)      0 (0%)      0 (0%)           0 (0%)         4h35m
  kube-system                 kube-proxy-c7p5v                 100m (5%)     0 (0%)      0 (0%)           0 (0%)         4h35m
  kube-system                 metrics-server-5b4fc487-25ckg    100m (5%)     0 (0%)      200Mi (6%)       0 (0%)         157m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         425m (22%)   200m (10%)
  memory                      500Mi (15%)  400Mi (12%)

Expected Result

The expectation is that the metrics would match kubectl and drop back down to 425m / 22% after the pod has completed.

Actual Result

In this case, the metrics remain at 925m / 47%:

    "node_cpu_request": 925,
    "node_cpu_reserved_capacity": 46.25,


Collector version

v0.77.0

Environment information

Environment

OS: AL2
Compiler (if manually compiled): go 1.20

Basic EKS cluster created with eksctl.

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

jefchien added the bug and needs triage labels on Sep 28, 2023
@github-actions (Contributor)

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@jefchien (Contributor, Author)

Please assign this to me.

@github-actions (Contributor)

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Nov 28, 2023
TylerHelmuth pushed a commit that referenced this issue Dec 8, 2023
…st metrics. (#27299)

**Description:** The `node_<cpu|memory>_request` metrics and metrics
derived from them (`node_<cpu|memory>_reserved_capacity`) differ from
the output of `kubectl describe node <node_name>`. This is because
kubectl [filters out terminated
pods](https://github.com/kubernetes/kubectl/blob/302f330c8712e717ee45bbeff27e1d3008da9f00/pkg/describe/describe.go#L3624).
See linked issue for more details.

Adds a filter for terminated (succeeded/failed state) pods. 

**Link to tracking Issue:**
#27262

**Testing:** Added unit test to validate pod state filtering. Built and
deployed changes to cluster. Deployed `cpu-test` pod.


![image](https://github.com/amazon-contributing/opentelemetry-collector-contrib/assets/84729962/b557be2d-e14e-428a-895a-761f7724d9bd)


The gap is when the change was deployed. The metric drops after the
deployment because of the filter. It spikes while the `cpu-test` pod is
running (~19:15) and then returns to the previous request size once the
pod has terminated.

**Documentation:** N/A