
added source label to ephemeral_storage_container_limit_percentage #97

Merged
merged 2 commits into from
Jul 2, 2024

Conversation

miminar
Contributor

@miminar miminar commented Jul 2, 2024

In order to differentiate between "hard" and "soft" limits:

A "hard" limit comes from `pod.spec.containers.resources.limits["ephemeral-storage"]` (if set) and terminates the pod once reached.

A "soft" limit comes from the node and may not necessarily cause an eviction of the pod in question.
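For reference, a minimal pod-spec sketch (names are illustrative) showing where the "hard" limit is declared:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example            # illustrative name
spec:
  containers:
    - name: app            # illustrative name
      image: busybox
      resources:
        limits:
          # the "hard" limit: once exceeded, the kubelet evicts the pod
          ephemeral-storage: 1Gi
```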

Motivation

To implement an alert rule à la KubePersistentVolumeFillingUp for a container like this:

        - alert: ContainerEphemeralStorageUsageAtLimit
          annotations:
            description: >-
              {{ `Ephemeral storage usage of pod/container {{ $labels.pod_name
              }}/{{ $labels.exported_container }} in Namespace {{
              $labels.pod_namespace }} on Node {{ $labels.node_name }} {{ with
              $labels.cluster -}} on Cluster {{ . }} {{- end }} is at {{ $value
              }}% of the limit.` }}
            summary: Container ephemeral storage usage is at the limit.
          expr: |-2
            ( ( max by (node_name, pod_namespace, pod_name, exported_container)
                       (avg_over_time(ephemeral_storage_container_limit_percentage[5m]))
              > 85.0)
            # ignore soft-limits (containers without an ephemeral-storage limit set
            # will report ephemeral_storage_node_percentage as their limit)
            and on (pod_namespace, pod_name, exported_container)
                   ( max_over_time(
                     ( abs(ephemeral_storage_container_limit_percentage
                     - on (node_name) group_left () ephemeral_storage_node_percentage))[5m:30s])
                   # there can be a slight deviation between the metric
                   # reported for a container and the node
                   > 0.06)
            )
            # ignore pods that haven't been running for some time (e.g. completed jobs)
            unless on (pod_namespace, pod_name)
                      ( (label_replace(label_replace(
                           sum by (namespace, pod)
                                  (max_over_time(kube_pod_status_phase{phase="Running"}[1m])),
                           "pod_namespace", "$1", "namespace", "(.*)"),
                           "pod_name", "$1", "pod", "(.*)"))
                      == 0)
          for: 1m
          labels:
            severity: warning

Notice the middle part of the expression (`and on (pod_namespace, pod_name, exported_container) (max_over_time(...) > 0.06)`), which makes the whole evaluation fragile, complicated, and slow.

It is needed to filter out all containers with `pod.spec.containers.resources.limits["ephemeral-storage"]` unset.

Without it, too many instances of the alert would be generated for a single node running out of ephemeral storage (one for each container running on the node). For that case, there should be a separate alert such as NodeOutOfEphemeralStorage.
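A sketch of what such a node-level alert might look like (the threshold, `for` duration, and wording are assumptions; it reuses the `ephemeral_storage_node_percentage` metric from the expression above):

```yaml
        - alert: NodeOutOfEphemeralStorage
          annotations:
            summary: Node ephemeral storage usage is close to the limit.
          expr: |-
            max by (node_name)
                   (avg_over_time(ephemeral_storage_node_percentage[5m]))
            > 85.0
          for: 5m
          labels:
            severity: warning
```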

With this change in place, one could simplify the expression to:

            ( max by (node_name, pod_namespace, pod_name, exported_container)
                       (avg_over_time(ephemeral_storage_container_limit_percentage{source="container"}[5m]))
            > 85.0)
            # ignore pods that haven't been running for some time (e.g. completed jobs)
            unless on (pod_namespace, pod_name)
                      ( (label_replace(label_replace(
                           sum by (namespace, pod)
                                  (max_over_time(kube_pod_status_phase{phase="Running"}[1m])),
                           "pod_namespace", "$1", "namespace", "(.*)"),
                           "pod_name", "$1", "pod", "(.*)"))
                      == 0)

I don't have a strong opinion on the new label name/values. Instead of adding a new label, a whole new metric could be introduced instead (e.g. `ephemeral_storage_container_hard_limit_percentage`).

Please let me know what you think.

in order to differentiate between "hard" and "soft" limits

"hard" comes from pod.spec.containers.resources.limits (if set) and
terminates the pod once reached

"soft" comes from the node and may not necessarily cause an eviction of
the pod in question

Signed-off-by: Michal Minář <michal.minar@id.ethz.ch>
Signed-off-by: Michal Minář <michal.minar@id.ethz.ch>
@jmcgrath207 jmcgrath207 self-requested a review July 2, 2024 17:44
@jmcgrath207
Owner

Thanks @miminar. I agree. After reflecting on this, I think it would be best to go with this PR, since I have already set the expectations for this metric. As for ephemeral_storage_container_{hard,soft}_limit_percentage, I like that for V2, but I don't want to break the default/expected behavior before then.

I do have slight reservations about the high-cardinality risk, but we do have relabeling the user can use, and at worst I can add an option flag to disable the source label by default.
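If the cardinality ever becomes a problem, the new `source` label can already be dropped with a standard Prometheus metric relabeling rule; a sketch (job name and target are illustrative):

```yaml
scrape_configs:
  - job_name: ephemeral-storage-metrics   # illustrative job name
    static_configs:
      - targets: ["localhost:9100"]       # illustrative target
    metric_relabel_configs:
      # drop the `source` label from all series scraped from this job
      - action: labeldrop
        regex: source
```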

Great job! If you want to contribute your PrometheusRule after this I am open to PRs.

Thanks!

@jmcgrath207 jmcgrath207 merged commit 115e03c into jmcgrath207:master Jul 2, 2024
1 check passed
@jmcgrath207
Owner

@miminar This is now released in 1.11.0

@miminar
Contributor Author

miminar commented Jul 3, 2024

Thank you! It helps a lot.
