added source label to ephemeral_storage_container_limit_percentage #97
The label is added in order to differentiate between "hard" and "soft" limits:
- "hard" comes from `pod.spec.containers.resources.limits["ephemeral-storage"]` (if set) and terminates the pod once reached
- "soft" comes from the node and may not necessarily cause an eviction of the pod in question
Motivation
To implement an alert rule à la KubePersistentVolumeFillingUp for a container like this:
Notice the middle part of the expression (`and on (pod_namespace, pod_name, exported_container) ( max_over_time( ... > 0.06)`), which makes the whole evaluation fragile, complicated and slow. It is needed to filter out all containers with `pod.spec.containers.resources.limits["ephemeral-storage"]` unset. Without it, a single node running out of ephemeral storage would generate too many instances of the alert (one for each container running on the node). That case should be covered by a separate alert such as `NodeOutOfEphemeralStorage`.
With this change in place, one could simplify the expression to:
I don't have a strong opinion on the new label's name or values. Instead of adding a new label, there could be a whole new metric (e.g. `ephemeral_storage_container_hard_limit_percentage`).

Please let me know what you think.