From 4fa924be085de8cbb89f172bee69b811ae571174 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Filip=20K=C5=99epinsk=C3=BD?=
Date: Thu, 8 Dec 2022 01:17:51 +0100
Subject: [PATCH] Add Eviction policy for unhealthy Pods guarded by PDBs blog post

---
 ...6-unhealthy-pod-eviction-policy-for-pdb.md | 95 +++++++++++++++++++
 1 file changed, 95 insertions(+)
 create mode 100644 content/en/blog/_posts/2023-01-06-unhealthy-pod-eviction-policy-for-pdb.md

diff --git a/content/en/blog/_posts/2023-01-06-unhealthy-pod-eviction-policy-for-pdb.md b/content/en/blog/_posts/2023-01-06-unhealthy-pod-eviction-policy-for-pdb.md
new file mode 100644
index 0000000000000..eb3e0a6fe69b7
--- /dev/null
+++ b/content/en/blog/_posts/2023-01-06-unhealthy-pod-eviction-policy-for-pdb.md
@@ -0,0 +1,95 @@
+---
+layout: blog
+title: "Kubernetes 1.26: Eviction policy for unhealthy pods guarded by PodDisruptionBudgets"
+date: 2023-01-06
+slug: "unhealthy-pod-eviction-policy-for-pdbs"
+---
+
+**Authors:** Filip Křepinský (Red Hat), Morten Torkildsen (Google), Ravi Gudimetla (Apple)
+
+
+Ensuring that disruptions to your application do not affect its availability isn't a simple
+task. The new _unhealthy pod eviction policy_ for [PodDisruptionBudgets](/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) (PDBs)
+is meant to make it easier.
+In this article, we take a closer look at the changes made to PDBs to
+give application owners greater flexibility in managing disruptions.
+
+## What problems does this solve?
+
+Eviction of pods respects PodDisruptionBudgets (PDBs). This means that the eviction of a pod
+should not disrupt a guarded application, and `.status.currentHealthy` of a PDB should not fall
+below `.status.desiredHealthy`. Running pods that are [Unhealthy](/docs/tasks/run-application/configure-pdb/#healthiness-of-a-pod)
+do not count towards the PDB status, but they can only be evicted if the application
+is not disrupted. This helps a disrupted or not yet started application achieve availability
+as soon as possible, without the additional downtime that evictions would cause.
+
+Unfortunately, this poses a problem for cluster administrators who would like to drain nodes
+without any manual intervention. Misbehaving applications with pods in the `CrashLoopBackOff`
+state (due to a bug or misconfiguration), or pods that are simply failing to become ready,
+make this task much harder.
+
+On the other hand, there are users who depend on the existing behavior in order to:
+- prevent data loss that would be caused by deleting pods that are guarding an underlying resource or storage
+- achieve the best availability possible for their application
+
+Kubernetes 1.26 introduced a new experimental field to the PodDisruptionBudget API: `.spec.unhealthyPodEvictionPolicy`.
+When enabled, this field lets you support both of those requirements.
+
+## How does it work?
+
+There are two policies to choose from: `IfHealthyBudget` and `AlwaysAllow`.
+The former, `IfHealthyBudget`, follows the existing behavior to achieve the best availability;
+it is what you get by default.
+We think that `AlwaysAllow` will often be a better choice, but for some critical workloads you may
+still prefer to protect even unhealthy Pods from node drains or other forms of API-initiated
+eviction.
+
+By setting the `.spec.unhealthyPodEvictionPolicy` field of your PDB to `AlwaysAllow`,
+you are choosing best-effort availability for your application.
+This makes it easier to maintain and upgrade your clusters.
+
+The Eviction API considers these policies whenever eviction of a pod that is guarded by a PDB is requested.
+
+
+## How do I use it?
+
+This is an alpha feature, which means you have to turn on the `PDBUnhealthyPodEvictionPolicy`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+before you can use it, by passing the command line argument `--feature-gates=PDBUnhealthyPodEvictionPolicy=true`
+to the kube-apiserver.
+
+Here's an example. Assume that you've enabled the feature gate in your cluster, and that you
+have already defined a Deployment that runs a plain webserver. You labelled the Pods for that
+Deployment with `app: nginx`.
+You want to limit avoidable disruption, and you know that best-effort availability is
+sufficient for this app.
+You decide to allow evictions even if those webserver pods are unhealthy.
+You create a PDB to guard this application, with the `AlwaysAllow` policy for evicting
+unhealthy pods:
+
+```yaml
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: nginx-pdb
+spec:
+  selector:
+    matchLabels:
+      app: nginx
+  maxUnavailable: 1
+  unhealthyPodEvictionPolicy: AlwaysAllow
+```
+
+
+## How can I learn more?
+
+
+- Read the KEP: [Unhealthy Pod Eviction Policy for PDBs](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3017-pod-healthy-policy-for-pdb)
+- Read the documentation: [Unhealthy Pod Eviction Policy](/docs/tasks/run-application/configure-pdb/#unhealthy-pod-eviction-policy) for PodDisruptionBudgets
+- Review the Kubernetes documentation for [PodDisruptionBudgets](/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets), [Draining of Nodes](/docs/tasks/administer-cluster/safely-drain-node/) and [Eviction](/docs/concepts/scheduling-eviction/api-eviction/)
+
+
+## How do I get involved?
+
+If you have any feedback, please reach out to us in the [#sig-apps](https://kubernetes.slack.com/archives/C18NZM5K9) channel on Slack (visit https://slack.k8s.io/ for an invitation if you need one), or on the SIG Apps mailing list: kubernetes-sig-apps@googlegroups.com
+