Rollout workload when disruption budget prevents pod eviction #3808

Open
wants to merge 1 commit into
base: release/v2
from

Conversation

thallgren
Member

@thallgren thallgren commented Mar 3, 2025

A pod disruption budget can prevent eviction, because Kubernetes does not start a replacement pod until after the eviction has taken place. This commit introduces three ways to mitigate this problem.

  1. When several pods are to be evicted and some of the evictions have succeeded, a good strategy is to wait until the evicted pods have been recreated, and then continue evicting the others.
  2. For a deployment or argo-rollout, applying a patch with a restart annotation causes all pods to be redeployed in a manner that doesn't violate the budget.
  3. For replicasets and statefulsets, the approach is to scale up above the budget threshold, evict the pod, and then scale back down again.

Signed-off-by: Thomas Hallgren <thomas@tada.se>
@thallgren thallgren added the ok to test label Mar 3, 2025
@thallgren thallgren requested review from P0lip, FuYu3699 and njayp March 3, 2025 15:22
@github-actions github-actions bot removed the ok to test label Mar 3, 2025