Clarify descheduler handling of StatefulSets #519

Closed
1 of 3 tasks
lindhe opened this issue Mar 8, 2021 · 17 comments
Labels
kind/documentation: Categorizes issue or PR as related to documentation.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@lindhe

lindhe commented Mar 8, 2021

It's clear to me from the documentation that "Pods with local storage are never evicted (unless evictLocalStoragePods: true is set)", which makes sense: local storage indicates that the application is stateful and thus cannot be restarted as safely as a stateless app could. I would assume the same logic holds for pods in StatefulSets, but I find no documentation about that.

If my assumption is correct:

  • we should add that to the documentation.

If my assumption is incorrect:

  • let's clarify that in the documentation.
  • I suggest we reconsider that design choice.

If I'm missing something here, please fill me in.

@seanmalloy
Member

Based on the code block below, I believe the descheduler will consider StatefulSet pods for eviction.

// IsEvictable decides when a pod is evictable
func (ev *evictable) IsEvictable(pod *v1.Pod) bool {
    checkErrs := []error{}
    if IsCriticalPod(pod) {
        checkErrs = append(checkErrs, fmt.Errorf("pod is critical"))
    }
    ownerRefList := podutil.OwnerRef(pod)
    if IsDaemonsetPod(ownerRefList) {
        checkErrs = append(checkErrs, fmt.Errorf("pod is a DaemonSet pod"))
    }
    if len(ownerRefList) == 0 {
        checkErrs = append(checkErrs, fmt.Errorf("pod does not have any ownerrefs"))
    }
    if IsMirrorPod(pod) {
        checkErrs = append(checkErrs, fmt.Errorf("pod is a mirror pod"))
    }
    for _, c := range ev.constraints {
        if err := c(pod); err != nil {
            checkErrs = append(checkErrs, err)
        }
    }
    if len(checkErrs) > 0 && !HaveEvictAnnotation(pod) {
        klog.V(4).InfoS("Pod lacks an eviction annotation and fails the following checks", "pod", klog.KObj(pod), "checks", errors.NewAggregate(checkErrs).Error())
        return false
    }
    return true
}
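
Note that none of the checks above looks at StatefulSet ownership, so a pod owned by a StatefulSet passes them all. Purely as an illustration of what an opt-out could look like, a hypothetical helper mirroring the IsDaemonsetPod check might be written like this (IsStatefulSetPod does not exist in the descheduler code; metav1 is k8s.io/apimachinery/pkg/apis/meta/v1, the same OwnerReference type returned by podutil.OwnerRef):

// Hypothetical sketch only, not part of the descheduler codebase:
// reports whether any owner reference points at a StatefulSet.
func IsStatefulSetPod(ownerRefList []metav1.OwnerReference) bool {
    for _, ownerRef := range ownerRefList {
        if ownerRef.Kind == "StatefulSet" {
            return true
        }
    }
    return false
}

An extra branch appending to checkErrs when IsStatefulSetPod(ownerRefList) returns true would then exclude such pods, but no such option exists today.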

If I'm correct, then the docs here just need to be updated to indicate that StatefulSet pods are considered for eviction, in addition to RC, RS, Deployment, and Job pods.

If the ask is to add an option to include or exclude StatefulSet pods, then this would be a feature enhancement request. Keep in mind there are already many other ways to have the descheduler include or exclude pods.

https://github.com/kubernetes-sigs/descheduler#filter-pods

Also, starting with the upcoming descheduler v0.21.0 release there will be an ignorePvcPods option in addition to the current evictLocalStoragePods option.
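
For reference, here is a minimal sketch of how those two options sit at the top level of a v1alpha1 DeschedulerPolicy (the strategy shown is arbitrary; exact fields can differ between releases):

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
# Allow eviction of pods using local storage (false by default).
evictLocalStoragePods: true
# Skip pods with PersistentVolumeClaims (available starting with v0.21.0).
ignorePvcPods: true
strategies:
  "RemoveDuplicates":
    enabled: true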

@seanmalloy
Member

I created PR #520 to document the current behavior.

@lindhe, is updating the documentation good enough to close this issue, or do you want to discuss a feature for excluding StatefulSets further?

From my perspective, the descheduler already has enough options for choosing which pods are or are not considered for eviction, but we welcome feedback if the current functionality does not fit your needs.

@lindhe
Author

lindhe commented Mar 12, 2021

Cool, thanks a lot for the PR! Having it documented properly was the most important part in my view. I may continue the discussion to better understand why StatefulSets are not excluded by their nature, but that thread can surely continue without this issue staying open.

Let's mark this as closed when the PR is merged.

@seanmalloy
Member

/assign

@seanmalloy
Member

/kind documentation

@k8s-ci-robot added the kind/documentation label Mar 12, 2021
@seanmalloy
Member

seanmalloy commented Mar 23, 2021

Tasks to complete before closing this issue:

  • documentation
  • unit tests
  • e2e tests

@ingvagabund
Contributor

@seanmalloy do you still want to implement e2e for this?

@seanmalloy
Member

@seanmalloy do you still want to implement e2e for this?

@ingvagabund yes, I was going to wait until #535 is merged.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jul 14, 2021
@ingvagabund
Contributor

@seanmalloy The PR is merged
/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Jul 15, 2021
@seanmalloy
Member

FYI ... I'm still going to submit a PR to implement the e2e tests. I had to take a break for a bit.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Nov 18, 2021
@pravarag
Contributor

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Nov 18, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Feb 16, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Mar 18, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
