Improve LowNodeUtilization's consideration of evictable pods #529

damemi · 2021-03-17T16:11:01Z

Similar to how we check that a pod fits on other nodes before evicting it in strategies like Affinity/TopologySpread, it would be an improvement if LowNodeUtilization checked that a pod from an overutilized node can fit on the intended underutilized node before evicting it (i.e., checked for taints/affinity). This could improve the performance of the strategy and prevent needless evictions that would otherwise repeat on every run.
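
To make the idea concrete, here is a rough sketch (in Python rather than the descheduler's Go, purely for illustration). The `Node`, `Pod`, `fits` and `evictable_pods` names are hypothetical and not taken from the descheduler code base, and the fit rule is deliberately simplified to "the pod tolerates every taint key on the node".

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Node:
    name: str
    # Simplification: only taint keys are modelled, not values or effects.
    taints: Set[str] = field(default_factory=set)


@dataclass
class Pod:
    name: str
    # Simplification: the set of taint keys this pod tolerates.
    tolerated: Set[str] = field(default_factory=set)


def fits(pod: Pod, node: Node) -> bool:
    """Placeholder fit rule: the pod tolerates every taint on the node."""
    return node.taints <= pod.tolerated


def evictable_pods(
    pods_on_overutilized: List[Pod],
    underutilized: List[Node],
    fit: Callable[[Pod, Node], bool] = fits,
) -> List[Pod]:
    """Keep only pods that fit on at least one underutilized node.

    Pods that fit nowhere are skipped, so they are not evicted only to
    land back on the node they came from.
    """
    return [
        pod
        for pod in pods_on_overutilized
        if any(fit(pod, node) for node in underutilized)
    ]
```

With a single underutilized node tainted `gpu`, a pod that tolerates `gpu` is kept as an eviction candidate, while a pod with no tolerations is filtered out.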

RyanDevlin · 2021-04-14T23:57:10Z

I can try this one.

RyanDevlin · 2021-04-14T23:57:14Z

/assign

RyanDevlin · 2021-04-15T12:45:38Z

@damemi I've noticed that the Affinity and TopologySpread strategies check for affinity on other nodes, but don't necessarily check for taints. Is checking for taints on other nodes still something that should be included in this feature?

rustrial · 2021-04-21T07:18:13Z

@RyanDevlin I guess eviction only makes sense if the Pod can be scheduled on another node, therefore it seems reasonable that all scheduling constraints (Affinity, TopologySpread and Taints) should be checked. Otherwise we would just evict a Pod to see it being scheduled to the same node again, which just increases the risk of service outages caused by Pod eviction (restart).

RyanDevlin · 2021-04-21T12:29:56Z

@rustrial At the bottom of #551 @ingvagabund commented about how, for the purposes of this feature, it would be overkill to take into account every filtering plugin. I'm currently implementing this feature with checks for NodeAffinity, Taints, and NodeSelector. The optimization isn't perfect, but statistically it should improve performance.
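
For anyone following along, the three checks could look roughly like the sketch below. It is written in Python for brevity and is not the actual implementation. The field names mirror the Kubernetes API (`nodeSelector`, required node affinity terms, taints and tolerations), but the classes and function names are hypothetical. Two assumptions are baked in: only `NoSchedule` and `NoExecute` taints are treated as blocking, and the `Gt`/`Lt` affinity operators are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Taint:
    key: str
    effect: str  # "NoSchedule", "PreferNoSchedule" or "NoExecute"
    value: str = ""


@dataclass
class Toleration:
    key: str = ""            # empty key + "Exists" tolerates any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches any effect


@dataclass
class Requirement:
    key: str
    operator: str  # "In", "NotIn", "Exists" or "DoesNotExist"
    values: List[str] = field(default_factory=list)


@dataclass
class Node:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    taints: List[Taint] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    node_selector: Dict[str, str] = field(default_factory=dict)
    # Required node affinity: terms are ORed, requirements in a term are ANDed.
    affinity_terms: List[List[Requirement]] = field(default_factory=list)
    tolerations: List[Toleration] = field(default_factory=list)


def matches_node_selector(pod: Pod, node: Node) -> bool:
    return all(node.labels.get(k) == v for k, v in pod.node_selector.items())


def matches_requirement(req: Requirement, labels: Dict[str, str]) -> bool:
    if req.operator == "In":
        return labels.get(req.key) in req.values
    if req.operator == "NotIn":
        return labels.get(req.key) not in req.values
    if req.operator == "Exists":
        return req.key in labels
    if req.operator == "DoesNotExist":
        return req.key not in labels
    raise ValueError(f"operator not covered by this sketch: {req.operator}")


def matches_node_affinity(pod: Pod, node: Node) -> bool:
    if not pod.affinity_terms:
        return True  # no required affinity, nothing to check
    return any(
        bool(term) and all(matches_requirement(r, node.labels) for r in term)
        for term in pod.affinity_terms
    )


def tolerates(tol: Toleration, taint: Taint) -> bool:
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    return tol.operator == "Exists" or tol.value == taint.value


def tolerates_node_taints(pod: Pod, node: Node) -> bool:
    # Assumption: "PreferNoSchedule" is a soft preference and never blocks.
    hard = [t for t in node.taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in pod.tolerations) for t in hard)


def node_fit(pod: Pod, node: Node) -> bool:
    """True only if all three checks pass for this pod/node pair."""
    return (
        matches_node_selector(pod, node)
        and matches_node_affinity(pod, node)
        and tolerates_node_taints(pod, node)
    )
```

This covers only the three constraints named above. Resource requests, pod affinity and the other scheduler filters are ignored, which is why the result is an approximation of what the scheduler would decide.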

ingvagabund · 2021-04-21T14:29:53Z

Overkill from a code-level point of view: we would have to duplicate the code, since we can't import kubernetes/kubernetes code.

ingvagabund · 2021-04-21T14:34:15Z

> Otherwise we would just evict a Pod to see it being scheduled to the same node again, which just increases the risk of service outages caused by Pod eviction (restart).

Taking care of disruptions is the responsibility of the PDB.

damemi · 2021-04-21T14:41:19Z

> @RyanDevlin I guess eviction only makes sense if the Pod can be scheduled on another node

I mentioned this in the other thread (#551 (comment)), sorry for not bringing it up here, but there are use cases where a pod could be evicted regardless of whether it fits on another node. For strategies like PodLifetime, you might actually want to evict a pod that will only be recreated on the same node.
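
One way to accommodate both cases is to make the fit check opt-in per strategy. The sketch below (Python, for illustration only) assumes a hypothetical boolean `node_fit` setting on each strategy's config. The `StrategyConfig` and `should_evict` names are made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, TypeVar

P = TypeVar("P")  # pod type, left abstract in this sketch
N = TypeVar("N")  # node type, left abstract in this sketch


@dataclass
class StrategyConfig:
    name: str
    # Hypothetical opt-in flag: when False, pods are evicted whether or
    # not they fit on another node.
    node_fit: bool = False


def should_evict(
    strategy: StrategyConfig,
    pod: P,
    other_nodes: Iterable[N],
    fits: Callable[[P, N], bool],
) -> bool:
    """Apply the fit check only for strategies that opted in."""
    if not strategy.node_fit:
        return True
    return any(fits(pod, node) for node in other_nodes)
```

A strategy like PodLifetime would leave the flag off and keep evicting old pods unconditionally, while LowNodeUtilization could turn it on.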

fejta-bot · 2021-07-20T15:33:42Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

ingvagabund · 2021-07-21T08:45:06Z

/remove-lifecycle stale

k8s-triage-robot · 2021-10-19T09:37:32Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

pravarag · 2021-10-19T09:48:37Z

/remove-lifecycle stale

k8s-triage-robot · 2022-01-17T10:44:28Z

/lifecycle stale

ingvagabund · 2022-01-17T11:38:42Z

/remove-lifecycle stale

k8s-triage-robot · 2022-04-17T12:08:43Z

/lifecycle stale

jecnua · 2022-04-17T15:07:33Z

Still valid

damemi · 2022-04-18T14:22:54Z

Bumped the NodeFit PR to get that moving again
/remove-lifecycle stale

ingvagabund · 2022-05-04T07:56:33Z

Fixed in #790
/close

k8s-ci-robot · 2022-05-04T07:56:42Z

@ingvagabund: Closing this issue.

In response to this:

> Fixed in #790
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

damemi added the kind/feature label Mar 17, 2021

k8s-ci-robot assigned RyanDevlin Apr 14, 2021

damemi mentioned this issue Apr 15, 2021

RemoveDuplicates: take taints and node selector into account when computing the number of duplicates #551

Closed

RyanDevlin mentioned this issue Apr 30, 2021

Working nodeFit feature #559

Merged

k8s-ci-robot added the lifecycle/stale label Jul 20, 2021

k8s-ci-robot removed the lifecycle/stale label Jul 21, 2021

damemi mentioned this issue Aug 2, 2021

TopologySpreadConstraint should take target node's requests allocation into consideration before evicting #604

Closed

This was referenced Sep 30, 2021

Added request considerations to NodeFit Feature #635

Closed

Added request considerations to NodeFit Feature #636

Closed

k8s-ci-robot added the lifecycle/stale label Oct 19, 2021

k8s-ci-robot removed the lifecycle/stale label Oct 19, 2021

k8s-ci-robot added the lifecycle/stale label Jan 17, 2022

k8s-ci-robot removed the lifecycle/stale label Jan 17, 2022

k8s-ci-robot added the lifecycle/stale label Apr 17, 2022

k8s-ci-robot removed the lifecycle/stale label Apr 18, 2022

ingvagabund mentioned this issue May 4, 2022

Added request considerations to NodeFit Feature [#636 follow up] #790

Merged

k8s-ci-robot closed this as completed May 4, 2022
