
Add 'namespace' parameter support to LowNodeUtilization strategy #391

Closed
korjek opened this issue Sep 2, 2020 · 36 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@korjek

korjek commented Sep 2, 2020

Hello.

I think it would be great if the 'namespace' parameter were supported by the LowNodeUtilization strategy.

@korjek korjek added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 2, 2020
@seanmalloy
Member

@korjek thanks for opening this issue. I'm assuming you are referring to the namespace filtering feature that was just added in the v0.19.0 release.

When the namespace filtering enhancement was initially discussed, it was determined that it would not make sense to enable it for all descheduler strategies. For example, for the LowNodeUtilization strategy to be effective, it needs to be able to process as many nodes as possible.

Can you provide any additional details on your use case?
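For reference, on the strategies where it is supported, namespace filtering is configured in the policy roughly as in the sketch below (based on my reading of the v0.19 docs; the strategy chosen and the namespace name are only placeholders):

```yaml
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsViolatingNodeTaints":
    enabled: true
    params:
      namespaces:
        include:
        - "batch-jobs"   # placeholder: only pods in this namespace are considered
```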

@korjek
Author

korjek commented Sep 4, 2020

@seanmalloy thank you for your reply. You are right, I meant namespace filtering.

What I got from the documentation is that namespace filtering is enabled for all strategies except LowNodeUtilization.
Our use case is that we have a separate namespace where pods that are idempotent to restart are running. The idea behind using namespace filtering with the LowNodeUtilization strategy is that the descheduler would evict pods only from that specific namespace.

Another solution I can think of for our use case: specify the lowest priority for all pods in that namespace and use thresholdPriorityClassName filtering. But there is a drawback: if we lose nodes and there is not enough capacity to schedule all pods across the cluster, we would want the pods in this namespace to be scheduled and some other pods to be preempted by kube-scheduler instead. So using the lowest priority is not a good idea in this case.

That's why we think it would be great to have namespace filtering support in the LowNodeUtilization strategy.
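For context, the thresholdPriorityClassName alternative mentioned above would look roughly like the sketch below; the PriorityClass name and the threshold numbers are placeholders, and the exact parameter layout is my assumption from the docs rather than something confirmed in this thread:

```yaml
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
    enabled: true
    params:
      # Hypothetical PriorityClass name; the intent is that only pods below this
      # priority remain evictable.
      thresholdPriorityClassName: "sacrificial"
      nodeResourceUtilizationThresholds:
        thresholds:
          "cpu": 20
          "memory": 20
        targetThresholds:
          "cpu": 50
          "memory": 50
```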

@seanmalloy
Member

@korjek are the pods running in this namespace running on a different set of nodes? Could you maybe filter using a node selector?
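For reference, the node-selector approach would restrict which nodes the descheduler processes at all; a minimal sketch, assuming the top-level nodeSelector policy field and a placeholder node label:

```yaml
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
# Placeholder label: only nodes carrying it are considered by the strategies below.
nodeSelector: "workload-pool=idempotent"
strategies:
  "LowNodeUtilization":
    enabled: true
    # thresholds omitted for brevity
```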

@korjek
Author

korjek commented Sep 5, 2020

@seanmalloy no, they are using the same node pool.

But what's the reason not to have namespace filtering for the LowNodeUtilization strategy?

@damemi
Contributor

damemi commented Sep 10, 2020

LowNodeUtilization needs to consider as many pods as possible to accurately calculate each node's resource consumption. Only considering a subset of these for eviction would hurt its effectiveness at balancing cluster resource usage.

Can you give some more details on what your use case is? Perhaps there is another approach with the scheduler/descheduler that would fit better.

@korjek
Author

korjek commented Sep 11, 2020

@damemi but why does the descheduler need to calculate a node's resource consumption if that info is available from the node's status?

How I see this: when the LowNodeUtilization strategy generates the list of pods for eviction, it should take namespace filtering into account (i.e. when evaluating pods for eviction, skip pods that are filtered out by the namespace setting). The rest of the logic shouldn't change.

Here is my use case: #391 (comment)

@damemi
Contributor

damemi commented Sep 14, 2020

If you limit the pods that are available for eviction with LowNodeUtilization, the strategy won't work very well. The point of it is to balance overall node usage, so unless the namespace you're restricting it to makes up a large portion of your cluster's consumption, there isn't much point in running this strategy against that namespace.

By "your use case" I meant: why do you specifically want to run LowNodeUtilization in one namespace? I understand that you want to restrict evictions to certain idempotent pods, but are you just trying to keep these pods spread out between nodes? Are they actually a large portion of your nodes' usage?

If we know your end goal with this strategy, we can better determine whether this is a real use case or whether something else (like Pod Topology Spreading or RemoveDuplicates) is more appropriate.

@ingvagabund
Contributor

ingvagabund commented Sep 16, 2020

> Our use case is that we have a separate namespace where pods that are idempotent to restart are running. The idea behind using namespace filtering with the LowNodeUtilization strategy is that the descheduler would evict pods only from that specific namespace.

IIUC, the specific namespace is like a bank of pods to be sacrificed when certain nodes start to become overutilized. Assuming what you are asking for is implemented, if no pod from the specific namespace runs on any of the overutilized nodes, the strategy will have no effect. More or less the same holds if only a small fraction of pods from the namespace runs on the overutilized nodes (debatable, and depending on the particular use case).

I can see how it aligns with the LowNodeUtilization strategy, though unless the bank of pods is concentrated on the overutilized nodes, I don't see much benefit. I can see a case where each of a set of nodes runs a main application, and each main application concentrates a bank of workers (in the specific namespace) on the same node. If the worker pods start to eat too much CPU/memory (upper-bounded by resource limits), nodes can get overutilized, in which case it makes sense to reschedule/evict only some of the worker pods from the bank. @korjek Is this use case close enough?

@korjek
Author

korjek commented Sep 23, 2020

@damemi
Yes, these pods are the biggest portion of node usage (~90% of the whole cluster).
Of course I can go and update the PriorityClass for pods in this namespace, or for pods in other namespaces (or both at the same time), and use the PriorityClass filter. But namespace filtering requires less effort in this case.
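For completeness, the PriorityClass-based workaround described above would mean creating something like the manifest below and assigning it to the pods in that namespace (a sketch; the name and value are placeholders, and as noted earlier a very low value interacts badly with scheduler preemption):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: sacrificial          # placeholder name
value: 1000                  # placeholder value, lower than the other workloads' classes
globalDefault: false
description: "Pods that the descheduler may evict first."
```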

@korjek
Author

korjek commented Sep 23, 2020

> IIUC, the specific namespace is like a bank of pods to be sacrificed when certain nodes start to become overutilized. Assuming what you are asking for is implemented, if no pod from the specific namespace runs on any of the overutilized nodes, the strategy will have no effect. More or less the same holds if only a small fraction of pods from the namespace runs on the overutilized nodes (debatable, and depending on the particular use case).

But the same applies to PriorityClass filtering: if all pods except a small number use a higher PriorityClass than the one specified in the PriorityClass filter, the strategy will have no effect either.

> I can see how it aligns with the LowNodeUtilization strategy, though unless the bank of pods is concentrated on the overutilized nodes, I don't see much benefit. I can see a case where each of a set of nodes runs a main application, and each main application concentrates a bank of workers (in the specific namespace) on the same node. If the worker pods start to eat too much CPU/memory (upper-bounded by resource limits), nodes can get overutilized, in which case it makes sense to reschedule/evict only some of the worker pods from the bank. @korjek Is this use case close enough?

Almost. The main load (~90%) is caused by pods running in one namespace, so once AWS reclaims a node, the other nodes become really overloaded.
As I replied to @damemi, I can handle my case with the PriorityClass filter, but it will require more changes on my side.

@damemi
Contributor

damemi commented Sep 24, 2020

> The main load (~90%) is caused by pods running in one namespace, so once AWS reclaims a node, the other nodes become really overloaded.

This is the key here, and I think it's a legitimate use case. Ultimately there isn't much "risk" in enabling namespace filtering for this strategy; the only risk is that, if used improperly, it will hurt the strategy's effectiveness (or maybe cause it to hot-loop, running every iteration with no results, though we could probably get around that by only listing the viable pods when calculating usage). If we do enable this, those outcomes should be clearly noted in the docs, along with the fact that running this strategy with namespace filtering is not normally recommended.

@seanmalloy @ingvagabund wdyt?

@seanmalloy
Member

> > The main load (~90%) is caused by pods running in one namespace, so once AWS reclaims a node, the other nodes become really overloaded.
>
> This is the key here, and I think it's a legitimate use case. Ultimately there isn't much "risk" in enabling namespace filtering for this strategy; the only risk is that, if used improperly, it will hurt the strategy's effectiveness (or maybe cause it to hot-loop, running every iteration with no results, though we could probably get around that by only listing the viable pods when calculating usage). If we do enable this, those outcomes should be clearly noted in the docs, along with the fact that running this strategy with namespace filtering is not normally recommended.
>
> @seanmalloy @ingvagabund wdyt?

@damemi I'm fine with this as long as it is clearly documented.

@ingvagabund
Contributor

ingvagabund commented Sep 27, 2020

The semantics of filtering namespaces through the Namespaces field here is different from the other cases. The expected semantics is: filter pods first, then run the strategy. In this case the strategy runs first and pods are filtered afterwards. I am inclined to introduce a new field, e.g. EvictableNamespaces, to make the difference explicit and to have it only support including namespaces, without excluding.
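To make the difference concrete, a purely illustrative sketch (EvictableNamespaces is only a proposal here, not an existing field; names and numbers are placeholders): for other strategies the namespace list narrows which pods the strategy sees at all, while for LowNodeUtilization node usage would still be computed over all pods and the list would only limit which pods may be evicted.

```yaml
strategies:
  # Existing semantics: pods are filtered by namespace first, then the strategy runs.
  "RemovePodsHavingTooManyRestarts":
    enabled: true
    params:
      podsHavingTooManyRestarts:
        podRestartThreshold: 100
      namespaces:
        include:
        - "team-a"                      # placeholder
  # Proposed semantics: usage is computed from all pods; only eviction is restricted.
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        thresholds:
          "cpu": 20
        targetThresholds:
          "cpu": 50
      evictableNamespaces:              # hypothetical field, as proposed in this comment
        include:
        - "sacrificial-workloads"       # placeholder
```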

@damemi
Contributor

damemi commented Sep 28, 2020

I'm not sure this requires a new field; it would probably be sufficient to document the difference, because it's just for one strategy.

@ingvagabund
Contributor

ingvagabund commented Sep 29, 2020

> because it's just for one strategy.

Two, once the HighNodeUtilization strategy gets implemented.

@lixiang233
Contributor

This strategy is different because it is a strategy for nodes. I think the Namespaces parameter in this strategy can only be understood as EvictableNamespaces (so we don't need a new parameter): a node's utilization is a property of the node, and filtering pods shouldn't change that property, so the Namespaces parameter here can only be used to limit the scope of evictable pods. We can mention this in the README to prevent misunderstanding.

@ingvagabund
Contributor

Nobody reads documentation unless it's really needed. I won't argue against reusing the Namespaces field; I just wanted to voice my concerns.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 29, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 28, 2021
@lixiang233
Contributor

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 28, 2021
@lindhe

lindhe commented Mar 2, 2021

I was fooled by this too. I assumed the filter would let me keep some namespaces untouched by the descheduler, but that's clearly not the case.

@damemi
Contributor

damemi commented Mar 2, 2021

> I was fooled by this too. I assumed the filter would let me keep some namespaces untouched by the descheduler, but that's clearly not the case.

I think that adding a top-level setting for excluding namespaces is a valid request. In OpenShift, we currently have our operator manually append system namespaces to each strategy's exclude list, and a cluster-wide setting would be more convenient.

I also think that excluding namespaces from LowNodeUtilization has (a) more utility and (b) less risk than an includeNamespaces option for it would. So I have no problem with excluding namespaces from LNU, with a clear note that using it too broadly can have negative effects on the strategy.
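A sketch of what exclude-only filtering for LowNodeUtilization could look like if it were enabled (illustrative only; whether and how this gets exposed is exactly what this issue is discussing, and the namespace names and thresholds are placeholders):

```yaml
strategies:
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        thresholds:
          "cpu": 20
          "memory": 20
        targetThresholds:
          "cpu": 50
          "memory": 50
      namespaces:
        exclude:
        - "kube-system"               # keep system pods off the eviction candidate list
        - "openshift-monitoring"      # placeholder
```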

@eagle9527

If LowNodeUtilization could filter by namespace, it would not affect applications in the other namespaces. If it is applied across the entire cluster, production applications may become unavailable.

@ingvagabund
Contributor

ingvagabund commented Mar 4, 2021

+1 for allowing only a list of excluded namespaces, while clearly expressing the risks of misusing it and stressing that the setting applies to the eviction phase only.

The strategy evicts pods in order to move them from overutilized nodes to underutilized ones, ordering pods from the lowest priority to the highest. If the list of excluded namespaces is configured improperly, some of the lowest-priority pods might stay while pods with higher priority get evicted instead. That still improves resource usage across nodes, though without always respecting the priority. The description of the LowNodeUtilization strategy does not mention that priority ordering is respected during eviction; it's worth documenting what behavior each strategy actually guarantees. https://github.com/kubernetes-sigs/descheduler#pod-evictions does mention the ordering, though.

It might be worth adding another do-not-respect-priority-ordering-like param to the strategy, so anyone who wants to use the list of excluded namespaces has to set that param and accept that pod priority may not be respected, which would otherwise violate the strategy's guarantees. Unfortunately, we don't have a notion of experimental features here, i.e. marking some params as alpha/experimental.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 2, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 2, 2021
@k8s-triage-robot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@korjek
Author

korjek commented Aug 2, 2021

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Aug 2, 2021
@k8s-ci-robot
Contributor

@korjek: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@seanmalloy
Member

/reopen

@k8s-ci-robot
Contributor

@seanmalloy: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Oct 1, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
