
RemovePodsViolatingNodeTaints policy not working with --descheduling-interval option #245

Closed
nshekhar221 opened this issue Feb 28, 2020 · 9 comments · Fixed by #249
Labels: kind/bug

Comments


nshekhar221 commented Feb 28, 2020

While running the descheduler with the RemovePodsViolatingNodeTaints policy and --descheduling-interval set to 5m, we are observing that the descheduler caches the nodes' status/taints on its first run and that this cache is not refreshed in subsequent runs.

Because of this, any changes made to node taints after the descheduler's first run are not picked up, and pods are not evicted from the affected nodes.

@nshekhar221
Author

Some logs for reference:

I0228 06:10:30.925414 1 reflector.go:432] pkg/mod/k8s.io/client-go@v0.17.0/tools/cache/reflector.go:108: Watch close - *v1.Node total 51 items received
I0228 06:10:39.869716 1 reflector.go:278] pkg/mod/k8s.io/client-go@v0.17.0/tools/cache/reflector.go:108: forcing resync
I0228 06:10:53.968840 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000g"
I0228 06:10:53.986100 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000i"
I0228 06:10:54.070078 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000e"
I0228 06:10:54.081994 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000h"
I0228 06:10:54.095838 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000f"
I0228 06:10:54.166821 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000j"
I0228 06:11:54.185717 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000g"
I0228 06:11:54.204834 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000i"
I0228 06:11:54.222188 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000e"
I0228 06:11:54.266629 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000h"
I0228 06:11:54.279198 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000f"
I0228 06:11:54.290507 1 node_taint.go:48] Processing node: "vmss-agent-worker-nshekhartest-tjmrw00000j"

$ kubectl describe node vmss-agent-worker-nshekhartest-tjmrw00000i | grep Taint
Taints: node.kubernetes.io/network-unavailable:NoSchedule

The taint node.kubernetes.io/network-unavailable:NoSchedule was added to node vmss-agent-worker-nshekhartest-tjmrw00000i after the descheduler started, and as the logs show, that change is not picked up by the descheduler in its subsequent runs.

nshekhar221 changed the title from "RemovePodsViolatingNodeAffinity policy not working with --descheduling-interval option" to "RemovePodsViolatingNodeTaints policy not working with --descheduling-interval option" on Feb 28, 2020
@seanmalloy
Member

/kind bug

k8s-ci-robot added the kind/bug label on Feb 29, 2020

damemi commented Mar 2, 2020

This looks like a valid bug: when the descheduler starts up we load the list of nodes once, and that same list is then passed into the strategies on every loop. I imagine similar bugs can affect the other strategies because of this.

Perhaps it would be better to move the code that loads the nodes into the wait loop, to make sure we have a fresh list every time. Or maybe a better option would be to use an informer with pointers to this node list, so that it is only updated when it needs to be.
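
For illustration, here is a minimal sketch of the informer-based idea, not the descheduler's actual code (the package, function, and variable names are made up): a shared node informer keeps a cached lister current via watch events, and every descheduling iteration reads the node list fresh from it.

```go
// Illustrative only: a node lister backed by a shared informer, re-read on
// every descheduling iteration so taints added after startup are visible.
package sketch

import (
	"time"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

func runWithNodeInformer(client kubernetes.Interface, interval time.Duration, stopCh <-chan struct{}) {
	factory := informers.NewSharedInformerFactory(client, 0)
	nodeLister := factory.Core().V1().Nodes().Lister()
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	wait.Until(func() {
		// The lister reads from the informer cache, which is updated by watch
		// events, so each iteration sees the nodes' current taints.
		nodes, err := nodeLister.List(labels.Everything())
		if err != nil {
			return
		}
		_ = nodes // hand the fresh node list to each enabled strategy here
	}, interval, stopCh)
}
```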


aveshagarwal commented Mar 2, 2020

The reason the list was fetched just once is the original design, where the descheduler was supposed to run just once, as a Job or CronJob.

With the introduction of the interval option, the list needs to be fetched inside the loop for each new iteration, so that every iteration works from a fresh view of the nodes.

We should also make sure that any changes in this regard do not break the scenarios where the interval option is not being used.
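
Purely as a sketch of that direction (again not the actual descheduler code; names are illustrative, and the List call uses the context-aware signature of newer client-go releases): the node list is fetched inside each iteration, while a zero interval keeps the original run-once behaviour.

```go
// Illustrative only: fetch nodes inside every iteration; when no
// --descheduling-interval is set (interval == 0), run a single pass as before.
package sketch

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

func runStrategies(client kubernetes.Interface, interval time.Duration, stopCh <-chan struct{}) {
	oneIteration := func() {
		// Re-list nodes on every pass so taint changes are picked up.
		nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			return
		}
		_ = nodes // run the enabled strategies against this fresh list
	}

	if interval == 0 {
		// Preserve the original Job/CronJob style: a single descheduling pass.
		oneIteration()
		return
	}
	wait.Until(oneIteration, interval, stopCh)
}
```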


aveshagarwal commented Mar 2, 2020

Also, running the descheduler every 5m might be good for experimental purposes, but does not seem very practical. In other words, it should not be required to balance a cluster every 5m in general, or at least in most cases.

@ingvagabund
Contributor

/assign

@nshekhar221
Author

> Also, running the descheduler every 5m might be good for experimental purposes, but does not seem very practical. In other words, it should not be required to balance a cluster every 5m in general, or at least in most cases.

@aveshagarwal I am experimenting with a combined setup of Node Problem Detector (NPD) and the descheduler, where NPD taints any faulty node in the cluster and the descheduler drains pods from that faulty node (via the RemovePodsViolatingNodeTaints policy).

Increasing the interval between consecutive descheduler runs would increase the time it takes to detect and remediate a faulty node.
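
For context, the descheduler side of such a setup only needs a small policy file; a minimal sketch in the v1alpha1 policy format (illustrative only) would look roughly like this, with the descheduler itself started with --descheduling-interval:

```yaml
# Illustrative minimal policy: enable only the node-taint strategy and pass the
# file via --policy-config-file; --descheduling-interval repeats the check.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsViolatingNodeTaints":
    enabled: true
```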


dharmab commented Jul 18, 2020

Just wanted to post an update that we've gotten the NPD+Descheduler Wombo Combo working in production and it seems to work pretty well. Would it be useful to add this use case to any documentation?

@seanmalloy
Member

> Just wanted to post an update that we've gotten the NPD+Descheduler Wombo Combo working in production and it seems to work pretty well. Would it be useful to add this use case to any documentation?

@dharmab yes, it would be useful to document this real-world use case. It would be great if you could submit a PR to update docs/user-guide.md with the details. Thanks!
