What did you do?
Installed the Helm chart without any custom values.
What did you expect to see?
Pods finishing with state "Succeeded".
What did you see instead?
Mixture of "Succeeded" and "Failed" pods. Failing pods panic with:
panic: close of closed channel
goroutine 90 [running]:
sigs.k8s.io/descheduler/pkg/descheduler.Run.func1()
/go/src/sigs.k8s.io/descheduler/pkg/descheduler/descheduler.go:76 +0x3c
created by sigs.k8s.io/descheduler/pkg/descheduler.Run
/go/src/sigs.k8s.io/descheduler/pkg/descheduler/descheduler.go:74 +0x165
This looks like a race condition: some Pods succeed, some fail. Full log below.
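For context, "panic: close of closed channel" typically means two goroutines raced to close the same channel. The following is a minimal sketch of that failure mode and the usual `sync.Once` guard — it is not the descheduler's actual code, just an illustration of the pattern:

```go
// Sketch: two goroutines both try to close the same stop channel.
// An unguarded close(ch) in both would panic with
// "close of closed channel"; sync.Once makes the close idempotent.
package main

import (
	"fmt"
	"sync"
)

type stopper struct {
	ch   chan struct{}
	once sync.Once
}

func newStopper() *stopper {
	return &stopper{ch: make(chan struct{})}
}

// stop closes the channel at most once, no matter how many
// goroutines call it concurrently.
func (s *stopper) stop() {
	s.once.Do(func() { close(s.ch) })
}

func main() {
	s := newStopper()
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.stop() // safe even when both goroutines race here
		}()
	}
	wg.Wait()
	fmt.Println("stopped cleanly")
}
```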
Full log (redacted private pod names)
I0225 10:38:01.989179 1 named_certificates.go:53] "Loaded SNI cert" index=0 certName="self-signed loopback" certDetail="\"apiserver-loopback-client@1645785481\" [serving] validServingFor=[apiserver-loopback-client] issuer=\"apiserver-loopback-client-ca@1645785481\" (2022-02-25 09:38:01 +0000 UTC to 2023-02-25 09:38:01 +0000 UTC (now=2022-02-25 10:38:01.989142099 +0000 UTC))"
I0225 10:38:01.989228 1 secure_serving.go:200] Serving securely on [::]:10258
I0225 10:38:01.989437 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0225 10:38:02.053976 1 reflector.go:219] Starting reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.053993 1 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.054077 1 reflector.go:219] Starting reflector *v1.PriorityClass (0s) from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.054082 1 reflector.go:255] Listing and watching *v1.PriorityClass from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.053978 1 reflector.go:219] Starting reflector *v1.Namespace (0s) from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.054130 1 reflector.go:255] Listing and watching *v1.Namespace from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.154082 1 node.go:46] "Node lister returned empty list, now fetch directly"
I0225 10:38:02.168948 1 descheduler.go:266] Building a pod evictor
I0225 10:38:02.169159 1 nodeutilization.go:167] "Node is overutilized" node="aks-default-77236376-vmss000000" usage=map[cpu:1290m memory:1214Mi pods:23] usagePercentage=map[cpu:67.89473684210526 memory:22.63010614415712 pods:20.90909090909091]
I0225 10:38:02.169183 1 nodeutilization.go:167] "Node is overutilized" node="aks-default-77236376-vmss000001" usage=map[cpu:1304m memory:1395Mi pods:24] usagePercentage=map[cpu:68.63157894736842 memory:26.004117027264567 pods:21.818181818181817]
I0225 10:38:02.169202 1 nodeutilization.go:167] "Node is overutilized" node="aks-default-77236376-vmss000002" usage=map[cpu:1560m memory:1276Mi pods:18] usagePercentage=map[cpu:82.10526315789474 memory:23.78584467870221 pods:16.363636363636363]
I0225 10:38:02.169219 1 lownodeutilization.go:101] "Criteria for a node under utilization" CPU=20 Mem=20 Pods=20
I0225 10:38:02.169228 1 lownodeutilization.go:102] "Number of underutilized nodes" totalNumber=0
I0225 10:38:02.169238 1 lownodeutilization.go:115] "Criteria for a node above target utilization" CPU=50 Mem=50 Pods=50
I0225 10:38:02.169245 1 lownodeutilization.go:116] "Number of overutilized nodes" totalNumber=3
I0225 10:38:02.169254 1 lownodeutilization.go:119] "No node is underutilized, nothing to do here, you might tune your thresholds further"
I0225 10:38:02.169266 1 duplicates.go:110] "Processing node" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169369 1 duplicates.go:110] "Processing node" node="aks-default-77236376-vmss000001"
I0225 10:38:02.169478 1 duplicates.go:110] "Processing node" node="aks-default-77236376-vmss000002"
I0225 10:38:02.169549 1 pod_antiaffinity.go:92] "Processing node" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169695 1 pod_antiaffinity.go:92] "Processing node" node="aks-default-77236376-vmss000001"
I0225 10:38:02.169827 1 pod_antiaffinity.go:92] "Processing node" node="aks-default-77236376-vmss000002"
I0225 10:38:02.169911 1 node_affinity.go:86] "Executing for nodeAffinityType" nodeAffinity="requiredDuringSchedulingIgnoredDuringExecution"
I0225 10:38:02.169922 1 node_affinity.go:91] "Processing node" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169946 1 node.go:169] "Pod fits on node" pod="uat/-5bc45656c-7nfpd" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169957 1 node.go:169] "Pod fits on node" pod="uat/-5dd64cc47f-2j4dj" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169970 1 node.go:169] "Pod fits on node" pod="uat/-76b76d5b84-cw7n2" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169980 1 node.go:169] "Pod fits on node" pod="uat/-5b4976477f-tjl5w" node="aks-default-77236376-vmss000000"
I0225 10:38:02.169993 1 node.go:169] "Pod fits on node" pod="uat/-7df8bc57df-6zr7r" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170000 1 node.go:169] "Pod fits on node" pod="uat/-644cd554bc-ft65m" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170008 1 node.go:169] "Pod fits on node" pod="uat/-5d5bdbbd66-b9zzl" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170041 1 node.go:169] "Pod fits on node" pod="uat/-78669c5f76-7z7db" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170054 1 node.go:169] "Pod fits on node" pod="uat/-755d6d849d-zslp7" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170059 1 node.go:169] "Pod fits on node" pod="cert-manager/cert-manager-cainjector-7d55bf8f78-kzpjf" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170067 1 node_affinity.go:91] "Processing node" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170075 1 node.go:169] "Pod fits on node" pod="uat/-5d5bdbbd66-2tk78" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170082 1 node.go:169] "Pod fits on node" pod="uat/-5dd64cc47f-vmvn4" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170093 1 node.go:169] "Pod fits on node" pod="uat/-644cd554bc-bwfd8" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170101 1 node.go:169] "Pod fits on node" pod="cert-manager/cert-manager-6d6bb4f487-8r7l4" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170118 1 node.go:169] "Pod fits on node" pod="uat/-5bc45656c-dqd5l" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170123 1 node.go:169] "Pod fits on node" pod="uat/-7df8bc57df-vx6g9" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170134 1 node.go:169] "Pod fits on node" pod="uat/-76b76d5b84-wtwr7" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170140 1 node.go:169] "Pod fits on node" pod="cert-manager/cert-manager-webhook-577f77586f-9nj5b" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170145 1 node.go:169] "Pod fits on node" pod="prometheus/prometheus-kube-prometheus-operator-788dfbdccc-hd7mv" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170161 1 node.go:169] "Pod fits on node" pod="uat/-78669c5f76-jfv85" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170175 1 node.go:169] "Pod fits on node" pod="prometheus/alertmanager-prometheus-kube-prometheus-alertmanager-0" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170195 1 node.go:169] "Pod fits on node" pod="uat/-755d6d849d-fkmdj" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170200 1 node.go:169] "Pod fits on node" pod="prometheus/prometheus-kube-state-metrics-57c988498f-45b9s" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170205 1 node.go:169] "Pod fits on node" pod="uat/-5b4976477f-tss9m" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170209 1 node_affinity.go:91] "Processing node" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170216 1 node.go:169] "Pod fits on node" pod="uat/-5d5bdbbd66-6jddv" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170221 1 node.go:169] "Pod fits on node" pod="uat/-5b4976477f-s5ths" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170226 1 node.go:169] "Pod fits on node" pod="uat/-644cd554bc-8d4lp" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170233 1 node.go:169] "Pod fits on node" pod="uat/-7df8bc57df-hjnw5" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170268 1 node.go:169] "Pod fits on node" pod="uat/-5bc45656c-xxrsf" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170273 1 node.go:169] "Pod fits on node" pod="uat/-76b76d5b84-6lprk" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170279 1 node.go:169] "Pod fits on node" pod="uat/-78669c5f76-mgd75" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170284 1 node.go:169] "Pod fits on node" pod="uat/-755d6d849d-4t97x" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170290 1 node.go:169] "Pod fits on node" pod="uat/-5dd64cc47f-n7nz5" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170296 1 node_taint.go:93] "Processing node" node="aks-default-77236376-vmss000000"
I0225 10:38:02.170351 1 node_taint.go:93] "Processing node" node="aks-default-77236376-vmss000001"
I0225 10:38:02.170394 1 node_taint.go:93] "Processing node" node="aks-default-77236376-vmss000002"
I0225 10:38:02.170456 1 descheduler.go:291] "Number of evicted pods" totalEvicted=0
I0225 10:38:02.170538 1 reflector.go:225] Stopping reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.170590 1 reflector.go:225] Stopping reflector *v1.Namespace (0s) from k8s.io/client-go/informers/factory.go:134
I0225 10:38:02.170618 1 reflector.go:225] Stopping reflector *v1.PriorityClass (0s) from k8s.io/client-go/informers/factory.go:134
panic: close of closed channel
goroutine 90 [running]:
sigs.k8s.io/descheduler/pkg/descheduler.Run.func1()
/go/src/sigs.k8s.io/descheduler/pkg/descheduler/descheduler.go:76 +0x3c
created by sigs.k8s.io/descheduler/pkg/descheduler.Run
/go/src/sigs.k8s.io/descheduler/pkg/descheduler/descheduler.go:74 +0x165
This seems like a duplicate of #728 which was fixed by #731. We will publish a patch release with this fix soon, sorry for the inconvenience.
In the meantime, the workaround is to run the descheduler as a Deployment instead of a CronJob (if possible for your environment).
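If the Helm chart you installed exposes a `kind` value (verify against your chart version — the key name here is an assumption), the workaround could look like this in your values override:

```yaml
# values.yaml — hypothetical override; confirm the key name in the
# chart's default values before applying.
kind: Deployment   # instead of the default CronJob
```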
Closing this issue as a duplicate, please feel free to reopen if you have more to discuss regarding it. Thanks!
/close
What version of descheduler are you using?
descheduler version: 0.23.1
Does this issue reproduce with the latest release?
Yes
Which descheduler CLI options are you using?
Defaults from Helm chart
Please provide a copy of your descheduler policy config file
Defaults from Helm chart:
What k8s version are you using (kubectl version)?