RemovePodsViolatingNodeAffinity should not rely on node capacity when checking if a pod fits on its current node #873
Comments
Hi @seleznev, thanks for raising this issue.
@JaneLiuL I think @seleznev is actually reporting the opposite. It seems that RemovePodsViolatingNodeAffinity is evicting pods that do not violate the NodeAffinity, solely because they don't fit on their current node (for whatever reason). The bug here is that pods should not be evicted just because their requests don't fit on their current node. Pretty sure it's doing this because of the unconditional fitsRequest check in NodeFit (quoted in full in a later comment).
So what needs to happen is that the NodeFit implementation needs to somehow decouple the node selector check from the resource fit check.
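As an illustration of that split, here is a minimal sketch of a standalone selector/affinity check built on the upstream component-helpers library; the package and function names are illustrative, not the descheduler's actual API:

```go
package nodecheck

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/component-helpers/scheduling/corev1/nodeaffinity"
)

// matchesNodeSelector checks only the pod's scheduling constraints
// (nodeSelector and required node affinity) against a node. It
// deliberately ignores resource requests, so a strategy like
// RemovePodsViolatingNodeAffinity could detect affinity violations
// without also re-checking capacity on the pod's current node.
func matchesNodeSelector(pod *v1.Pod, node *v1.Node) (bool, error) {
	return nodeaffinity.GetRequiredNodeAffinity(pod).Match(node)
}
```

With a split like this, NodeFit could compose the selector check with a separate resource check, and each caller could opt into only the part it needs.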
Thanks @damemi, this looks clear to me now. I would like to take it and add some tests for it.
And there is the problem and the cause of many issues. The same sort of baked-in NodeFit() call exists in other strategies as well. If those NodeFit() calls weren't hard-coded in these functions, meaning if that function were only called when it is configured to be enabled, then issue #845 would be fixed and issue #863 would at least have a workaround by disabling NodeFit. Issue #640 probably requires a bit more, but it is very much related; removing the PodFitsAnyNode check and relying on the configurable boolean that controls the NodeFit() call in the eviction step is part of the offered PR.
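A minimal sketch of that gating, assuming a strategy-level nodeFit boolean; isEvictable and podFitsAnyOtherNode are hypothetical names, not the descheduler's actual functions:

```go
package evictions

import v1 "k8s.io/api/core/v1"

// podFitsAnyOtherNode stands in for the real fit check; it would
// report whether the pod fits on at least one node other than its own.
func podFitsAnyOtherNode(pod *v1.Pod, nodes []*v1.Node) bool {
	// ...selector and resource checks elided in this sketch...
	return true
}

// isEvictable shows the NodeFit() call gated on configuration rather
// than hard-coded: the fit check runs only when the strategy enables it.
func isEvictable(pod *v1.Pod, nodes []*v1.Node, nodeFitEnabled bool) bool {
	if nodeFitEnabled && !podFitsAnyOtherNode(pod, nodes) {
		return false
	}
	// ...other eviction checks would follow here...
	return true
}
```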
This is definitely a bug in https://github.com/kubernetes-sigs/descheduler/blob/master/pkg/descheduler/node/node.go#L122. Invoking fitsRequest against the pod's own node makes no sense; the check should be skipped when the node under consideration is the one the pod is already assigned to:

```go
if pod.Spec.NodeName != node.Name {
	// Check if the pod can fit on a node based on its requests
	ok, reqErrors := fitsRequest(nodeIndexer, pod, node)
	if !ok {
		errors = append(errors, reqErrors...)
	}
}
```

There's no reason to check whether a pod fits the resources of its assigned node.
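To see why the unconditional check misfires, note that a node's current usage already includes the requests of every pod running on it, so checking a pod against its own node effectively counts its requests twice. A self-contained sketch of that arithmetic (CPU only; fitsCPU is an illustrative helper, not the descheduler's fitsRequest):

```go
package nodecheck

import "k8s.io/apimachinery/pkg/api/resource"

// fitsCPU reports whether a pod's CPU request fits into a node's free
// capacity. When the node is the pod's own node, usedCPU already
// includes podCPU, so the request is effectively counted twice.
func fitsCPU(podCPU, usedCPU, allocatableCPU resource.Quantity) bool {
	free := allocatableCPU.DeepCopy()
	free.Sub(usedCPU)
	return podCPU.Cmp(free) <= 0
}
```

For example, with 4 allocatable cores and a pod requesting 2.5 of them: on the pod's own node usedCPU is already 2.5, free comes out to 1.5, and 2.5 > 1.5, so the pod "does not fit" on the node it is running on and is evicted on every iteration, which matches the behavior reported in the issue body below.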
Given this issue is not about decoupling NodeFit, I am dropping the generic-approach comment into #640 (comment) instead, to follow the discussion there.
What version of descheduler are you using?
descheduler version: 0.24.1
Does this issue reproduce with the latest release?
Yes.
Which descheduler CLI options are you using?
Please provide a copy of your descheduler policy config file
What k8s version are you using (kubectl version)?
What did you do?
Steps to reproduce: create a deployment with replicas: 1 and requests >50% of the node's capacity (so you can't schedule 2 pods on the same node).
What did you expect to see?
Descheduler does nothing.
What did you see instead?
Descheduler evicts the pod every iteration (because it fits on another node, but not on its current one).
First iteration (I removed irrelevant messages about other pods and nodes):
Second: