cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation prevents GKE nodepool from scaling down #13984
Comments
@paoloyx I closed the operator issue as a dupe - and pulled in the body. We can sort things out here, and if this really is a conflicting config with the operator PDB then we can transfer the issue over to that repo
This was added long ago (#4329) - so this might be a case where they aren't necessary when there is a Pod Disruption Budget. Can you elaborate on what you expect to see when a node is being drained?
@dprotaso thanks for having managed the duplicate, I was not sure where I was expected to open the issue. For sure I'll try to elaborate, but please bear in mind that I'm not much of an expert on Knative jobs/services - that's to say what are the duties of

A premise about the used configuration

We're using

It probably seems a cumbersome configuration.

Scale down is still prevented

As of now I assume that the PDB configuration is correct and allows me to drain nodes, so the nodepool can be scaled down by the GKE autoscaler. The autoscaler

Conclusion

I do not know if the

Hope that I gave all the necessary details, let me know @dprotaso, thank you very much!
Just wanted to add some other details about another cluster where we're experiencing the same "issue"...there are other pods outside of the ones I reported before, namely:
So, in general, it seems (or at least that is my suspicion) that it is all linked to that annotation.

Thanks, again, for looking at this one
@dprotaso Hi, just to know if the provided input is enough or if you still need more info on my side. I'll be more than glad to provide it if there is something more specific that you need, thank you so much.
The original intent is that the activator, autoscaler, webhook etc. should always be running, otherwise you'll have a disruption in traffic and in creating Knative Services. It seems like these annotations were added prior to PDBs being stable in Kubernetes. The use of
Based on what I've been reading [1,2] it seems like a PDB is the better way to control pod counts when there is a disruption (scaling down nodes), by safely reshuffling pods onto ready nodes. I think the Knative operator should be removing

I'm inclined to remove

[1] https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
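For illustration, a PodDisruptionBudget of the kind being discussed could look like the following. This is a minimal sketch: the name, selector label, and minAvailable value are placeholders, not Knative's shipped defaults.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: activator-pdb            # hypothetical name
  namespace: knative-serving
spec:
  # Keep at least one activator pod running during voluntary disruptions
  # such as a node drain triggered by the cluster autoscaler.
  minAvailable: 1
  selector:
    matchLabels:
      app: activator             # label assumed to match the activator pods
```

With a budget like this in place, the autoscaler can evict activator pods one at a time while draining a node, rather than being blocked outright by the safe-to-evict annotation.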
Looks like Tekton removed the annotation as well - tektoncd/pipeline#4124 - to unblock node scale down. But they removed their PodDisruptionBudgets - tektoncd/pipeline#3787 - because it was preventing draining in non-HA configurations.
Yes, I can understand why those annotations were introduced then
Exactly, the whole nodepool upgrade process eventually stalls and a manual operation is needed to roll out the stuck Knative component
Let's wait for some feedback, as you suggest. It's perfectly reasonable. So for example we could do
Could it work, in your opinion, with
##EDIT## The "solution" explained below actually does not prevent cluster to tear down knative-workload anymore, so it is not probably a good one as could cause service disruption. I think that we need a proper PDB support as outlined in previous posts So, after a quick test about the usage of annotations in
The outcome is that
[1] Override replicas, labels and annotations
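For context, the override mentioned in [1] is driven from the KnativeServing custom resource. A sketch of the kind of override that appears to have been tested might look like the following; the annotation value and the assumption that the operator propagates it to the pod template are mine, not confirmed in this thread.

```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  deployments:
    - name: activator
      annotations:
        # Assumption: overriding the shipped "false" value with "true"
        # lets the cluster autoscaler evict the activator pods.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
```

As the edit above notes, flipping the annotation this way trades a blocked scale-down for possible service disruption, which is why proper PDB support is the preferred direction.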
Note that cluster-autoscaler has supported checking for PDBs when it evicts pods from a node (in the process of draining a node to scale it down) since Kubernetes 1.6.x (ref). Here's how cluster-autoscaler behaves:
Code refs:

File: simulator/drain.go
095: pods, daemonSetPods, blockingPod, err = drain.GetPodsForDeletionOnNodeDrain(
096: pods,
097: pdbs,
098: deleteOptions.SkipNodesWithSystemPods,
099: deleteOptions.SkipNodesWithLocalStorage,
100: deleteOptions.SkipNodesWithCustomControllerPods,
101: listers,
102: int32(deleteOptions.MinReplicaCount),
103: timestamp)
104: ...
109: if pdbBlockingPod, err := checkPdbs(pods, pdbs); err != nil {
110: return []*apiv1.Pod{}, []*apiv1.Pod{}, pdbBlockingPod, err
111: }

If you check

File: utils/drain/drain.go
110: safeToEvict := hasSafeToEvictAnnotation(pod)
...
129: if !safeToEvict && !terminal {
...
145: if hasNotSafeToEvictAnnotation(pod) {
146: return []*apiv1.Pod{}, []*apiv1.Pod{}, &BlockingPod{Pod: pod, Reason: NotSafeToEvictAnnotation}, fmt.Errorf("pod annotated as not safe to evict present: %s", pod.Name)
147: }
148: }

Here's the definition for hasSafeToEvictAnnotation and hasNotSafeToEvictAnnotation:

File: utils/drain/drain.go
315: // This checks if pod has PodSafeToEvictKey annotation
316: func hasSafeToEvictAnnotation(pod *apiv1.Pod) bool {
317: return pod.GetAnnotations()[PodSafeToEvictKey] == "true"
318: }
319:
320: // This checks if pod has PodSafeToEvictKey annotation set to false
321: func hasNotSafeToEvictAnnotation(pod *apiv1.Pod) bool {
322: return pod.GetAnnotations()[PodSafeToEvictKey] == "false"
323: }

Links are based on d1740df93b76d99bf9302ad0c62978deb0ec1d5b, which is the latest commit on
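To make the checks above concrete, the annotation the autoscaler inspects lives in pod metadata. A hypothetical pod fragment (names and image are placeholders) carrying the blocking value would look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: activator-example
  namespace: knative-serving
  annotations:
    # hasNotSafeToEvictAnnotation() returns true for this value, so the
    # autoscaler reports the pod as blocking the node drain.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: activator
      image: example.invalid/activator:latest
```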
/assign @dprotaso

I'm going to drop the annotation
@dprotaso thank you so much. Just one last question, and that is because I'm not familiar with Knative releasing and backporting to older versions...in order to get this update, should we upgrade our

Or is it expected that there will be an (eventual) manual action to backport the "fix" to
I'm inclined to backport these changes - but I might wait to include some other fixes. We only support the last two releases - so we would only backport to 1.9 & 1.10
I'll leave this issue open to track the backport
Point releases are out - I'm going to close this out: https://github.com/knative/serving/releases/tag/knative-v1.10.2

Thanks for surfacing this problem @paoloyx
Thanks to you @dprotaso for your work
Describe the bug
It's not a bug but a support request.
The GKE autoscaler tells me that the
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
annotation is preventing the nodepool from being scaled down. The annotation is found on these pods:

Expected behavior
I'm running pods in multi-replica with a PDB configured with minAvailability - made possible by knative/operator#1125 - and I would expect the node to be drained. For sure I'm doing it for the activator pod.

Is the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation a mandatory one? Is there a way to configure it differently so the nodepool can properly scale down?

To Reproduce
N/A
Knative release version
Knative operator running in cluster at version 1.7.1. Also Knative Serving is at version 1.7.1.
Additional context
The GKE cluster is a standard one, running K8s version 1.23.17-gke.1700.