
Karpenter should show Disrupting or Terminating through kubectl get nodes when it has tainted Nodes #1152

Open
jonathan-innis opened this issue Apr 2, 2024 · 9 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@jonathan-innis
Member

jonathan-innis commented Apr 2, 2024

Description

What problem are you trying to solve?

Currently, Kubernetes uses the node.kubernetes.io/unschedulable taint and the spec.unschedulable field on the node to mark that a node is cordoned and may be about to be drained for maintenance or removal. This is visible through the printer columns you get when you call kubectl get nodes, like the following:

NAME                                                    STATUS                     ROLES    AGE    VERSION
ip-192-168-10-60.us-west-2.compute.internal             Ready                      <none>   2d4h   v1.28.5-eks-5e0fdde
ip-192-168-125-1.us-west-2.compute.internal             Ready                      <none>   2d4h   v1.28.5-eks-5e0fdde
ip-192-168-9-118.us-west-2.compute.internal             Ready,SchedulingDisabled   <none>   2d4h   v1.28.5-eks-5e0fdde

The code for this handling can be seen in the printer columns logic for kubectl here.
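For reference, that STATUS handling boils down to roughly the following (a paraphrased Go sketch of the kubectl node printer behavior, not the verbatim upstream code). The key point is that the SchedulingDisabled suffix is driven entirely by spec.unschedulable, so a taint on its own never changes the column.

// Paraphrased sketch of the STATUS column logic in kubectl's node printer;
// not the verbatim upstream code.
package main

import (
	"fmt"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

func nodeStatus(node *corev1.Node) string {
	status := []string{}
	for _, c := range node.Status.Conditions {
		if c.Type == corev1.NodeReady {
			if c.Status == corev1.ConditionTrue {
				status = append(status, "Ready")
			} else {
				status = append(status, "NotReady")
			}
		}
	}
	if len(status) == 0 {
		status = append(status, "Unknown")
	}
	// This is the part Karpenter's taint never triggers: the suffix is keyed
	// off spec.unschedulable, not off any taint on the node.
	if node.Spec.Unschedulable {
		status = append(status, "SchedulingDisabled")
	}
	return strings.Join(status, ",")
}

func main() {
	n := &corev1.Node{}
	n.Spec.Unschedulable = true
	n.Status.Conditions = []corev1.NodeCondition{{Type: corev1.NodeReady, Status: corev1.ConditionTrue}}
	fmt.Println(nodeStatus(n)) // prints: Ready,SchedulingDisabled
}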

This is nice visibility for users when Kubernetes is using this specific field; however, nothing is surfaced when Karpenter adds its own taint and is actively draining the node, since Karpenter doesn't update the spec.unschedulable field that the printer relies on to add the SchedulingDisabled suffix to the node's status.

It would be a really nice UX if we could add something similar to SchedulingDisabled (perhaps something like Disrupting or Terminating) to the node so that users get visibility through the printer that Karpenter is acting on the node.
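Purely as an illustration of the idea (hypothetical: the taint key and the Disrupting label below are assumptions, not an agreed design), the same printer logic could also key off a well-known disruption taint:

// Hypothetical extension of the sketch above: surface a "Disrupting" suffix
// when a well-known autoscaler disruption taint is present. The taint key is
// an assumption for illustration, not a settled upstream convention.
const disruptionTaintKey = "karpenter.sh/disrupted"

func nodeStatusWithDisruption(node *corev1.Node) string {
	status := nodeStatus(node) // nodeStatus from the sketch above
	for _, t := range node.Spec.Taints {
		if t.Key == disruptionTaintKey {
			return status + ",Disrupting"
		}
	}
	return status
}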

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@jonathan-innis jonathan-innis added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 2, 2024
@jonathan-innis
Member Author

jonathan-innis commented Apr 2, 2024

Also, as part of the alignment effort with Cluster Autoscaler, I imagine whatever change we suggest making upstream would also apply to Cluster Autoscaler. Perhaps aligning on the taint that we both want to use, and proposing a way for that taint to get special handling in the node printer columns, is a change we could try to get into upstream?

cc: @MaciekPytel @towca

@jonathan-innis
Member Author

Also, there was a discussion in the K8s Slack over whether Karpenter should be using the unschedulable field on the node to piggy-back on this logic. Practically, we have chosen to separate ourselves from the basic "cordon" handling for the following reasons:

  1. We want to know that Karpenter was the one that initiated the tainting so that we can recover when we taint a node and the process gets killed (either due to crashing or upgrading)
  2. We want to make sure that we can drain DaemonSets while draining the node if the user wants it (right now Kubernetes adds a toleration for the node.kubernetes.io/unschedulable taint to DaemonSets by default; see the sketch after this list)
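To make reason 2 concrete, here is a small sketch using the same corev1 types as the earlier snippet (the Karpenter taint key shown is an assumption and differs between releases; check your version):

// DaemonSet pods automatically receive a toleration equivalent to this one,
// so the standard cordon taint alone can never drain DaemonSet pods off a node:
var unschedulableToleration = corev1.Toleration{
	Key:      "node.kubernetes.io/unschedulable",
	Operator: corev1.TolerationOpExists,
	Effect:   corev1.TaintEffectNoSchedule,
}

// A separate, controller-owned taint is not tolerated by default, so it both
// "signs" the disruption as Karpenter's and keeps the option of draining
// DaemonSets. The key below is an assumed example, not a fixed contract:
var karpenterDisruptionTaint = corev1.Taint{
	Key:    "karpenter.sh/disrupted",
	Effect: corev1.TaintEffectNoSchedule,
}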

@jonathan-innis jonathan-innis removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 2, 2024
@jonathan-innis jonathan-innis changed the title Karpenter Should Show through kubectl get nodes when it has tainted Nodes Karpenter should show Disrupting or Terminating through kubectl get nodes when it has tainted Nodes Apr 2, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 1, 2024
@simon-wessel

I would like to note that other controllers rely on the node.kubernetes.io/unschedulable (SchedulingDisabled) taint to see whether a node is shutting down.

The CloudNativePG operator, for example, has a PDB that disallows deleting the pod. If a node with a CNPG pod receives the unschedulable taint, the operator will start to migrate that pod itself. Since Karpenter does not use that taint, the node is stuck.

@simon-wessel

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 29, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 27, 2024
@simon-wessel

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 27, 2024
@mark-webster-catalyst

mark-webster-catalyst commented Dec 10, 2024

I understand and support the reasons for using the Karpenter-specific taint, but it does break other things that look for the standard unschedulable taint. Would it be feasible to add both, so Karpenter would know it initiated the cordon, but other operators would also be aware that the node has been cordoned and can do what they need to do (fail over, in the case of cloudnative-pg)?
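For illustration, here is a minimal client-go sketch of what "add both" could mean in practice. This is hypothetical: the helper name and the Karpenter taint key are assumptions, and it is not how Karpenter behaves today.

// Hypothetical "add both": mark the node unschedulable (which the printer and
// other operators already understand) in addition to a Karpenter-style taint.
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func cordonAlongsideDisruption(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	node.Spec.Unschedulable = true // standard cordon; surfaces SchedulingDisabled
	node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
		Key:    "karpenter.sh/disrupted", // assumed key for illustration
		Effect: corev1.TaintEffectNoSchedule,
	})
	_, err = client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}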

@jonathan-innis
Member Author

"the operator will start to migrate that pod itself. Since Karpenter does not use the taint, the node is stuck"

I think what we have seen in general is that these other operators can watch for the taint that the autoscaler uses. The EBS CSI driver did something similar, since it was also hooking into the knowledge that Karpenter was deleting the NodeClaim.

In general, I'm a little wary of things hooking into taints that can be added with a kubectl cordon from a user anyway, since the taint might be added as a manual action and then you are going to get automation flywheeling off of it.
