What problem are you trying to solve?
The EKS Auto Repair feature exists, but it is only supported for EKS managed node groups, not for Karpenter. When a node becomes unhealthy, its status changes to NotReady. Please make it possible to automatically terminate such nodes after a specified time.
How important is this feature to you?
Sometimes, when a node's disk encounters issues or the kubelet malfunctions and the problem does not resolve on its own, pod operations are significantly disrupted. This situation is very critical.
In EKS node groups, this is handled by the Auto Scaling Group mechanism, which terminates the problematic node and replaces it with a new one. Karpenter nodepools currently lack this functionality, which makes it a critical gap.
To elaborate on this:
EKS Auto Repair: This feature is designed to maintain the health of the cluster by automatically addressing node issues.
Node Group vs. Karpenter: While EKS node groups have this auto-repair capability through Auto Scaling Groups (ASGs), Karpenter, which is an alternative node provisioning solution, currently lacks this feature.
Critical Impact: The absence of this feature in Karpenter can lead to prolonged downtime or degraded performance if a node becomes unhealthy, as there's no automatic mechanism to replace the problematic node.
Desired Solution: Implementing a similar auto-repair or node replacement mechanism for Karpenter nodepools would be valuable. This could involve detecting nodes in a NotReady state and scheduling them for termination and replacement after a configurable delay, similar to how EKS node groups behave; see the sketch after this list.
Importance: This feature is crucial for maintaining the reliability and performance of Kubernetes clusters, especially in production environments where downtime can have significant impacts.
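As a rough illustration of the desired behavior (not Karpenter's actual Node Repair implementation), the sketch below polls for Karpenter-managed nodes whose Ready condition has been non-true for longer than a threshold and deletes the Node object, which lets Karpenter's finalizer drain the node and terminate the backing instance. The 15-minute threshold, one-minute poll interval, and label selector are illustrative assumptions.

```go
// unhealthy-node-reaper: a minimal sketch of the requested behavior.
// It lists Karpenter-provisioned nodes, finds those whose Ready condition
// has been False/Unknown for longer than a threshold, and deletes them so
// that Karpenter replaces the backing instance.
package main

import (
	"context"
	"log"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

const notReadyThreshold = 15 * time.Minute // assumption: tolerate brief blips

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	for {
		nodes, err := client.CoreV1().Nodes().List(context.Background(), metav1.ListOptions{
			// assumption: only act on nodes provisioned by a Karpenter nodepool
			LabelSelector: "karpenter.sh/nodepool",
		})
		if err != nil {
			log.Printf("listing nodes: %v", err)
			time.Sleep(time.Minute)
			continue
		}
		for _, node := range nodes.Items {
			for _, cond := range node.Status.Conditions {
				if cond.Type != corev1.NodeReady || cond.Status == corev1.ConditionTrue {
					continue
				}
				if time.Since(cond.LastTransitionTime.Time) > notReadyThreshold {
					log.Printf("deleting NotReady node %s", node.Name)
					// Deleting the Node object triggers Karpenter's finalizer,
					// which drains the node and terminates the instance.
					if err := client.CoreV1().Nodes().Delete(context.Background(), node.Name, metav1.DeleteOptions{}); err != nil {
						log.Printf("deleting %s: %v", node.Name, err)
					}
				}
			}
		}
		time.Sleep(time.Minute)
	}
}
```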
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment
I think what you are looking for is Node Repair in Karpenter, which was implemented in kubernetes-sigs/karpenter#1793. It should be included in our next release.