Native Support for Spot Termination #702
Comments
One issue that we have to be aware of is support for rebalance recommendations. CA customers can utilize Capacity Rebalance on their ASG with MNG or with NTH, but since Karpenter doesn't use ASGs, we have to implement proactive replacement ourselves: we should only terminate an instance that receives a rebalance recommendation if there is another instance type/zone combination with Spot capacity that we are able to launch into.
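A minimal sketch of that rule, assuming a pool is an (instance type, zone) pair. The names `Node`, `replacement_pool`, and the `try_launch` callback are hypothetical and are not Karpenter APIs; `try_launch` stands in for whatever actually attempts the Spot launch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical representation of a Spot pool: (instance_type, zone).
Pool = Tuple[str, str]


@dataclass(frozen=True)
class Node:
    name: str
    pool: Pool


def replacement_pool(
    node: Node,
    candidate_pools: Iterable[Pool],
    try_launch: Callable[[Pool], bool],
) -> Optional[Pool]:
    """Return the pool a replacement was launched in, or None.

    A None result means the node that received the rebalance
    recommendation should be left running, because no capacity
    could be launched in a different pool.
    """
    for pool in candidate_pools:
        if pool == node.pool:
            # Never replace into the pool that was just flagged.
            continue
        if try_launch(pool):
            return pool
    return None
```

The caller would only drain and terminate the node when the result is not `None`, so a rebalance recommendation never reduces capacity on its own.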
I think the INT signal is the easiest to handle: since we're going to lose the node very soon, we should just gracefully drain it as soon as possible. For Rebalance, does it make sense to think of this in the same way as defrag? I think the workflow is pretty similar:
It's largely the same as defrag, with a caveat around not going back into the same pool. You could imagine looking at all the RBNed nodes, trying to binpack their pods, and seeing whether you could spin up nodes from pools other than the ones any of those nodes came from.
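A rough sketch of that idea, under simplifying assumptions: pods are described only by a CPU request, each pool offers a single node size, and packing is first-fit decreasing. `FlaggedNode`, `PlannedNode`, and `plan_replacements` are hypothetical names for illustration, not how Karpenter's scheduler works.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Pool = Tuple[str, str]  # hypothetical: (instance_type, zone)


@dataclass
class FlaggedNode:
    pool: Pool
    pod_cpus: List[float]  # CPU requests of the pods on this node


@dataclass
class PlannedNode:
    pool: Pool
    free_cpu: float
    pod_cpus: List[float] = field(default_factory=list)


def plan_replacements(
    flagged: List[FlaggedNode], pool_cpu: Dict[Pool, float]
) -> List[PlannedNode]:
    """First-fit-decreasing packing of all flagged pods into allowed pools."""
    # Exclude every pool that any flagged node came from.
    excluded = {n.pool for n in flagged}
    # Smallest node size first, so new nodes are as small as possible.
    allowed = sorted((cpu, p) for p, cpu in pool_cpu.items() if p not in excluded)
    pods = sorted((c for n in flagged for c in n.pod_cpus), reverse=True)

    planned: List[PlannedNode] = []
    for cpu in pods:
        target = next((p for p in planned if p.free_cpu >= cpu), None)
        if target is None:
            fit = next((pool for cap, pool in allowed if cap >= cpu), None)
            if fit is None:
                raise ValueError(f"no allowed pool fits a pod requesting {cpu} CPU")
            target = PlannedNode(pool=fit, free_cpu=pool_cpu[fit])
            planned.append(target)
        target.pod_cpus.append(cpu)
        target.free_cpu -= cpu
    return planned
```

If `plan_replacements` raises, no node outside the flagged pools can hold the pods, and the flagged nodes would stay in place.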
If we're using something like CapacityOptimizedAllocationStrategy, we might be able to simply rely on Fleet to not give us an instance in the same pool. In that case, we can keep Karpenter unaware of the details and just let EC2 do the decision making.
I think it will be easier for us to talk about this in a call, but that's not how the CO strategy works. You can maybe use CO prioritized and set that pool at a low priority, but otherwise there's no reason for it to avoid that pool. If all the pools you provide to the CO strategy are constrained, it will give you the least constrained pool, which may be the pool the instance that received the rebalance recommendation was in.
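To illustrate the "CO prioritized" option, here is a sketch that only builds a request dictionary in the shape of the EC2 `CreateFleet` API and sends nothing. The helper name `build_fleet_request`, the launch template name, and the pool list are hypothetical. As I understand it, a lower `Priority` number means higher priority, and with `capacity-optimized-prioritized` the priorities are honored only on a best-effort basis, so this deprioritizes the flagged pool rather than guaranteeing it is avoided.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pool = Tuple[str, str]  # hypothetical: (instance_type, zone)


def build_fleet_request(
    pools: Iterable[Pool],
    rebalanced: Set[Pool],
    launch_template_name: str,
) -> Dict:
    """Build CreateFleet-style parameters that deprioritize flagged pools."""
    overrides: List[Dict] = []
    for instance_type, zone in pools:
        overrides.append(
            {
                "InstanceType": instance_type,
                "AvailabilityZone": zone,
                # Lower number = higher priority; flagged pools go last.
                "Priority": 1.0 if (instance_type, zone) in rebalanced else 0.0,
            }
        )
    return {
        "Type": "instant",
        "SpotOptions": {"AllocationStrategy": "capacity-optimized-prioritized"},
        "LaunchTemplateConfigs": [
            {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": launch_template_name,
                    "Version": "$Latest",
                },
                "Overrides": overrides,
            }
        ],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": 1,
            "DefaultTargetCapacityType": "spot",
        },
    }
```

Leaving the flagged pool out of `Overrides` entirely would be the stricter alternative, at the cost of one fewer pool to draw capacity from.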
Are there any plans to support this?
Yes, we are currently working on this functionality.
Looking forward to seeing this feature implemented. It will save tons of IPs for our cluster.
I am assuming the node termination handler logic has been removed from Karpenter? That is, graceful node shutdown when a user terminates a node in the AWS console or when a Spot termination occurs.
Is "spot termination" for this issue restricted to spot instances? Does this issue's scope include termination of non-spot instances (for underlying hardware maintenance by AWS, for example)?
Yes, it does include support for AWS Health events as well, similar to the aws-node-termination-handler.
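As a sketch of how the event types discussed in this thread might be routed, the mapping below keys on the EventBridge `detail-type` field. The action names and the choice of action per event are assumptions for illustration, not what Karpenter or aws-node-termination-handler actually does.

```python
from typing import Dict

# Hypothetical action names.
DRAIN_NOW = "drain-now"
REPLACE_THEN_DRAIN = "replace-then-drain"
CORDON_AND_DRAIN = "cordon-and-drain"
IGNORE = "ignore"

# EventBridge detail-type strings mapped to an assumed action.
ACTIONS: Dict[str, str] = {
    # The node is about to be reclaimed: drain immediately.
    "EC2 Spot Instance Interruption Warning": DRAIN_NOW,
    # Elevated risk only: launch a replacement first, then drain.
    "EC2 Instance Rebalance Recommendation": REPLACE_THEN_DRAIN,
    # Scheduled maintenance and similar, also for non-Spot instances.
    "AWS Health Event": CORDON_AND_DRAIN,
}


def action_for(event: Dict) -> str:
    """Pick an action for an EventBridge-style event dictionary."""
    return ACTIONS.get(event.get("detail-type", ""), IGNORE)
```

For example, `action_for({"detail-type": "EC2 Instance Rebalance Recommendation"})` returns `"replace-then-drain"`, and any unrecognized event falls through to `"ignore"`.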
Are you currently working around this issue?
How are you currently solving this problem?
Using aws-node-termination-handler up to this point (#105)