Appropriately incorporating the AWS InstanceScheduler #945
I understand your motivation for this since your traffic is low (at least for now; hopefully that changes soon :) ). We have #445 to track our progress on supporting this natively. Until then, there are a few possibilities I can think of.

I have not tried using the AWS InstanceScheduler before. I think the networking would actually be OK, since requests enter through the ELB (which won't go down with the instance), and Kubernetes automatically maintains routing configuration as nodes come and go. However, my guess is that if you terminate the instance, a replacement will be created immediately: there will be an unscheduled API replica, and the cluster autoscaler will notice that and request a new instance to schedule it on.

I think the best option would be to run

Another option is to programmatically modify the AWS autoscaling groups (for the workloads) to set max instances to 0, and then back to 1 (or more) as desired. The reason this isn't as good as the first option is that it's better for cortex to manage the autoscaling groups; otherwise some features, like on-demand backup for spot instances, may not work.

Let us know if you'd like any help setting this up!
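The "modify the autoscaling groups" option above could be sketched roughly as follows with boto3. Note this is a hypothetical sketch, not a cortex-supported workflow: the group name, region, and schedule are all assumptions, and cortex names its own ASGs, so you would need to look yours up in the AWS console first.

```python
def asg_update_params(group_name, size):
    """Build the kwargs for update_auto_scaling_group, pinning min, max,
    and desired capacity to the same value (0 to stop, 1+ to resume)."""
    return {
        "AutoScalingGroupName": group_name,
        "MinSize": size,
        "MaxSize": size,
        "DesiredCapacity": size,
    }

def set_asg_size(group_name, size, region="us-west-2"):
    """Apply the new size to a real autoscaling group.

    boto3 is imported lazily so the pure helper above can be exercised
    without AWS credentials. Group name and region are placeholders.
    """
    import boto3

    client = boto3.client("autoscaling", region_name=region)
    client.update_auto_scaling_group(**asg_update_params(group_name, size))
```

You could invoke `set_asg_size("your-workload-asg", 0)` from a scheduled Lambda (or a cron job) in the evening and `set_asg_size("your-workload-asg", 1)` in the morning; but as noted above, changing these groups behind cortex's back may interfere with features that expect cortex to manage them.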
I'm glad I asked, and thank you for getting back to me so soon! I would've ended up wasting a lot of time otherwise.
This makes a lot of sense. I'd be looking forward to when this can be done natively, as mentioned in #445. I'm wondering if this option is also free from the networking issue that I'm concerned about: if the API is deleted and then redeployed, wouldn't it be given a new endpoint URL that I would have to reconfigure in the API Gateway service?
I've tried
@wise-east yes, as long as you haven't changed the name of your API and you haven't run
Great! Thank you |
My API doesn't have to be running 24/7, at least until it becomes more widely used, so I would like to use the AWS InstanceScheduler to automatically start and stop EC2 instances to further reduce the costs of my API's deployment. I have my app working so that once its request to an endpoint with a GPU (the instance spun up with `cortex`) fails, it makes a request to a VM with only a CPU, so I can let the GPU instances go down when there is less demand for my API.

The instructions I picked up from this YouTube video for setting up the InstanceScheduler with EC2 instances seem pretty straightforward, but I'm worried that setting it up will somehow break the API's endpoints once the instances are stopped and started again. This concern is based on my previous experience, where I saw that an EC2 instance's Public DNS (IPv4) changed each time the instance was stopped and started again.
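The GPU-then-CPU fallback described above could look something like the following sketch; both endpoint URLs are placeholders, and the exact failure mode you catch (connection refused, timeout, DNS error) depends on how the instance goes down.

```python
import json
import urllib.error
import urllib.request

def predict(payload, gpu_url, cpu_url, timeout=5):
    """POST the payload to the GPU endpoint first; if that request fails
    (e.g. the instance is stopped), retry against the CPU-only endpoint."""
    data = json.dumps(payload).encode()
    for url in (gpu_url, cpu_url):
        req = urllib.request.Request(
            url, data=data, headers={"Content-Type": "application/json"}
        )
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return json.load(resp)
        except (urllib.error.URLError, OSError):
            continue  # endpoint unreachable: fall through to the next URL
    raise RuntimeError("both endpoints unavailable")
```

One caveat with this pattern: a cold request pays the full connection timeout before falling back, so you may want a short timeout on the GPU attempt.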
After spinning up the cluster, I see two instances in my EC2 dashboard: one for the cluster's operator node and one for the spot node. I know that `cortex` is doing a lot of heavy lifting behind the scenes to configure different services, so I'm afraid that naively applying the scheduler tags I set up with the AWS InstanceScheduler to these EC2 instances will break my API deployment.

I don't think `cortex` has a way to automatically schedule spinning clusters (or particular EC2 instances within a cluster) up and down yet, so I think using this service is the only way to go. Although I am running some tests of my own using `cortex` with the AWS InstanceScheduler, I would appreciate any guidance on correctly incorporating it with instances created by `cortex`.
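For reference, the InstanceScheduler side of the experiment is just tagging: the scheduler starts and stops instances whose tag (named `Schedule` by default) matches a period defined in the scheduler's configuration. A minimal sketch, assuming a period name of your choosing and the default tag key; as discussed above, tagging cortex-managed nodes directly may conflict with the cluster autoscaler, so treat this as an experiment rather than a recommendation.

```python
def schedule_tag(period_name, tag_key="Schedule"):
    """Build the tag dict the AWS Instance Scheduler matches on.

    "Schedule" is the solution's default tag key; the value names a
    schedule you defined when configuring the scheduler.
    """
    return {"Key": tag_key, "Value": period_name}

def tag_instance(instance_id, period_name, region="us-west-2"):
    """Attach the schedule tag to an EC2 instance.

    boto3 is imported lazily so the helper above stays testable offline.
    Instance ID, period name, and region are placeholders.
    """
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    ec2.create_tags(Resources=[instance_id], Tags=[schedule_tag(period_name)])
```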