
Appropriately incorporating the AWS InstanceScheduler #945


Closed
wise-east opened this issue Apr 5, 2020 · 5 comments
Labels
question Further information is requested

Comments

@wise-east

My API doesn't have to be running 24/7, at least not until it becomes more widely used, so I would like to use the AWS InstanceScheduler to automatically start and stop EC2 instances and further reduce my deployment costs. My app is set up so that when its request to the GPU endpoint (the instance spun up with cortex) fails, it falls back to a request to a VM with only a CPU, so I can let the GPU instances go down when there is less demand for my API.
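For what it's worth, the failover I described can be sketched in a few lines of shell, assuming one GPU endpoint and one CPU fallback endpoint (the URLs below are placeholders, not my real endpoints):

```shell
#!/bin/sh
# Hypothetical sketch of the GPU-to-CPU failover; URLs are placeholders.
GPU_URL="https://gpu.example.com/predict"   # cortex GPU endpoint (assumed)
CPU_URL="https://cpu.example.com/predict"   # CPU-only fallback VM (assumed)

predict() {
  # -f makes curl exit non-zero on HTTP errors, so the fallback fires when
  # the GPU instance is stopped or unhealthy
  curl -fsS --max-time 10 -d "$1" "$GPU_URL" \
    || curl -fsS -d "$1" "$CPU_URL"
}
```

So `predict '{"text": "hello"}'` tries the GPU endpoint first and only hits the CPU VM if that request fails.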

The instructions I picked up from this YouTube video for setting up the InstanceScheduler with EC2 instances seem pretty straightforward, but I'm worried that setting it up will somehow mess up the API's endpoints once the instances are stopped and started again. This concern is based on previous experience, where I saw an EC2 instance's Public DNS (IPv4) change each time it was stopped and started.

After spinning up the cluster, I see two instances in my EC2 dashboard: one for the cluster's operator node and one for the spot node. I know that cortex is doing a lot of the heavy lifting behind the scenes to configure different services, so I'm afraid that naively applying the tags for the scheduler I set up with AWS InstanceScheduler to these EC2 instances will break my API deployment.

I don't think cortex has a way to automatically schedule spinning clusters, or particular EC2 instances in a cluster, up and down yet, so I think using this service is the only way to go. Although I'm running some tests of my own using cortex with the AWS InstanceScheduler, I would appreciate any guidance on correctly incorporating it with instances created by cortex.

@wise-east added the question (Further information is requested) label on Apr 5, 2020
@deliahu
Member

deliahu commented Apr 5, 2020

I understand your motivation for this since your traffic is low (at least for now, hopefully that changes soon :) ). We have #445 to track our progress on supporting this natively.

Until then, there are a few possibilities I can think of. I have not tried using the AWS InstanceScheduler before; I think the networking would actually be ok, since the request enters through the ELB (which won't go down with the instance), and kubernetes automatically maintains routing configurations as nodes come in and out. However, my guess is that if you terminate the instance, a replacement will be immediately created, since there will be an unscheduled API replica and the cluster autoscaler will notice that and request a new instance to schedule it on.

I think the best option would be to run `cortex delete <api_name>` and `cortex deploy` when desired (either on a schedule or based on other events/metrics). If you set `min_instances` in your cluster to 0, then when you `cortex delete` your API, the instance will spin down (after a short delay). When you then run `cortex deploy`, it will request a new instance for the API again.
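Until it's supported natively, one simple way to put that on a schedule is a plain cron job on any machine with the cortex CLI configured. A sketch (`my-api` and the project path are placeholders):

```shell
# Hypothetical crontab entries (edit with `crontab -e`):
# stop the API at 22:00 UTC; with min_instances: 0 the instance then spins down
0 22 * * * cortex delete my-api
# redeploy at 06:00 UTC from the directory containing cortex.yaml
0 6 * * * cd /path/to/project && cortex deploy
```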

Another option is to programmatically modify the AWS auto scaling groups (for the workloads), setting max instances to 0 and then back to 1 (or more) as desired. The reason this isn't as good as the first option is that it's better to let cortex manage the auto scaling groups; otherwise some features, like on-demand backup for spot instances, may not work.
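As a rough sketch of what that could look like with the AWS CLI (the group name `cortex-workers` below is a placeholder; check the EC2 console for the actual name of the group cortex created):

```shell
#!/bin/sh
# Hypothetical helper: scale a cortex workload auto scaling group to a fixed
# size with the AWS CLI. Set DRY_RUN=1 to print the command instead of
# calling AWS (useful for inspecting it without credentials).
scale_asg() {
  name="$1"   # placeholder: the auto scaling group cortex created for workloads
  size="$2"   # 0 to spin the workers down, 1 (or more) to restore them
  set -- aws autoscaling update-auto-scaling-group \
      --auto-scaling-group-name "$name" \
      --min-size "$size" --max-size "$size" --desired-capacity "$size"
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
}
```

For example, `scale_asg cortex-workers 0` would scale the group to zero, and `scale_asg cortex-workers 1` would bring a worker back.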

Let us know if you'd like any help setting this up!

@wise-east
Author

wise-east commented Apr 5, 2020

> However, my guess is that if you terminate the instance, a replacement will be immediately created, since there will be an unscheduled API replica and the cluster autoscaler will notice that and request a new instance to schedule it on.

I'm glad I asked, and thank you for getting back to me so soon! I would've ended up wasting a lot of time otherwise.

> I think the best option would be to run `cortex delete <api_name>` and `cortex deploy` when desired (either on a schedule or based on other events/metrics). If you set `min_instances` in your cluster to 0, then when you `cortex delete` your API, the instance will spin down (after a short delay). When you then run `cortex deploy`, it will request a new instance for the API again.

This makes a lot of sense. I'll be looking forward to when this can be done natively, as mentioned in #445.

I'm wondering whether this option is also free of the networking issue I'm concerned about. If the API is deleted and then redeployed, wouldn't it be given a new URL endpoint that I'd have to reconfigure with the API Gateway service?

@wise-east
Author

I've tried `cortex delete` and `cortex deploy` and saw that the endpoint remained the same, at least for the one time that I tried it. Is it guaranteed to stay consistent as long as the config file, i.e. the `cortex.yaml` file, remains the same?

@deliahu
Member

deliahu commented Apr 5, 2020

@wise-east Yes, as long as you haven't changed the name of your API and you haven't run `cortex cluster down`, the endpoint will remain the same.

@wise-east
Author

Great! Thank you
