Scale to zero #9
Scale to zero is really important for small projects that don't need 24/7 compute running, and especially contractor work. Beyond that, there are microservices that don't need to be running all the time, and side projects where someone wants a full container and is willing to deal with cold starts (like Lambda, but without being constrained to API Gateway or Lambda utilities in the container). Scale to 0 is the only thing that prevents me from using it.
This could be used very well for light batch jobs as well if it could scale to 0.
It scales down to just $0.007 per GB-hour if the application is idle, as far as I can tell, with no vCPU cost. Or there is a PauseService API call to eliminate that cost too. If you had a batch job, you could call ResumeService at the start and PauseService at the end?
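For what it's worth, that pause/resume flow around a batch job can be scripted with the AWS CLI; a rough sketch (the service ARN below is a placeholder):

```shell
# Placeholder ARN; substitute your own App Runner service's ARN.
SERVICE_ARN="arn:aws:apprunner:us-east-1:123456789012:service/my-batch-api/0123456789abcdef"

# Wake the service before the batch run...
aws apprunner resume-service --service-arn "$SERVICE_ARN"

# ...run the batch job against the service here...

# ...then pause it again so only provisioned memory is billed.
aws apprunner pause-service --service-arn "$SERVICE_ARN"
```

Note that pause/resume are asynchronous; you would still need to poll `aws apprunner describe-service` until the status is `RUNNING` before sending traffic.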
Yeah, but that's still $5/month more than Cloud Run, and if I'm using such a heavily managed service I wouldn't want to automate pausing and resuming myself (with batch it would be fine, but for an API it would be very difficult).
What should happen when a request is made while instances are at zero? Currently (if the service is paused) the root URL gives an HTTP 404 status code. Do you want the endpoint to hold off on responding for some time, to give an instance a chance to spawn and respond?
Yeah, basically a cold start similar to Lambda. That's how Cloud Run does it.
I'll also add that it would be really good for many users to not be forced to scale to zero like Cloud Run does. Like if we had some field to set "minimum containers", or something similar to Lambda provisioned concurrency, because there are some use cases where you never want a cold start.
@danthegoodman1 you can configure the minimum "provisioned" containers which stay active (paying for memory only, not CPU) - except you can only set that as low as 1. It would be nice if you could leave that at 0.
@danthegoodman1 You aren't forced to scale to zero in Cloud Run and can set minimum instances that even charge/run at the "idle rate".
This only used to be available if you were using Cloud Run for Anthos; didn't realize they updated it, thanks.
Yep, just wanted to make sure we still kept that feature just in case!
I would use scaling to zero for a dev stack. Also, if I want to show the new version of my app to a customer, they can then look at the app whenever they want.
Apart from smaller projects, scale to zero would be super useful for development workflows. Imagine many developers deploying code branches for testing. Right now, they would have to consciously deprovision the service when they are not working on it. With scale to zero, there would be no costs when they aren't working (= not sending requests to their personal deployment). And cold start latency isn't relevant in this scenario.
I'm surprised this hasn't been mentioned before, but a turnoff of GCP is that they don't have an "Amazon Aurora Serverless" equivalent to go along with Cloud Run. Scale-to-zero App Runner + Amazon Aurora Serverless would be a dream.
300 IQ right there, that's an awesome idea. DynamoDB would work too, but Aurora Serverless (v2 Postgres plz) would be a real separator.
Seconding that 300 IQ statement. We need that! GCP Cloud Run is ahead. Don't make us use their product.
This is also amazing for Slack bots, btw. I'd move our team Slack bot here in an instant if we could scale to 0.
@nelsonjchen @danthegoodman1 Why is it not yet possible to use App Runner + Aurora PostgreSQL Serverless? I am trying to deploy an API app on App Runner, and the API is supposed to connect to the Aurora PostgreSQL Serverless endpoint, but I can't get it to connect (locally or on App Runner). Does this mean it doesn't work?
@stephanoparaskeva make sure you're in the same VPC and have proper security groups and routing tables - or, in your case, it sounds like you need to enable public access if you're trying to test locally (although please don't do that in production).
@stephanoparaskeva Aurora Serverless can only be accessed from within the VPC via private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: #1
I don't think I said it wasn't possible. You likely have connectivity issues unrelated to this scale-to-zero issue, like @danthegoodman1 said.
Ah ok, so I should use a public DB for the time being.
Thanks for the swift response!
Could you take this conversation elsewhere? This issue is about scale to zero.
Just for clarification: you can use Aurora Serverless v1 without a VPC by using its Data API :) https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html
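As a sketch of that Data API route (the cluster and secret ARNs below are placeholders), a query goes over HTTPS rather than a VPC network path:

```shell
# Placeholder ARNs; the Data API authenticates via Secrets Manager,
# so no VPC connectivity or direct database connection is needed.
aws rds-data execute-statement \
  --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:my-serverless-cluster" \
  --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:my-db-credentials" \
  --database "mydb" \
  --sql "SELECT 1"
```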
I think there's some relevant discussion in aws/containers-roadmap#1017 on scale-to-zero for Fargate, which is probably applicable here too. I made a more detailed post re: Fargate, but summing up quickly: a few seconds (or maybe more) of cold-start latency would be OK for me, and I can send a ping to the service to mitigate that prior to hitting it full-force.
Just adding my 2 cents. Lots of back-office-like and data applications could not care less about latency. Being able to scale to zero would be amazing for many use cases. This could be a non-default, configurable feature, and the UI could say explicitly how scaling memory to zero affects latency.
I currently use Heroku for this use case - it scales to zero on free accounts. I have not tried it yet, but because Heroku runs on AWS it could have low latency to Aurora Serverless databases.
I think that "what about the cold start" argument is unrelated to the topic, because you'll expect cold starts in any service providing autoscaling, that's just the way programs work (especially ones running in a virtual machine). Our company usecase: we want our applications to be serverless, but we don't really want to change our development model to lambda one. We want to build traditional microservices, which could handle more than one request at a time, but we also want our dev and staging environment cost less when it is not used. |
If AppRunner doesn't scale to 0, why does mine show 0 instances when there is no traffic? Also my minimum is set to 1. |
I'm unsubscribing from this issue since I don't really run anything on AWS professionally anymore. That said, I do want to leave a funny anecdote about one competitor. fly.io doesn't have scale-to-zero containers; they still have the same problem as App Runner. They do have scale-to-zero EC2-like machines though! You can have a scale-to-zero Minecraft server - via AWS's own Firecracker too! Opposite world! How bizarre! They ain't no big cloud, but it's just funny to me. Hopefully AWS might see to it that they catch up to organizations using their own products to make scale-to-zero products.
Surprised to see no progress and no response. I've had better responses from core GCP Cloud team members in YouTube threads.
Subscribing; I would also be very interested in seeing this happen.
I really would like any scale-to-zero container service that isn't Lambda. I am willing to pay for from-cold delays, but not repeated Lambda delays.
Would really like scale to zero. I have App Runner instances for multiple environments (dev/staging/prod). Dev and staging are usually only needed when I'm actively developing or testing the next release. It would be nice not to be charged for these "idle" instances.
This would be a big deal coming from GCP Cloud Run!
And another year goes by and not a single reply from the AWS team... insane.
Hey AWS team, do you have any ETA? It seems this feature is needed by everyone.
2 years, 300+ reacts, and no response from AWS? Shocking. We may be forced to Google Cloud Run, as it clearly has more investment in it.
Ah $H!%. It's never a good sign to find an open issue with 300+ upvotes. Guess I gotta switch my approach 😆
@alexanderwink can you provide a link to that page in the docs?
Says so right on the pricing page: https://aws.amazon.com/apprunner/pricing/
This is still ~$5/month per instance with 1 GB of memory. Not that it's much, but it adds up quickly, especially for apps that are used once a month or less. My real need here is a cold standby mode.
Plus, in eu-central-1 the price is higher and becomes ~$6.40/GB. If you want 1 vCPU you're forced to reserve 2 GB RAM, which means ~$13/month just for the idle app.
Adding to what @gabrielboucher said: summing prod and staging environments, my company runs 250+ Streamlit dashboards, APIs and data pipelines that are infrequently accessed (anything from a few times a day to once monthly). Assuming our average container has 2 vCPUs + 2 GB of memory (and my math is correct), we are talking about ~$2,500 USD monthly just for provisioned workloads. The actual bill would be much higher, as each running container would cost ~$0.14 per hour. Nowadays we use Kubernetes + Knative + Karpenter with EC2 Spot Instances, and the equivalent part of our AWS bill is significantly cheaper.
This isn't meant to be the solution for what's being discussed in the last 10-ish updates in this thread, but if your application usage is so sparse (e.g. one hit a month), have you considered using Lambda with the Web Adapter? This blog post walks you through the why and the how. The TL;DR is that you would just need to add this entry to your Dockerfile to run your Python-based application unmodified in Lambda:
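The Dockerfile entry referenced above is not reproduced in the comment; based on the Lambda Web Adapter documentation, it is a single COPY line along these lines (the adapter version tag here is an assumption):

```dockerfile
# Copies the Lambda Web Adapter into the image as a Lambda extension;
# the rest of the Dockerfile stays as it is for a normal web app.
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.8.4 /lambda-adapter /opt/extensions/lambda-adapter
```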
I have used it here to demonstrate how to run the same unmodified container image in Fargate and Lambda. If this works for you, you basically have a scale-to-zero service that, at that level of usage, is going to be free given the generous Lambda free tier (assuming you are not using Lambda already and you are ok with experiencing a cold start). Again, I am not suggesting this is the solution for the requirement you are raising, but perhaps some of your use cases may be served by this approach.
For the record, my understanding is that Streamlit leverages WebSocket connections, and App Runner does not support them (yet).
@mreferre Thanks for the links, that's some really cool stuff! We've tried Lambda in the past, but the timeout limits + developer experience were show-stoppers for us. Might reconsider for the subset of our APIs that don't require long-running jobs.
I'm kind of confused by the pricing of App Runner. Idle instances are when I'm not getting any traffic - but when does it become idle?
When there is no incoming traffic to the application, there is a ramp-down window of about 60 seconds. After that you will see the active instances metric go down to 0. At this point you only pay for provisioned memory.
Got it, thanks!
400+ votes and this still isn't even being thought about... wow. I'd love to hear more from AWS on whether this will come soon. I'm guessing it won't, because it won't directly help AWS; they won't really gain anything from adding this.
Can we get an update on this please? Hoping this will be moved to "Researching" soon... @backnol-aws @snnles @jsheld @lazarben @scuw19 @amitgupta85 @akshayram-wolverine
I don't know anything, but just thinking about it, reading the comments, and watching the re:Invent videos on how they built and designed this, it might be that scaling to zero just isn't feasible in App Runner? They did go to the trouble of scaling down to zero CPU (but keeping the memory). The vaguely similar functionality they do provide (pause/unpause) takes about 40s to become active after being paused for my very simple app, which seems way too long for a cold start. It's frustrating when they don't give any feedback on things, but I'll be surprised if they are able to pivot to provide true scale to zero (although it would be great).
I like that people like @efekarakus and @iamhopaul123 are working on AWS Copilot; I hope these brilliant people could work on AWS App Runner roadmap items like HIPAA and scale to zero.
I have a serious side-project habit - I often have dozens of side projects on the go at once.
As such, I really appreciate scale-to-zero services like Google Cloud Run and Vercel, where if my project isn't getting any traffic at all it costs me nothing (or just a few cents a month in storage costs) - then it spins up a server when a request comes in, with a cold-start delay of a few seconds before it starts serving traffic.
I would love it if App Runner could do this! It looks like at the moment you have to pay for a minimum of one running instance.