
Scale to zero #9

Open · simonw opened this issue May 19, 2021 · 63 comments

@simonw

simonw commented May 19, 2021

I have a serious side-project habit - I often have dozens of side projects on the go at once.

As such, I really appreciate scale-to-zero services like Google Cloud Run and Vercel, where if my project isn't getting any traffic at all it costs me nothing (or just a few cents a month in storage costs) - then it spins up a server when a request comes in, with a cold-start delay of a few seconds before it starts serving traffic.

I would love it if App Runner could do this! It looks like at the moment you have to pay for a minimum of one running instance.

@danthegoodman1

Scale to zero is really important for small projects that don't need 24/7 compute, and especially for contractor work. Beyond that, it helps microservices that don't need to be running all the time, and side projects where someone wants a full container and is willing to deal with cold starts (like Lambda, but without being constrained to API Gateway or Lambda utilities in the container).

Scale to 0 is the only thing that prevents me from using it.

@danthegoodman1

This could also work very well for light batch jobs if it could scale to 0.

@timanderson

As far as I can tell, it scales down to just $0.007 per GB-hour when the application is idle, with no vCPU cost. Or there is a PauseService API call to eliminate that cost too. If you had a batch job, you could call ResumeService at the start and PauseService at the end?
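A minimal sketch of that resume/run/pause pattern with boto3 (the service ARN and `run_batch_job` are placeholders, and the polling interval is an arbitrary choice; PauseService and ResumeService are asynchronous, so the sketch polls DescribeService until the service reaches the desired status):

```python
import time

import boto3

apprunner = boto3.client("apprunner")
SERVICE_ARN = "arn:aws:apprunner:us-east-1:123456789012:service/my-batch-app/xyz"  # placeholder

def run_batch_job():
    # hypothetical placeholder for the actual batch work
    print("running batch job")

def wait_for_status(target, delay=15):
    # Pause/resume are asynchronous, so poll DescribeService until
    # the service reaches the desired status (e.g. RUNNING or PAUSED).
    while apprunner.describe_service(ServiceArn=SERVICE_ARN)["Service"]["Status"] != target:
        time.sleep(delay)

apprunner.resume_service(ServiceArn=SERVICE_ARN)
wait_for_status("RUNNING")
try:
    run_batch_job()
finally:
    # Pause again even if the job fails, so you stop paying for memory.
    apprunner.pause_service(ServiceArn=SERVICE_ARN)
```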

@danthegoodman1

Yeah, but that's still $5/month more than Cloud Run, and if I'm using such a heavily managed service I wouldn't want to automate pausing and resuming myself (for batch it would be fine, but for an API it would be very difficult).

@Munawwar

What should happen when a request arrives while instances are at zero? Currently (if the service is paused) the root URL returns an HTTP 404 status code. Do you want the endpoint to hold the request for a while, to give an instance a chance to spawn and respond?

@danthegoodman1

> What should happen when a request arrives while instances are at zero? Currently (if the service is paused) the root URL returns an HTTP 404 status code. Do you want the endpoint to hold the request for a while, to give an instance a chance to spawn and respond?

Yeah, basically a cold start similar to Lambda. That's how Cloud Run does it.

@danthegoodman1

I'll also add that it would be really good for many users not to be forced to scale to zero the way Cloud Run does. It would help to have a field to set "minimum containers", or something similar to Lambda provisioned concurrency, because there are some use cases where you never want a cold start.

@mwarkentin

@danthegoodman1 you can configure the minimum "provisioned" containers which stay active (paying for memory only, not CPU) - except you can only set that to >= 1.

It would be nice if you could leave that at 1 if you wanted to remove the cold start, or set it to 0 if you wanted to optimize for costs and were OK with some latency when the first request comes into the system after it has scaled down.

@nelsonjchen

@danthegoodman1 You aren't forced to scale to zero in Cloud Run and can set minimum instances that even charge/run at the "idle rate":

https://cloud.google.com/run/docs/configuring/min-instances

@danthegoodman1

> @danthegoodman1 You aren't forced to scale to zero in Cloud Run and can set minimum instances that even charge/run at the "idle rate":
> https://cloud.google.com/run/docs/configuring/min-instances

This used to be available only if you were using Cloud Run for Anthos; I didn't realize they updated it, thanks.

@danthegoodman1

> @danthegoodman1 you can configure the minimum "provisioned" containers which stay active (paying for memory only, not CPU) - except you can only set that to >= 1.
>
> It would be nice if you could leave that at 1 if you wanted to remove the cold start, or set it to 0 if you wanted to optimize for costs and were OK with some latency when the first request comes into the system after it has scaled down.

Yep, just wanted to make sure we still keep that feature, just in case!

@flibustenet

I would use scale to zero for dev stacks. Also, if I want to show the new version of my app to a customer, they can then look at the app whenever they want.

@486

486 commented May 19, 2021

Apart from smaller projects, scale to zero would be super useful for development workflows. Imagine many developers deploying code branches for testing. Right now, they would have to consciously deprovision the service when they are not working on it.

With scale to zero, there would be no costs when they aren’t working (= not sending requests to their personal deployment). And cold start latency isn’t relevant in this scenario.

@nelsonjchen

nelsonjchen commented May 20, 2021

I'm surprised this hasn't been mentioned before, but a turnoff of GCP is that they don't have an "Amazon Aurora Serverless" equivalent to go along with Cloud Run.

Scale to Zero App Runner + Amazon Aurora Serverless would be a dream.

@danthegoodman1

> I'm surprised this hasn't been mentioned before, but a turnoff of GCP is that they don't have an "Amazon Aurora Serverless" equivalent to go along with Cloud Run.
>
> Scale to Zero App Runner + Amazon Aurora Serverless would be a dream.

300 IQ right there, that's an awesome idea. DynamoDB would work too, but Aurora Serverless (v2 Postgres please) would be a real differentiator.

@tomaszdudek7

Seconding that 300 IQ statement. We need that! GCP Cloud Run is ahead. Don't make us use their product.

@danthegoodman1

This is also amazing for Slack bots btw. I'd move our team Slack bot here in an instant if we could scale to 0.

@stephanoparaskeva

@nelsonjchen @danthegoodman1 Why is it not yet possible to use App Runner + Aurora PostgreSQL Serverless?

I am trying to deploy an API on App Runner, and the API is supposed to connect to the Aurora PostgreSQL Serverless endpoint, but I can't get it to connect (locally or on App Runner). Does this mean it doesn't work?

@danthegoodman1

@stephanoparaskeva Make sure you're in the same VPC and have proper security groups and routing tables. In your case, since you're trying to test locally, it sounds like you need to enable public access (although please don't do that in production).

@pavelsource

pavelsource commented Jun 6, 2021

@stephanoparaskeva Aurora Serverless can only be accessed from within a VPC via a private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: #1

@nelsonjchen

> @nelsonjchen @danthegoodman1 Why is it not yet possible to use App Runner + Aurora PostgreSQL Serverless?
>
> I am trying to deploy an API on App Runner, and the API is supposed to connect to the Aurora PostgreSQL Serverless endpoint, but I can't get it to connect (locally or on App Runner). Does this mean it doesn't work?

I don't think I said it wasn't possible. You likely have connectivity issues unrelated to this scale-to-zero issue, like @danthegoodman1 said.

@stephanoparaskeva

> @stephanoparaskeva Aurora Serverless can only be accessed from within a VPC via a private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: #1

Ah OK, so I should use a public DB for the time being.

  • Once VPC support is released, if both App Runner and Aurora are in the same VPC, should it just connect via endpoint + user + password?

  • Also, how does one connect to Aurora from a locally running version of the API (is this possible)?

Thanks for the swift response!

@nelsonjchen

> @stephanoparaskeva Aurora Serverless can only be accessed from within a VPC via a private IP. App Runner does not support VPC integration yet; however, it is on the roadmap: #1
>
> Ah OK, so I should use a public DB for the time being.
>
> • Once VPC support is released, if both App Runner and Aurora are in the same VPC, should it just connect via endpoint + user + password?
>
> • Also, how does one connect to Aurora from a locally running version of the API (is this possible)?
>
> Thanks for the swift response!

Could you take this conversation elsewhere? This issue is about scale to zero.

@toricls

toricls commented Jun 6, 2021

Just for clarification: you can use Aurora Serverless v1 without a VPC by using its Data API :)

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html
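A minimal sketch of a Data API call with boto3 (the cluster ARN, secret ARN, and database name are placeholders); the query travels over HTTPS, so no VPC connectivity is needed:

```python
import boto3

# The Data API speaks HTTPS, so the caller needs no network path into the VPC.
rds_data = boto3.client("rds-data")

response = rds_data.execute_statement(
    resourceArn="arn:aws:rds:us-east-1:123456789012:cluster:my-cluster",           # placeholder
    secretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:my-db-creds",  # placeholder
    database="postgres",                                                           # placeholder
    sql="SELECT now()",
)
print(response["records"])
```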

@JonMarbach

I think there's some relevant discussion in aws/containers-roadmap#1017 on scale-to-zero for Fargate, which is probably applicable here too. I made a more detailed post re: Fargate, but summing up quickly: a few seconds (or maybe more) of cold-start latency would be OK for me, and I can send a ping to the service to mitigate that before hitting it full-force.
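A rough sketch of that warm-up ping, assuming a hypothetical service URL and an arbitrary retry budget:

```python
import time
import urllib.request

SERVICE_URL = "https://example.awsapprunner.com/"  # hypothetical endpoint

def warm_up(url, attempts=10, delay=3):
    # Keep pinging until the service answers with a success status,
    # absorbing the cold start before any real traffic is sent.
    for _ in range(attempts):
        try:
            urllib.request.urlopen(url, timeout=10)
            return True
        except OSError:
            time.sleep(delay)  # error or timeout: instance may still be spinning up
    return False

if warm_up(SERVICE_URL):
    print("warm - safe to send real traffic")
```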

@CarlosDomingues

CarlosDomingues commented Jan 14, 2022

Just adding my 2 cents: lots of back-office and data applications couldn't care less about latency. Being able to scale to zero would be amazing for many use cases. This could be a non-default, configurable feature, and the UI could state explicitly how scaling memory to zero affects latency.

@iBobik

iBobik commented Jan 31, 2022

I currently use Heroku for this use case - it scales to zero on free accounts.

I have not tried it yet, but because Heroku runs on AWS, it could have low latency to Aurora Serverless databases.

@siviae

siviae commented Feb 8, 2022

I think the "what about the cold start" argument is unrelated to the topic, because you should expect cold starts in any service providing autoscaling; that's just the way programs work (especially ones running in a virtual machine).

Our company's use case: we want our applications to be serverless, but we don't really want to switch our development model to Lambda's. We want to build traditional microservices that can handle more than one request at a time, but we also want our dev and staging environments to cost less when they are not used.

@phishy

phishy commented Mar 22, 2022

If App Runner doesn't scale to 0, why does mine show 0 instances when there is no traffic? Also, my minimum is set to 1.

@nelsonjchen

I'm unsubscribing from this issue since I don't really run anything on AWS professionally anymore. That said, I do want to leave a funny anecdote about one competitor.

Fly.io doesn't have scale-to-zero containers; they still have the same problem as App Runner. They do have scale-to-zero EC2-like machines though! You can have a scale-to-zero Minecraft server - via AWS's own Firecracker, too. Opposite world! How bizarre!

They're no big cloud, but it's just funny to me. Hopefully AWS will see to it that they catch up to the organizations using AWS's own products to build scale-to-zero products.

@algoflows

Surprised to see no progress and no response. I've had better responses from core GCP Cloud team members on YouTube threads.

@iomarcovalente

Subscribing; I would also be very interested in seeing this happen.

@jdrphillips

jdrphillips commented Feb 8, 2023

I would really like any scale-to-zero container service that isn't Lambda. I am willing to pay the from-cold delay, but not repeated Lambda delays.

@suzukieng

Would really like scale to zero. I have App Runner instances for multiple environments (dev/staging/prod). Dev and staging are usually only needed when I'm actively developing or testing the next release. It would be nice not to be charged for these "idle" instances.

@ebg1223

ebg1223 commented Jun 20, 2023

This would be a big deal coming from GCP Cloud Run!

@algoflows

And another year goes by and not a single reply from the AWS team... insane.

@atali

atali commented Jun 20, 2023

Hey AWS team, do you have any ETA? It seems this feature is needed by everyone.

@cade-coreschedule

2 years, 300+ reactions, and no response from AWS? Shocking. We may be forced to move to Google Cloud Run, as it clearly has more investment behind it.

@tornikeo

Ah $H!%. It's never a good sign to find an open issue with 300+ upvotes. Guess I gotta switch my approach 😆

@alexanderwink

alexanderwink commented Aug 17, 2023

Just to let everyone know: when the app doesn't receive any new requests for a while, the CPU throttles down to close to zero. At that point you pay only for the provisioned memory, not for CPU. This is equivalent to scaling down to zero with a warm standby. As soon as a new request comes in, the CPU throttles back up and you start paying for CPU as well as memory.


This is described in the documentation:

> When your application is deployed, you pay for the memory provisioned in each container instance. Keeping your container instance's memory provisioned when your application is idle ensures it can deliver consistently low millisecond latency.
>
> When your application is processing requests, you switch from provisioned container instances to active container instances that consume both memory and compute resources. You pay for the compute and any additional memory consumed in excess of the memory allocated by your provisioned container instances.

@benkehoe

@alexanderwink can you provide a link to that page in the docs?

@alexanderwink

Says so right on the pricing page https://aws.amazon.com/apprunner/pricing/

@gabrielboucher

> Just to let everyone know: when the app doesn't receive any new requests for a while, the CPU throttles down to close to zero. At that point you pay only for the provisioned memory, not for CPU.

This is still ~$5/month per instance with 1 GB of memory. Not that it's much, but it adds up quickly, especially for apps that are used once a month or less.

My real need here is a cold standby mode.

@matteocontrini

> This is still ~$5/month per instance with 1 GB of memory.

Plus, in eu-central-1 the price is higher, at ~$6.4/GB per month. If you want 1 vCPU you're forced to reserve 2 GB of RAM, which means ~$13/month just for the idle app.
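A quick sanity check of those monthly figures, assuming a 730-hour month and the $0.007/GB-hour idle rate quoted earlier in the thread (the eu-central-1 number below is taken from the ~$6.4/GB-month figure above, not from the price list):

```python
HOURS = 730  # assumed average hours per month

# us-east-1: 1 GB at the $0.007/GB-hour idle rate
print(1 * 0.007 * HOURS)  # -> ~5.1 $/month, matching the "~$5/month" above

# eu-central-1: 2 GB at ~$6.4/GB-month
print(2 * 6.4)            # -> ~12.8 $/month, matching the "~$13/month" above
```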

@iBobik

iBobik commented Aug 17, 2023

Would you use a service that can start your server like this? Or, instead of showing a button, it could start the server on every request and then stop it after X minutes of inactivity. It could start/stop an instance, an App Runner service, a container, ...

start.page.moqup.mov

If yes, write me an e-mail about your use case. I'm considering building it, but I don't want to work on it only for myself. :-)

@CarlosDomingues

CarlosDomingues commented Aug 21, 2023

Adding to what @gabrielboucher said:

Summing prod and staging environments, my company runs 250+ Streamlit dashboards, APIs, and data pipelines that are infrequently accessed (anything from a few times a day to once a month).

Assuming our average container has 2 vCPUs + 2 GB of memory (and my math is correct), we are talking about ~$2,500 monthly just for provisioned workloads. The actual bill would be much higher, as each running container would cost ~$0.14 per hour.

Nowadays we use Kubernetes + Knative + Karpenter with EC2 Spot Instances, and the equivalent part of our AWS bill is significantly cheaper.
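As a rough check of that estimate (assuming the $0.007/GB-hour us-east-1 idle rate quoted earlier and a 730-hour month):

```python
instances = 250
gb_per_instance = 2
idle_rate = 0.007  # $/GB-hour, the us-east-1 idle rate quoted earlier
hours = 730        # assumed average hours per month

print(instances * gb_per_instance * idle_rate * hours)  # -> 2555.0 $/month
```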

@mreferre

This isn't meant to be the solution for what's being discussed in the last 10-ish updates in this thread, but if your application usage is that sparse (e.g. one hit a month), have you considered using Lambda with the Web Adapter? This blog post walks you through the why and the how.

The TL;DR is that you would just need to add this entry to your Dockerfile to run your Python-based application unmodified in Lambda:

COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.7.0 /lambda-adapter /opt/extensions/lambda-adapter

I have used it here to demonstrate how to run the same unmodified container image in Fargate and Lambda.

If this works for you, you basically have a scale-to-zero service that, at that level of usage, is going to be free given the generous Lambda free tier (assuming you are not using Lambda already and you are OK with experiencing a cold start).

Again, I am not suggesting this is the solution for the requirement you are raising, but perhaps some of your use cases may be served by this approach.

@mreferre

mreferre commented Aug 21, 2023

> Adding to what @gabrielboucher said:
>
> Summing prod and staging environments, my company runs 250+ Streamlit dashboards, APIs, and data pipelines that are infrequently accessed (anything from a few times a day to once a month).
>
> Assuming our average container has 2 vCPUs + 2 GB of memory (and my math is correct), we are talking about ~$2,500 monthly just for provisioned workloads. The actual bill would be much higher, as each running container would cost ~$0.14 per hour.

For the record, my understanding is that Streamlit relies on WebSocket connections, and App Runner does not support them (yet).

See here and here.

@CarlosDomingues

@mreferre Thanks for the links, that's some really cool stuff!

We've tried Lambda in the past, but the timeout limits + developer experience were showstoppers for us. We might reconsider it for the subset of our APIs that don't require long-running jobs.

@anandhu-renie

I'm kinda confused by App Runner's pricing. Idle instances are when I'm not getting any traffic, but when exactly does an instance become idle?

@alexanderwink

> I'm kinda confused by App Runner's pricing. Idle instances are when I'm not getting any traffic, but when exactly does an instance become idle?

When there is no incoming traffic to the application, there is a ramp-down window of about 60 seconds. After that you will see the active instances metric go down to 0. At that point you only pay for provisioned memory.
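A small sketch for watching that ramp-down via CloudWatch with boto3. The AWS/AppRunner namespace and ActiveInstances metric match the metric mentioned above as I understand it, but the dimension names and values below are assumptions/placeholders, so check what your service actually publishes:

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# Pull the ActiveInstances metric for the last hour at 1-minute resolution;
# it should drop to 0 roughly 60 seconds after traffic stops.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/AppRunner",
    MetricName="ActiveInstances",
    Dimensions=[
        {"Name": "ServiceName", "Value": "my-service"},  # assumed dimension, placeholder value
        {"Name": "ServiceID", "Value": "abc123"},        # assumed dimension, placeholder value
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=60,
    Statistics=["Average"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])
```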

@anandhu-renie

Got it, thanks!

@TreyWW

TreyWW commented Mar 20, 2024

400+ votes and this still isn't even being considered... wow. I'd love to hear more from AWS on whether this will come soon. I'm guessing it won't, because it wouldn't directly help AWS; they wouldn't really gain anything from adding this.

@wesmontgomery

Can we get an update on this please? Hoping this will be moved to "Researching" soon...

@backnol-aws @snnles @jsheld @lazarben @scuw19 @amitgupta85 @akshayram-wolverine

@larryjkl

I don't know anything, but just thinking about it, reading the comments, and watching the re:Invent videos on how they built and designed this, it might be that scaling to zero just isn't feasible in App Runner? They did go to the trouble of scaling down to zero CPU (but keeping the memory). The vaguely similar functionality they do provide (pause/unpause) took about 40s to become active again after being paused for my very simple app, which seems way too long for a cold start. It's frustrating when they don't give any feedback on things, but I'll be surprised if they are able to pivot to true scale to zero (although it would be great).

@masterbater

I like the people working on AWS Copilot, @efekarakus and @iamhopaul123. I hope these brilliant people can work on AWS App Runner roadmap items like HIPAA and scale to zero.
