Why is min_replicas 0 not possible? #1775

Closed
dakshvar22 opened this issue Jan 11, 2021 · 13 comments
Labels
question Further information is requested

Comments

@dakshvar22

We are trying to deploy a text generation API on AWS. We do not expect the API to receive much traffic initially, so we would like to save on costs. My idea was to set min_replicas to 0 so that no instance sits idle while the API is receiving no traffic; as soon as a new request comes in, Cortex would spin up a new instance and shut it down once traffic drops back to 0.

However, I noticed that setting min_replicas to 0 is invalid. Isn't the above a valid use case for it? Also, is this a recent change? I vaguely (very vaguely) remember that this was possible in version 0.20 (please correct me if I'm wrong), but it seems it is not in 0.26.
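For reference, this is roughly where I expected to be able to set it; a minimal sketch of an API configuration (the API name, predictor path, and limits below are illustrative, and the exact keys may differ by version):

```yaml
# cortex.yaml (sketch; name, predictor path, and limits are placeholders)
- name: text-generator
  kind: RealtimeAPI
  predictor:
    type: python
    path: predictor.py
  autoscaling:
    min_replicas: 1  # setting this to 0 is rejected, which is what this issue is about
    max_replicas: 5
```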

cc @deliahu. I opened a new thread here because 1) it's a different issue than the other thread, and 2) other users might benefit from the conversation here.

@dakshvar22 dakshvar22 added the question Further information is requested label Jan 11, 2021
@deliahu
Member

deliahu commented Jan 11, 2021

@dakshvar22 thanks for reaching out regarding this.

Yes, your use case is a valid one; we have #445 to track supporting it. We have not supported this in the past; perhaps you are thinking of instances: we do support setting min_instances to 0, which allows the instances to be terminated when there are no deployed APIs.
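For reference, here is roughly where that setting lives in the cluster configuration; a minimal sketch with illustrative values:

```yaml
# cluster.yaml (sketch; cluster name, region, and limits are illustrative)
cluster_name: cortex
region: us-east-1
instance_type: g4dn.xlarge
min_instances: 0  # with no APIs deployed, all worker instances can be terminated
max_instances: 5
```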

@dakshvar22
Author

Ahh, I see min_instances in cluster.yaml. What's the difference between a replica and an instance?

@RobertLucian
Member

@dakshvar22 an API can have multiple replicas - in technical terms, pods that run your API as it is specified in your Cortex project. The more replicas there are, the more traffic your API can handle. Together, the replicas make up the API.

A cluster can have multiple instances (of type t3.medium, g4dn.xlarge, etc.). These are the cluster's nodes, and it is on them that the API replicas run. As traffic increases, Cortex increases the number of API replicas, which in turn can increase the number of cluster nodes (instances). The opposite happens when traffic decreases.

All in all, the replica term is used in the context of APIs, and the instance term is used in the context of the cluster. Does this clarify the situation?

@dakshvar22
Author

dakshvar22 commented Jan 11, 2021 via email

@RobertLucian
Member

@dakshvar22 almost. Setting min_instances to 0 can bring the cluster's number of nodes down to zero, but only once the underlying API(s) are also deleted, and deleting an API requires the user's intervention. That's because the minimum number of replicas an API can have is 1. Does this make sense to you?

@dakshvar22
Author

dakshvar22 commented Jan 11, 2021 via email

@RobertLucian
Member

Yes, you are correct; we need to address that ticket. Nonetheless, with a Lambda, you could schedule the API to run at specific times: the Lambda would deploy/delete the API whenever it is programmed to do so.

@dakshvar22
Author

dakshvar22 commented Jan 12, 2021 via email

@deliahu
Member

deliahu commented Jan 12, 2021

We don't currently have a timeline for this (we generally plan 2-4 weeks ahead). Since we have been focusing our recent efforts on production use cases, features, and integrations, scale-to-zero has not bubbled up in priority.

@dakshvar22
Author

@RobertLucian Just picking your brain further on this:

"with a Lambda, you could schedule the API to run at specific times. The Lambda would deploy/delete the API when programmed to do so."

I am not sure that solves our use case, because there isn't a specific time when we want the API to be up or down; it should depend on the traffic the API is receiving, so I am not sure we can program the Lambda to do so. Am I missing something?

@dakshvar22
Author

@deliahu Thanks for the update. This issue is something of a blocker for us. I am happy to contribute to the framework to make this possible if I can be pointed to what needs to change. Let me know what you think. :)

@deliahu
Member

deliahu commented Jan 13, 2021

@dakshvar22 Yes, we would be open to that, and happy to point you in the right direction!

It would probably be best to start with a quick chat to understand your use case and design the feature (we have a few proposals for it, which would provide different user experiences). Please email me at david@cortex.dev and we can find a time.

@deliahu
Member

deliahu commented Jan 20, 2021

I'll go ahead and close this issue, since we have #445 to track supporting scale-to-zero.

@deliahu deliahu closed this as completed Jan 20, 2021