Commit 30cda7d

Add guide for low-cost clusters (#1514)
1 parent bec6788 commit 30cda7d

2 files changed: +16 -0 lines changed
Diff for: docs/guides/low-cost-clusters.md

+15
@@ -0,0 +1,15 @@
+# Low-cost clusters
+
+_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
+
+Here are some tips for keeping costs down when running small clusters:
+
+* Consider using [spot instances](../cluster-management/spot-instances.md).
+
+* CPUs are cheaper than GPUs, so if there is low request volume and low latency is not critical, running on CPU instances will be more cost effective.
+
+* If traffic is low and you have multiple models, you may be able to save cost by serving all of your models from a single API, rather than using a separate API per model. This can be especially useful for GPU-based models. See our guide for [multi-model endpoints](multi-model.md).
+
+* If you need to have your cluster scale down to 0 API instances (the Cortex operator instance cannot be terminated), you must have `min_instances` set to 0 for your cluster, and no APIs can be running. Use `cortex get` to list your APIs, and `cortex delete <api_name>` to delete each one. After ~10 minutes, your cluster should scale down to 0 API instances.
+
+* By default, Cortex performs rolling updates on APIs, which means that during an update, additional instances may be required. If downtime during an update is acceptable, you can disable rolling updates. See [here](../troubleshooting/stuck-updating.md#disabling-rolling-updates) for instructions.
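
The scale-down tip in this guide (list APIs with `cortex get`, then run `cortex delete <api_name>` for each one) could be scripted roughly as follows. This is a minimal sketch and not part of the commit: the `delete_apis` helper and the `DRY_RUN` variable are hypothetical names introduced for illustration, and it assumes the `cortex` CLI is installed and configured for your cluster. The API names are passed in by the caller, since the guide does not describe a machine-readable output format for `cortex get`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: delete every API named on the command line so that,
# with min_instances set to 0, the cluster can scale down to 0 API instances.
# DRY_RUN defaults to 1, which only prints the commands that would be run.
delete_apis() {
  local api
  for api in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: show the command without contacting the cluster.
      echo "cortex delete ${api}"
    else
      # Real run: requires the cortex CLI on PATH.
      cortex delete "${api}"
    fi
  done
}
```

For example, after reading the API names from `cortex get`, running `delete_apis my-api another-api` prints the commands for review, and `DRY_RUN=0 delete_apis my-api another-api` executes them. The dry-run default is a cautious design choice, since deleting an API takes it offline.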

Diff for: docs/summary.md

+1
@@ -68,6 +68,7 @@
 * [Multi-model endpoints](guides/multi-model.md)
 * [View API metrics](guides/metrics.md)
 * [Running in production](guides/production.md)
+* [Low-cost clusters](guides/low-cost-clusters.md)
 * [Set up a custom domain](guides/custom-domain.md)
 * [Set up VPC peering](guides/vpc-peering.md)
 * [SSH into worker instance](guides/ssh-instance.md)
