diff --git a/README.md b/README.md index ee569d52..de290c47 100644 --- a/README.md +++ b/README.md @@ -54,6 +54,11 @@ LLM providers like Anthropic and OpenAI cache prompts based on exact prefix matc **Best use case:** Providers that count usage in requests, such as Github Copilot and Google Antigravity, have no negative price impact. +**Best use cases:** + +- **Request-based billing** — Providers that count usage in requests, such as Github Copilot and Google Antigravity, have no negative price impact. +- **Uniform token pricing** — Providers that bill cached tokens at the same rate as regular input tokens, such as Cerebras, see pure savings with no cache-miss penalty. + **Claude Subscriptions:** Anthropic subscription users (who receive "free" caching) may experience faster limit depletion than hit-rate ratios suggest due to the higher relative cost of cache misses. See [Claude Cache Limits](https://she-llac.com/claude-limits) for details. ## Configuration