feat: enable prompt cache for anthropic #631
Conversation
Desktop App for this PR: The following build is available for testing. The app is signed and notarized for macOS. After downloading, unzip the file and drag Goose.app to your Applications folder. This link is provided by nightly.link and will work even if you're not logged into GitHub.
Can you do a quick back of the envelope estimation of cost and savings for turning on prompt caching based on the Anthropic pricing?
Sure, I added one screenshot from the Anthropic post, along with my estimations.
Awesome, looks like we can expect some big savings here!
this is really cool @yingjiehe-xyz and I think people will appreciate this a lot.
I wonder if something similar exists for OpenRouter (as people like to use Anthropic that way), but yes, very nice!
Yes, it is available in OpenRouter: https://openrouter.ai/docs/prompt-caching. I am planning on this for the next step.
Enable prompt cache for Anthropic following https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#how-prompt-caching-works.
Generally, we add

```json
"cache_control": {"type": "ephemeral"}
```

into the tool, system, and message sections. A cache hit can be verified via the usage fields in the response. Currently, "ephemeral" is the only supported cache type, and it corresponds to a 5-minute cache lifetime.
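For illustration, here is a minimal sketch of such a request using the Anthropic Python SDK (not the Rust code from this PR); the model name and prompt text are placeholder assumptions:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Note: older API versions required the beta header
# "anthropic-beta: prompt-caching-2024-07-31"; current versions support caching natively.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed model; any cache-capable model works
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are Goose, a developer agent...",  # hypothetical system prompt
            # Mark this block as cacheable; "ephemeral" (5-minute TTL) is the
            # only supported cache type today.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Hello"}],
)

# Usage reports cached tokens separately: tokens written to the cache on this
# call vs. tokens served from the cache at the discounted rate.
print(response.usage.cache_creation_input_tokens,
      response.usage.cache_read_input_tokens)
```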
Cost saving: From the Anthropic pricing, cache writes cost 1.25× the base input token price and cache reads cost 0.1×. Assume a conversation of `N` turns, let `S` denote the system prompt length in tokens, and let `M` be the average number of new tokens (user inputs + outputs) per turn.
Before cache: our estimated cost is around

```
S + (S + M) + (S + M * 2) + ... + (S + M * (N - 1)) = S * N + (N - 1) * N / 2 * M
```

After cache: each turn's new tokens are written to the cache once (at 1.25× the base input price) and all previously sent tokens are read back from the cache (at 0.1×), so the estimated cost is around

```
(S + M * (N - 1)) * 1.25 + (S + (S + M) + ... + (S + M * (N - 1))) * 0.1
  = (S + M * (N - 1)) * 1.25 + (S * N + (N - 1) * N / 2 * M) * 0.1
```
To compare the two results, we need to compare `(S + M * (N - 1)) * 1.25` against `(S * N + (N - 1) * N / 2 * M) * 0.9`: if `(S + M * (N - 1)) * 1.25` is greater, then caching costs more, and vice versa. Normally our `S` is greater than 1000 tokens and `M` is large as well; since the uncached cost grows quadratically with `N` while the cache-write term grows only linearly, `(S + M * (N - 1)) * 1.25` should be much smaller, which means caching reduces our cost (see the back-of-the-envelope script below).

Some estimations from the Anthropic post are attached as a screenshot.
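To make the comparison concrete, here is a small back-of-the-envelope script over the formulas above; the values of `S`, `M`, and `N` are illustrative assumptions, not measurements from this PR:

```python
# Back-of-the-envelope comparison of the two cost formulas above.
S = 3000   # system prompt tokens (assumed)
M = 1000   # average new tokens per turn (assumed)
N = 10     # number of turns (assumed)

before = S * N + (N - 1) * N // 2 * M          # no caching
write  = (S + M * (N - 1)) * 1.25              # cache writes at 1.25x base price
read   = (S * N + (N - 1) * N // 2 * M) * 0.1  # cache reads at 0.1x base price
after  = write + read

print(f"before={before}, after={after:.0f}, saving={1 - after / before:.0%}")
# -> before=75000, after=22500, saving=70%
```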
Test with `just run-ui` and response verification:
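For reference, a cache hit shows up in the `usage` section of the API response; a sketch of the shape to look for (the numbers here are illustrative, not from this test run):

```json
{
  "usage": {
    "input_tokens": 42,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 11520,
    "output_tokens": 256
  }
}
```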