
feat: add cache_control support for Claude models in OpenAI provider#3334

Closed
HikaruEgashira wants to merge 1 commit into block:main from HikaruEgashira:cache_control_openai

Conversation

Contributor

@HikaruEgashira HikaruEgashira commented Jul 10, 2025

Adds conditional cache_control support to the OpenAI provider when the model name contains "claude", enabling prompt caching for LiteLLM and other OpenAI-compatible services.

Fixes #3333
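
Roughly, the change injects an Anthropic-style cache_control marker into the request body whenever the model name contains "claude". A minimal sketch of the idea, assuming a serde_json payload in the usual chat/completions shape (the helper name and exact placement here are illustrative, not the actual diff):

use serde_json::{json, Value};

// Illustrative helper (not the actual patch): mark the system message with
// Anthropic-style cache_control so LiteLLM and similar proxies can enable
// prompt caching for Claude models.
fn add_cache_control(payload: &mut Value) {
    let Some(messages) = payload.get_mut("messages").and_then(Value::as_array_mut) else {
        return;
    };
    for message in messages.iter_mut() {
        if message.get("role").and_then(Value::as_str) == Some("system") {
            // LiteLLM expects content as a list of blocks when cache_control is set.
            if let Some(text) = message.get("content").and_then(Value::as_str).map(String::from) {
                message["content"] = json!([{
                    "type": "text",
                    "text": text,
                    "cache_control": { "type": "ephemeral" }
                }]);
            }
        }
    }
}

The guard for when to apply it is the model-name check quoted later in the thread: self.model.model_name.to_lowercase().contains("claude").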

@HikaruEgashira HikaruEgashira force-pushed the cache_control_openai branch 3 times, most recently from af5f819 to a513be5 Compare July 10, 2025 03:59
@michaelneale
Collaborator

@HikaruEgashira thanks for this - can you run cargo fmt / check clippy etc? Also, will this work specifically with openrouter? What provider are you testing with that uses the openai provider in this case?

@michaelneale michaelneale self-assigned this Jul 10, 2025
@michaelneale michaelneale added the p1 (Priority 1 - High, supports roadmap) and waiting labels Jul 10, 2025
Signed-off-by: HikaruEgashira <account@egahika.dev>
@michaelneale
Collaborator

/// Update the request when using an anthropic model.
/// For anthropic models, we can enable prompt caching to save cost. Since openrouter is the OpenAI-compatible
/// endpoint, we need to modify the OpenAI request to include the anthropic cache control field.

Does openrouter not do this, given it is offering the openai-compatible api? That seems really odd if it doesn't - why would they offer an openai-like api otherwise?

@michaelneale
Collaborator

any chance we can confirm before/after caching with openai to know it does need this header? (still seems odd to me)

Contributor Author

HikaruEgashira commented Jul 10, 2025

I tested this with LiteLLM. This field does not work with pure OpenAI, but most OpenAI-compatible services will honor it. Here is the usage from the system log.

before (1.0.29)

\"model\": \"anthropic.claude-3-5-haiku-20241022-v1:0\",\n  \"object\": \"chat.completion\",\n  \"system_fingerprint\": null,\n  \"usage\": {\n    \"cache_creation_input_tokens\": 0,\n    \"cache_read_input_tokens\": 0,\n    \"completion_tokens\": 554,\n    \"completion_tokens_details\": null,\n    \"prompt_tokens\": 5902,\n    \"prompt_tokens_details\": {\n      \"audio_tokens\": null,\n      \"cached_tokens\": 0\n    },\n    \"total_tokens\": 6456\n  }\n}","input_tokens":"5902","output_tokens":"554","total_tokens":"6456"},"target":"goose::providers::utils","span":{"name":"complete"},"spans":[{"name":"complete"}]}

after

\"model\": \"anthropic.claude-3-5-haiku-20241022-v1:0\",\n  \"object\": \"chat.completion\",\n  \"system_fingerprint\": null,\n  \"usage\": {\n    \"cache_creation_input_tokens\": 0,\n    \"cache_read_input_tokens\": 7593,\n    \"completion_tokens\": 68,\n    \"completion_tokens_details\": null,\n    \"prompt_tokens\": 8025,\n    \"prompt_tokens_details\": {\n      \"audio_tokens\": null,\n      \"cached_tokens\": 7593\n    },\n    \"total_tokens\": 8093\n  }\n}","input_tokens":"8025","output_tokens":"68","total_tokens":"8093"},"target":"goose::providers::utils","span":{"name":"complete"},"spans":[{"name":"complete"}]}

https://docs.litellm.ai/docs/completion/prompt_caching
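
For anyone else wanting to reproduce the comparison, a small sketch of reading those counters out of the response body (field names exactly as they appear in the logs above; the function itself is just for illustration):

use serde_json::Value;

// Illustrative only: pull the cache counters out of a chat.completion
// response so a before/after comparison like the one above is easy to read.
fn report_cache_usage(response: &Value) {
    let usage = &response["usage"];
    let creation = usage["cache_creation_input_tokens"].as_u64().unwrap_or(0);
    let read = usage["cache_read_input_tokens"].as_u64().unwrap_or(0);
    let prompt = usage["prompt_tokens"].as_u64().unwrap_or(0);
    println!("prompt_tokens={prompt} cache_creation={creation} cache_read={read}");
}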

create_request(&self.model, system, messages, tools, &ImageFormat::OpenAi)?;

// Add cache_control for claude models (LiteLLM and other OpenAI-compatible services)
if self.model.model_name.to_lowercase().contains("claude") {
Contributor

Shouldn't we add a method to the Provider trait for this instead?
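
If it helps picture that, one possible shape (hypothetical names, not the actual goose Provider trait):

// Hypothetical sketch only: a default method that providers override, instead
// of string-matching on the model name inside the OpenAI provider itself.
trait PromptCachingHint {
    fn wants_cache_control(&self) -> bool {
        false
    }
}

A provider that fronts Claude models would override this to return true, and the shared request builder would check it before injecting cache_control.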

@michaelneale
Collaborator

I think if liteLLM offers an openai api, it should include things like a caching abstraction on it - that should be part of its job.

Collaborator

@michaelneale michaelneale left a comment

I think we should have a liteLLM provider specifically - we shouldn't be changing the openai provider for specific middleware routers that are lacking features (but if we have a liteLLM provider, that would be ideal - and we can leave the openai one as it is).

This could be done by cloning the openai provider and adding in liteLLM-specific code (including things like this), which is a better experience as well, as there will likely be other liteLLM-specific things - liteLLM is important enough, I think, to justify this.
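
A rough outline of that direction, assuming the new provider starts as a copy of the openai one (names below are illustrative, not existing goose code):

// Illustrative skeleton: a dedicated LiteLLM provider that keeps
// LiteLLM-specific behaviour (like cache_control for claude models)
// out of the shared OpenAI provider.
struct LiteLlmProvider {
    model_name: String,
}

impl LiteLlmProvider {
    fn needs_cache_control(&self) -> bool {
        // Model names routed through LiteLLM look like
        // "anthropic.claude-3-5-haiku-20241022-v1:0" in the logs above.
        self.model_name.to_lowercase().contains("claude")
    }
}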

@michaelneale michaelneale added the status: backlog label and removed the p1 (Priority 1 - High, supports roadmap) label Jul 11, 2025
@HikaruEgashira
Contributor Author

OK! I'll add a new provider. Thanks for reviewing.
