feat: add cache_control support for Claude models in OpenAI provider #3334
HikaruEgashira wants to merge 1 commit into block:main
Conversation
Force-pushed af5f819 to a513be5
@HikaruEgashira thanks for this - can you run cargo fmt, check clippy, etc.? Also, will this work specifically with OpenRouter? What provider are you testing it with that uses the OpenAI provider in this case?
Signed-off-by: HikaruEgashira <account@egahika.dev>
Force-pushed a513be5 to 6dc4e7e
Does OpenRouter not do this, as it is offering the OpenAI-compatible API? That seems really odd if they don't - why would they offer an OpenAI-like API?
Any chance we can confirm caching before/after with OpenAI, to know it does need this header? (Still seems odd to me.)
I tested this with LiteLLM. This field does not work with pure OpenAI, but most OpenAI-compatible services will honor it. I've attached the token usage from the system log, before (1.0.29) and after.
```rust
create_request(&self.model, system, messages, tools, &ImageFormat::OpenAi)?;

// Add cache_control for claude models (LiteLLM and other OpenAI-compatible services)
if self.model.model_name.to_lowercase().contains("claude") {
```
Shouldn't we add a method to the Provider trait for this instead?
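As a minimal sketch of that suggestion (the method name `supports_cache_control` and the `LiteLlmProvider` struct are hypothetical, not existing goose APIs): each provider could opt in via a default-off trait method, instead of hard-coding a model-name check inside the OpenAI provider.

```rust
// Hypothetical sketch of a Provider trait hook for prompt caching.
// `supports_cache_control` and `LiteLlmProvider` are illustrative names,
// not part of goose's actual Provider trait.
trait Provider {
    /// Whether requests to this provider should carry Anthropic-style
    /// `cache_control` markers. Defaults to off, so the plain OpenAI
    /// provider is unaffected.
    fn supports_cache_control(&self) -> bool {
        false
    }
}

struct LiteLlmProvider {
    model_name: String,
}

impl Provider for LiteLlmProvider {
    fn supports_cache_control(&self) -> bool {
        // LiteLLM passes cache_control through to Anthropic, so enable it
        // only when the routed model is a Claude model.
        self.model_name.to_lowercase().contains("claude")
    }
}
```

The request-building code would then ask the provider rather than inspecting model names itself.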
I think if LiteLLM offers an OpenAI API, that should include things like a caching abstraction on it - that should be part of its job.
michaelneale left a comment
I think we should have a LiteLLM provider specifically - we shouldn't be changing the OpenAI provider for specific middleware routers that are lacking features. Having a LiteLLM provider would be ideal, and would leave the OpenAI one as it is.
This could be done by cloning the OpenAI provider and adding in LiteLLM-specific code (including things like this), which is a better experience as well, as there will likely be other LiteLLM-specific things like this - LiteLLM is important enough, I think, to justify this.
OK! I'll add a new provider. Thanks for reviewing.
Adds conditional cache_control functionality to OpenAI provider when model name contains "claude", enabling prompt caching for LiteLLM and other OpenAI-compatible services.
Fixes #3333