feat: Add LiteLLM provider with automatic prompt caching support #3380
DOsinga merged 6 commits into block:main
Conversation
force-pushed from e734276 to 0f13770
```rust
    .model_name
    .starts_with(OPENROUTER_MODEL_PREFIX_ANTHROPIC)
{
    if provider.supports_cache_control() {
}
```
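Read in context, this hunk appears to swap a hardcoded model-name prefix check for the new capability query. A hedged reconstruction of the before/after shape; the types, the prefix value, and the helper functions are assumptions, and only `OPENROUTER_MODEL_PREFIX_ANTHROPIC` and `supports_cache_control` come from the diff itself:

```rust
// Assumed stand-ins for the real goose types.
const OPENROUTER_MODEL_PREFIX_ANTHROPIC: &str = "anthropic/";

struct ModelConfig {
    model_name: String,
}

trait Provider {
    fn supports_cache_control(&self) -> bool;
}

// Before: caching was gated on the model name alone.
fn should_cache_before(model_config: &ModelConfig) -> bool {
    model_config
        .model_name
        .starts_with(OPENROUTER_MODEL_PREFIX_ANTHROPIC)
}

// After: the provider reports its own capability, so the same call site
// serves both OpenRouter and the new LiteLLM provider.
fn should_cache_after(provider: &dyn Provider) -> bool {
    provider.supports_cache_control()
}
```

Moving the decision behind the trait keeps provider-specific knowledge out of the request-building code.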
```rust
#[test]
#[serial_test::serial]
```
@michaelneale @cgwalters May I request a review?
```rust
/// Check if this provider supports cache control
fn supports_cache_control(&self) -> bool {
```
Reviewer: I don't think we need this comment (I know we have this sort of thing going on everywhere, but still).

Author: I included it to maintain consistency with the surrounding functions (e.g. supports_embeddings); this notation seems common in base.rs. But I agree with your thought.

Reviewer: Yeah, it is, but I'd like to start pushing back on that. I think a lot of it might be LLM-written code; the robots really like extra comments like this.
```rust
fn supports_cache_control(&self) -> bool {
    if let Ok(models) = tokio::task::block_in_place(|| {
        tokio::runtime::Handle::current().block_on(self.fetch_models())
```
Reviewer: Should we cache this? supports_cache_control sounds innocent enough; I wouldn't expect it to do a network call.
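One way to address this, sketched here as an assumption rather than the PR's actual fix, is to memoize the probe in a per-provider OnceLock so only the first call pays for the network round trip. The fetch_models stub and the claude predicate below are placeholders:

```rust
use std::sync::OnceLock;

struct LiteLLMProvider {
    // Hypothetical field: caches the capability probe for this instance.
    cache_control_supported: OnceLock<bool>,
}

impl LiteLLMProvider {
    // Stand-in for the provider's real model-list fetch.
    async fn fetch_models(&self) -> Result<Vec<String>, std::io::Error> {
        Ok(vec!["claude-sonnet-4".to_string()])
    }

    fn supports_cache_control(&self) -> bool {
        *self.cache_control_supported.get_or_init(|| {
            tokio::task::block_in_place(|| {
                tokio::runtime::Handle::current().block_on(self.fetch_models())
            })
            // Placeholder predicate for "some fetched model supports caching".
            .map(|models| models.iter().any(|m| m.contains("claude")))
            .unwrap_or(false)
        })
    }
}
```

Note that tokio::task::block_in_place panics on a current-thread runtime, so an async supports_cache_control (or probing once at construction) may be the cleaner shape.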
If you can resolve the conflict, we can land this!
Signed-off-by: HikaruEgashira <account@egahika.dev>
The test_model_config_context_limit_env_vars test was interfering with test_model_config_context_limits when run in parallel due to environment variable side effects. Adding the serial_test::serial annotation ensures proper test isolation.
Signed-off-by: HikaruEgashira <account@egahika.dev>
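For illustration, here is a minimal sketch of the interference and the fix; only the test names come from the commit, while the environment variable and the test bodies are hypothetical:

```rust
// Both tests mutate process-wide environment variables, so running them in
// parallel lets one test observe the other's leftover state.
#[cfg(test)]
mod tests {
    #[test]
    #[serial_test::serial] // runs exclusively, never overlapping other #[serial] tests
    fn test_model_config_context_limit_env_vars() {
        std::env::set_var("GOOSE_CONTEXT_LIMIT", "128000"); // hypothetical var name
        // ... assert the config picks up the override ...
        std::env::remove_var("GOOSE_CONTEXT_LIMIT");
    }

    #[test]
    #[serial_test::serial]
    fn test_model_config_context_limits() {
        // Runs only after the test above releases the serial lock, so it
        // never sees the temporary env var.
    }
}
```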
force-pushed from a4c85b5 to 535add4

✔️ Conflict resolved

Ugh, not sure what happened, but it is failing now with a type error. Can you have a look?
force-pushed from a85d53c to 3f0d51c
Signed-off-by: HikaruEgashira <account@egahika.dev>
force-pushed from 3f0d51c to adf1fa8
…ntral-deeplinks

* origin/main: (22 commits)
  feat: deprecate jetbrains extension in favor of public one (#2589)
  feat: Add LiteLLM provider with automatic prompt caching support (#3380)
  docs: update desktop instructions for managing sessions (#3522)
  docs: update desktop instructions for session recipes (#3521)
  Replace mcp_core::content types with rmcp::model types (#3500)
  docs: update desktop instructions for tool perms (#3518)
  docs: update desktop instructions for tool router (#3519)
  Alexhancock/reapply 3491 (#3515)
  docs: update mcp install instructions for desktop (#3504)
  Docs: Access settings in new UI (#3514)
  feat: switch from mcp_core::Role to rmcp::model::Role (#3488)
  Revert "fix the output not being visible issue (#3491)" (#3511)
  fix: Load and Use recipes in new window (#3501)
  fix: working dir was not being set correctly (#3477)
  Fix launching session in new window (#3497)
  Fix tool call allow still showing initial state in chat after navigating back (#3498)
  feat: add rmcp as a workspace dep (#3483)
  feat: consolidate subagent execution for dynamic tasks (#3444)
  fix token alert indicator/popovers hiding and showing (#3492)
  Fix llm errors not propagating to the ui and auto summarize not starting (#3490)
  ...
…ck#3380) Signed-off-by: HikaruEgashira <account@egahika.dev>
Nice one @HikaruEgashira, this seems the right way.
* main:
  Extension Library Improvements (#3541)
  fix(ui): enable selection of zero-config providers in desktop GUI (#3378)
  refactor: Renames recipe route to recipes to be consistent (#3540)
  Blog: Orchestrating 6 Subagents to Build a Collaborative API Playground (#3528)
  Catch json errors a little better (#3437)
  Rust debug (#3510)
  refactor: Centralise deeplink encode and decode into server (#3489)
  …plus the commits from origin/main already listed above
Hey, following this PR - very exciting! When will this be released? 🙏

We're struggling with some stability issues, but soon!
…ck#3380)
Signed-off-by: HikaruEgashira <account@egahika.dev>
Signed-off-by: Adam Tarantino <tarantino.adam@hey.com>
Summary
This PR adds a new LiteLLM provider with automatic prompt caching for Anthropic models, addressing cost optimization when using Claude models through LiteLLM services.
- Automatically adds cache_control markers to system messages, the last 2 user messages, and the last tool definition (a sketch of this marker placement appears at the end of this description)

Technical Details
- Caching logic mirrors the existing OpenRouter implementation (crates/goose/src/providers/openrouter.rs:128-199)

Related to #3333, #3334
cache behavior
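To make the marker placement concrete, here is a hedged sketch of the scheme the summary describes. The JSON shape follows Anthropic's published cache_control convention, while the function and the payload layout are illustrative rather than the PR's actual code:

```rust
use serde_json::{json, Value};

// Illustrative only: mark the system prompt, the last two user messages,
// and the last tool definition with Anthropic-style cache_control markers.
fn apply_prompt_caching(payload: &mut Value) {
    // System blocks: append a cache_control marker to the final block.
    if let Some(system) = payload["system"].as_array_mut() {
        if let Some(last) = system.last_mut() {
            last["cache_control"] = json!({ "type": "ephemeral" });
        }
    }

    // Last two user messages.
    if let Some(messages) = payload["messages"].as_array_mut() {
        let user_idxs: Vec<usize> = messages
            .iter()
            .enumerate()
            .filter(|(_, m)| m["role"] == "user")
            .map(|(i, _)| i)
            .collect();
        for &i in user_idxs.iter().rev().take(2) {
            messages[i]["cache_control"] = json!({ "type": "ephemeral" });
        }
    }

    // Last tool definition.
    if let Some(tools) = payload["tools"].as_array_mut() {
        if let Some(last) = tools.last_mut() {
            last["cache_control"] = json!({ "type": "ephemeral" });
        }
    }
}
```

In LiteLLM's OpenAI-compatible passthrough the markers typically live on message content blocks rather than on the message object itself, so the real placement may differ from this sketch.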