feat: Add LiteLLM provider with automatic prompt caching support #3380

Merged
DOsinga merged 6 commits into block:main from HikaruEgashira:litellm-provider
Jul 19, 2025

Conversation

HikaruEgashira (Contributor) commented Jul 12, 2025

Summary

This PR adds a new LiteLLM provider with automatic prompt caching for Anthropic models, addressing the need to optimize cost when using Claude models through LiteLLM services.

  • New LiteLLMProvider: OpenAI-compatible API provider with LiteLLM-specific optimizations
  • Automatic prompt caching: Applies cache_control markers to system messages, last 2 user messages, and last tool definition
  • Model fetching: Retrieves available models from LiteLLM endpoint with proper filtering
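The caching rule in the bullets above (system messages, last two user messages, last tool definition) can be sketched as follows. This is a minimal illustration, not the PR's code: the `Message` and `Tool` structs, the boolean `cache_control` field, and `apply_cache_control` are hypothetical stand-ins for the JSON payload the provider actually manipulates.

```rust
/// Hypothetical, simplified chat message (the real provider edits a JSON payload).
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
    /// True when this message should carry a `cache_control` marker.
    pub cache_control: bool,
}

impl Message {
    pub fn new(role: &str, content: &str) -> Self {
        Self {
            role: role.to_string(),
            content: content.to_string(),
            cache_control: false,
        }
    }
}

/// Hypothetical, simplified tool definition.
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
    pub cache_control: bool,
}

impl Tool {
    pub fn new(name: &str) -> Self {
        Self {
            name: name.to_string(),
            cache_control: false,
        }
    }
}

/// Marks every system message, the last two user messages, and the last tool.
pub fn apply_cache_control(messages: &mut [Message], tools: &mut [Tool]) {
    // All system messages are marked.
    for m in messages.iter_mut().filter(|m| m.role == "system") {
        m.cache_control = true;
    }
    // Walk backwards so `take(2)` selects the two most recent user messages.
    for m in messages
        .iter_mut()
        .rev()
        .filter(|m| m.role == "user")
        .take(2)
    {
        m.cache_control = true;
    }
    // Only the final tool definition is marked.
    if let Some(last) = tools.last_mut() {
        last.cache_control = true;
    }
}
```

With a conversation of one system message followed by user, assistant, user, user, the system message and the last two user messages end up marked, while the first user message and the assistant reply are left untouched.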

Technical Details

  • Architecture: Based on OpenAI provider patterns with LiteLLM-specific enhancements
  • cache_control implementation: Similar to OpenRouter provider (crates/goose/src/providers/openrouter.rs:128-199)

Related to #3333 #3334

extensions:
  developer:
    enabled: true
    name: developer
    timeout: 300
    type: builtin
GOOSE_MODEL: anthropic/claude-sonnet-4-20250514
GOOSE_MODE: auto
LITELLM_HOST: http://localhost:4000
GOOSE_PROVIDER: litellm

extensions:
  developer:
    enabled: true
    name: developer
    timeout: 300
    type: builtin
GOOSE_MODEL: gemini/gemini-2.5-pro
GOOSE_MODE: auto
LITELLM_HOST: http://localhost:4000
GOOSE_PROVIDER: litellm

Cache behavior

@HikaruEgashira HikaruEgashira force-pushed the litellm-provider branch 7 times, most recently from e734276 to 0f13770, July 12, 2025 15:03
.model_name
.starts_with(OPENROUTER_MODEL_PREFIX_ANTHROPIC)
{
if provider.supports_cache_control() {
Contributor Author

fix for consistency
ref #3334 (comment)
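The diff fragment above swaps a model-name prefix check for a call to `provider.supports_cache_control()`. A minimal sketch of that pattern, assuming a trait with a default `false` implementation that individual providers override; the trait shape, the two provider structs, `build_request`, and the "claude" substring test are all hypothetical and chosen only for illustration.

```rust
/// Hypothetical, trimmed-down provider trait.
pub trait Provider {
    fn name(&self) -> &str;

    /// Providers opt in; by default no cache markers are applied.
    fn supports_cache_control(&self) -> bool {
        false
    }
}

/// A provider that never uses cache control (relies on the default).
pub struct PlainProvider;

/// A provider that enables cache control for some models.
pub struct CachingProvider {
    pub model_name: String,
}

impl Provider for PlainProvider {
    fn name(&self) -> &str {
        "plain"
    }
}

impl Provider for CachingProvider {
    fn name(&self) -> &str {
        "caching"
    }

    fn supports_cache_control(&self) -> bool {
        // Assumption for this sketch: only Claude-family models get caching.
        self.model_name.to_lowercase().contains("claude")
    }
}

/// The call site asks the provider instead of inspecting model-name prefixes.
pub fn build_request(provider: &dyn Provider, prompt: &str) -> String {
    if provider.supports_cache_control() {
        format!("{prompt} [cache_control]")
    } else {
        prompt.to_string()
    }
}
```

Keeping the decision behind a trait method means the request-building code does not need to know which provider or model family it is talking to.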

}

#[test]
#[serial_test::serial]
Contributor Author

CI failed after #3260

@HikaruEgashira
Contributor Author

@michaelneale @cgwalters May I request a review?

DOsinga (Collaborator) left a comment

Thanks for doing this

}

/// Check if this provider supports cache control
fn supports_cache_control(&self) -> bool {
Collaborator

I don't think we need this comment (I know we have this sort of thing going on everywhere, but still)

Contributor Author

I included it for consistency with the surrounding functions (e.g. supports_embeddings). This style seems common in base.rs, but I agree with your point.

Collaborator

Yeah, it is, but I'd like to start pushing back on that. I think a lot of it might be LLM-written code; the robots really like extra comments like this.


fn supports_cache_control(&self) -> bool {
if let Ok(models) = tokio::task::block_in_place(|| {
tokio::runtime::Handle::current().block_on(self.fetch_models())
Collaborator

Should we cache this? supports_cache_control sounds innocent enough that I wouldn't expect it to make a network call.
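One way to address the concern above is to memoize the answer so the expensive lookup runs at most once per provider instance. The sketch below uses `std::sync::OnceLock` from the standard library; `CachedCapability`, `expensive_lookup`, and the "claude" substring test are hypothetical, and the lookup is a stand-in that performs no real network I/O. It is not necessarily what the merged code does.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical provider wrapper that remembers its capability check.
pub struct CachedCapability {
    model_name: String,
    /// Filled on first use, then reused for every later call.
    cached: OnceLock<bool>,
    /// Counts how often the expensive lookup actually ran.
    lookups: AtomicUsize,
}

impl CachedCapability {
    pub fn new(model_name: &str) -> Self {
        Self {
            model_name: model_name.to_string(),
            cached: OnceLock::new(),
            lookups: AtomicUsize::new(0),
        }
    }

    /// Stand-in for fetching the model list from the server.
    fn expensive_lookup(&self) -> bool {
        self.lookups.fetch_add(1, Ordering::SeqCst);
        // Assumption for this sketch: Claude-family models support caching.
        self.model_name.contains("claude")
    }

    /// Cheap after the first call: the closure runs only if the cell is empty.
    pub fn supports_cache_control(&self) -> bool {
        *self.cached.get_or_init(|| self.expensive_lookup())
    }

    /// Number of times the expensive lookup has run so far.
    pub fn lookup_count(&self) -> usize {
        self.lookups.load(Ordering::SeqCst)
    }
}
```

A trade-off of this approach is that the first answer is kept for the lifetime of the value, so a transient failure or a later change on the server side would not be picked up without rebuilding the provider.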

@DOsinga DOsinga self-assigned this Jul 16, 2025
@DOsinga
Collaborator

DOsinga commented Jul 16, 2025

If you can resolve the conflict, we can land this!

Signed-off-by: HikaruEgashira <account@egahika.dev>
Signed-off-by: HikaruEgashira <account@egahika.dev>
Signed-off-by: HikaruEgashira <account@egahika.dev>
Signed-off-by: HikaruEgashira <account@egahika.dev>
The test_model_config_context_limit_env_vars test was interfering with
test_model_config_context_limits when run in parallel due to environment
variable side effects. Adding serial_test::serial annotation ensures proper
test isolation.

Signed-off-by: HikaruEgashira <account@egahika.dev>
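The interference described in the commit message above comes from tests sharing process-wide environment variables. The PR fixes it with the `serial_test::serial` attribute; the sketch below shows the same idea using only the standard library, with a global mutex that every env-touching code path takes first. The variable name `EXAMPLE_CONTEXT_LIMIT`, the default of 128000, and all function names are hypothetical.

```rust
use std::env;
use std::sync::Mutex;

/// Process-wide lock: anything that touches the shared variable takes it first.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical variable name, used only in this example.
const LIMIT_VAR: &str = "EXAMPLE_CONTEXT_LIMIT";

/// Reads the limit from the environment, falling back to a default.
pub fn context_limit() -> usize {
    env::var(LIMIT_VAR)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(128_000)
}

/// Runs `f` with the variable set, then removes it, all while holding the lock.
#[allow(unused_unsafe)]
pub fn with_limit<T>(value: &str, f: impl FnOnce() -> T) -> T {
    // Recover the guard even if another holder panicked.
    let _guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    // `set_var` and `remove_var` must be called in `unsafe` in the 2024 edition;
    // the lock is what keeps concurrent callers from racing here.
    unsafe { env::set_var(LIMIT_VAR, value) };
    let out = f();
    unsafe { env::remove_var(LIMIT_VAR) };
    out
}

/// Reads the default under the same lock, so it never sees a half-applied override.
pub fn default_limit() -> usize {
    let _guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    context_limit()
}
```

A test that overrides the variable would wrap its body in `with_limit`, and a test that expects the default would call `default_limit`; because both take the same lock, they cannot overlap even when the test harness runs them on different threads. If `f` panics, this sketch leaves the variable set, which a fuller version would handle with a drop guard.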
@HikaruEgashira
Contributor Author

✔️ Conflict resolved

@DOsinga
Collaborator

DOsinga commented Jul 18, 2025

Ugh, not sure what happened, but it is now failing with a type error. Can you have a look?

Signed-off-by: HikaruEgashira <account@egahika.dev>
@DOsinga DOsinga merged commit 78b30cc into block:main Jul 19, 2025
7 checks passed
jsibbison-square added a commit that referenced this pull request Jul 20, 2025
…ntral-deeplinks

* origin/main: (22 commits)
  feat: deprecate jetbrains extension in favor of public one (#2589)
  feat: Add LiteLLM provider with automatic prompt caching support (#3380)
  docs: update desktop instructions for managing sessions (#3522)
  docs: update desktop instructions for session recipes (#3521)
  Replace mcp_core::content types with rmcp::model types (#3500)
  docs: update desktop instructions for tool perms (#3518)
  docs: update desktop instructions for tool router (#3519)
  Alexhancock/reapply 3491 (#3515)
  docs: update mcp install instructions for desktop (#3504)
  Docs: Access settings in new UI (#3514)
  feat: switch from mcp_core::Role to rmcp::model::Role (#3488)
  Revert "fix the output not being visible issue (#3491)" (#3511)
  fix: Load and Use recipes in new window (#3501)
  fix: working dir was not being set correctly  (#3477)
  Fix launching session in new window (#3497)
  Fix tool call allow still showing initial state in chat after navigating back (#3498)
  feat: add rmcp as a workspace dep (#3483)
  feat: consolidate subagent execution for dynamic tasks (#3444)
  fix token alert indicator/popovers hiding and showing (#3492)
  Fix llm errors not propagating to the ui and auto summarize not starting (#3490)
  ...
cbruyndoncx pushed a commit to cbruyndoncx/goose that referenced this pull request Jul 20, 2025
…ck#3380)

Signed-off-by: HikaruEgashira <account@egahika.dev>
@michaelneale
Collaborator

Nice one, @HikaruEgashira, this seems like the right way.

michaelneale added a commit that referenced this pull request Jul 21, 2025
* main:
  Extension Library Improvements (#3541)
  fix(ui): enable selection of zero-config providers in desktop GUI (#3378)
  refactor: Renames recipe route to recipes to be consistent (#3540)
  Blog: Orchestrating 6 Subagents to Build a Collaborative API Playground (#3528)
  Catch json errors a little better (#3437)
  Rust debug (#3510)
  refactor: Centralise deeplink encode and decode into server (#3489)
  feat: deprecate jetbrains extension in favor of public one (#2589)
  feat: Add LiteLLM provider with automatic prompt caching support (#3380)
  docs: update desktop instructions for managing sessions (#3522)
  docs: update desktop instructions for session recipes (#3521)
  Replace mcp_core::content types with rmcp::model types (#3500)
  docs: update desktop instructions for tool perms (#3518)
  docs: update desktop instructions for tool router (#3519)
  Alexhancock/reapply 3491 (#3515)
  docs: update mcp install instructions for desktop (#3504)
  Docs: Access settings in new UI (#3514)
  feat: switch from mcp_core::Role to rmcp::model::Role (#3488)
  Revert "fix the output not being visible issue (#3491)" (#3511)
  fix: Load and Use recipes in new window (#3501)
@kylecesmat

Hey, following this PR - very exciting! When will this be released? 🙏

@DOsinga
Collaborator

DOsinga commented Jul 30, 2025

We're struggling with some stability issues, but soon!

atarantino pushed a commit to atarantino/goose that referenced this pull request Aug 5, 2025
…ck#3380)

Signed-off-by: HikaruEgashira <account@egahika.dev>
Signed-off-by: Adam Tarantino <tarantino.adam@hey.com>