feat: Add LiteLLM provider with automatic prompt caching support by HikaruEgashira · Pull Request #3380 · block/goose

HikaruEgashira · 2025-07-12T12:32:00Z

Summary

This PR adds a new LiteLLM provider with automatic prompt caching functionality for Anthropic models, addressing the cost optimization needs when using Claude models through LiteLLM services.

New LiteLLMProvider: OpenAI-compatible API provider with LiteLLM-specific optimizations
Automatic prompt caching: Applies cache_control markers to system messages, last 2 user messages, and last tool definition
Model fetching: Retrieves available models from LiteLLM endpoint with proper filtering
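
To make the marker placement concrete, here is a minimal std-only sketch of the selection rule described above (system messages, last 2 user messages, last tool definition). The `Message` and `Tool` structs, the `message` helper, and `apply_cache_control` are hypothetical names for illustration; the provider in this PR works on JSON request payloads, and its actual code may differ.

```rust
/// Simplified chat message (hypothetical; the real provider edits JSON payloads).
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    text: String,
    /// True when a cache_control marker would be attached to this message.
    cache_control: bool,
}

/// Simplified tool definition (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Tool {
    name: String,
    cache_control: bool,
}

/// Convenience constructor for an unmarked message.
fn message(role: &str, text: &str) -> Message {
    Message {
        role: role.to_string(),
        text: text.to_string(),
        cache_control: false,
    }
}

/// Marks every system message, the last two user messages, and the last tool.
fn apply_cache_control(messages: &mut [Message], tools: &mut [Tool]) {
    let mut user_marked = 0;
    // Walk backwards so the most recent user messages are seen first.
    for msg in messages.iter_mut().rev() {
        match msg.role.as_str() {
            "system" => msg.cache_control = true,
            "user" if user_marked < 2 => {
                msg.cache_control = true;
                user_marked += 1;
            }
            _ => {}
        }
    }
    // Only the final tool definition carries a marker.
    if let Some(tool) = tools.last_mut() {
        tool.cache_control = true;
    }
}
```

With a conversation of one system message followed by user, assistant, user, user, the sketch marks the system message and the two most recent user messages and leaves the first user message and the assistant message untouched.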

Technical Details

Architecture: Based on OpenAI provider patterns with LiteLLM-specific enhancements
cache_control implementation: Similar to OpenRouter provider (crates/goose/src/providers/openrouter.rs:128-199)

Related to #3333 #3334

extensions:
  developer:
    enabled: true
    name: developer
    timeout: 300
    type: builtin
GOOSE_MODEL: anthropic/claude-sonnet-4-20250514
GOOSE_MODE: auto
LITELLM_HOST: http://localhost:4000
GOOSE_PROVIDER: litellm

extensions:
  developer:
    enabled: true
    name: developer
    timeout: 300
    type: builtin
GOOSE_MODEL: gemini/gemini-2.5-pro
GOOSE_MODE: auto
LITELLM_HOST: http://localhost:4000
GOOSE_PROVIDER: litellm

cache behavior

HikaruEgashira · 2025-07-12T15:07:59Z

crates/goose/src/providers/openrouter.rs

-        .model_name
-        .starts_with(OPENROUTER_MODEL_PREFIX_ANTHROPIC)
-    {
+    if provider.supports_cache_control() {


fix for consistency
ref #3334 (comment)
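
For readers skimming the diff, a rough sketch of the pattern this change moves toward: the call site asks the provider whether it supports cache control instead of inspecting the model name itself. The trait shape, the `false` default, both provider structs, and the `"anthropic/"` prefix rule are assumptions made for illustration, not the code in base.rs or openrouter.rs.

```rust
/// Hypothetical stand-in for the provider trait.
trait Provider {
    fn model_name(&self) -> &str;

    /// Assumed default: providers opt in by overriding this method.
    fn supports_cache_control(&self) -> bool {
        false
    }
}

/// A provider that never overrides the default.
struct PlainProvider {
    model: String,
}

/// A provider that opts in for Anthropic-style model names.
struct AnthropicAwareProvider {
    model: String,
}

impl Provider for PlainProvider {
    fn model_name(&self) -> &str {
        &self.model
    }
}

impl Provider for AnthropicAwareProvider {
    fn model_name(&self) -> &str {
        &self.model
    }

    fn supports_cache_control(&self) -> bool {
        // Illustrative rule only; the real check lives in each provider.
        self.model.starts_with("anthropic/")
    }
}

/// The request builder delegates the decision to the provider.
fn should_add_cache_markers(provider: &dyn Provider) -> bool {
    provider.supports_cache_control()
}
```

The benefit is that the prefix logic lives in one place per provider, so the request-building code stays the same across OpenRouter, LiteLLM, and any later provider.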

HikaruEgashira · 2025-07-12T15:31:31Z

crates/goose/src/model.rs

    }

    #[test]
+    #[serial_test::serial]


CI failed after #3260

HikaruEgashira · 2025-07-12T15:32:25Z

@michaelneale @cgwalters May I request a review?

DOsinga

Thanks for doing this

DOsinga · 2025-07-16T08:50:20Z

crates/goose/src/providers/base.rs

    }

+    /// Check if this provider supports cache control
+    fn supports_cache_control(&self) -> bool {


I don't think we need this comment (I know we have this sort of thing going on everywhere, but still).

I included it to maintain consistency with the surrounding functions (e.g. supports_embeddings). This notation seems common in base.rs, but I agree with your point.

Yeah, it is, but I'd like to start pushing back on that. I think a lot of it might be LLM-written code; the robots really like extra comments like this.

DOsinga · 2025-07-16T09:00:30Z

crates/goose/src/providers/litellm.rs

+
+    fn supports_cache_control(&self) -> bool {
+        if let Ok(models) = tokio::task::block_in_place(|| {
+            tokio::runtime::Handle::current().block_on(self.fetch_models())


Should we cache this? supports_cache_control sounds innocent enough that I wouldn't expect it to make a network call.
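
One possible shape for the caching suggested here, sketched with `std::sync::OnceLock` so the lookup runs at most once per provider instance. `LiteLlmLike`, `lookup_capability`, and the `"anthropic/"` rule are hypothetical stand-ins; the real `fetch_models` call is async and goes over the network, which this sketch replaces with a counter so the caching effect is visible.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::OnceLock;

/// Hypothetical provider that memoizes its capability check.
struct LiteLlmLike {
    model: String,
    /// Filled on the first call, then reused.
    cache_control_support: OnceLock<bool>,
    /// Counts how often the expensive lookup actually ran.
    lookups: AtomicU32,
}

impl LiteLlmLike {
    fn new(model: &str) -> Self {
        LiteLlmLike {
            model: model.to_string(),
            cache_control_support: OnceLock::new(),
            lookups: AtomicU32::new(0),
        }
    }

    /// Stand-in for the network call to the model listing endpoint.
    fn lookup_capability(&self) -> bool {
        self.lookups.fetch_add(1, Ordering::SeqCst);
        // Illustrative rule only.
        self.model.starts_with("anthropic/")
    }

    /// Cheap after the first call: the result is read from the OnceLock.
    fn supports_cache_control(&self) -> bool {
        *self
            .cache_control_support
            .get_or_init(|| self.lookup_capability())
    }

    fn lookup_count(&self) -> u32 {
        self.lookups.load(Ordering::SeqCst)
    }
}
```

A trade-off worth noting: a value cached this way never refreshes, so a model list that changes while the process is running would not be picked up without a new provider instance.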

DOsinga · 2025-07-16T16:00:13Z

if you can resolve the conflict we can land this!

The test_model_config_context_limit_env_vars test was interfering with test_model_config_context_limits when run in parallel due to environment variable side effects. Adding serial_test::serial annotation ensures proper test isolation. Signed-off-by: HikaruEgashira <account@egahika.dev>
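
To illustrate why tests that touch environment variables need serializing, here is a std-only sketch that guards the mutation with a process-wide mutex. This is a swapped-in technique, not what the PR does: the PR uses the `#[serial_test::serial]` attribute. The variable name `EXAMPLE_CONTEXT_LIMIT` and both helper functions are hypothetical.

```rust
use std::env;
use std::sync::Mutex;

/// Process-wide lock: only one caller may mutate the environment at a time.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Sets `key` to `value`, runs `body`, then restores the previous state.
/// Note: if `body` panics, the variable is not restored.
fn with_env_var<T>(key: &str, value: &str, body: impl FnOnce() -> T) -> T {
    // Recover the guard even if an earlier holder panicked.
    let _guard = ENV_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    let previous = env::var(key).ok();
    // Environment mutation is process-global; newer Rust editions mark it unsafe.
    unsafe { env::set_var(key, value) };
    let result = body();
    match previous {
        Some(old) => unsafe { env::set_var(key, old) },
        None => unsafe { env::remove_var(key) },
    }
    result
}

/// Reads the hypothetical limit variable, ignoring unparsable values.
fn context_limit_from_env() -> Option<usize> {
    env::var("EXAMPLE_CONTEXT_LIMIT").ok()?.parse().ok()
}
```

Without such a guard, a test that sets the variable and a test that expects it to be unset can interleave when the test runner executes them on separate threads, which matches the interference described in the commit message.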

HikaruEgashira · 2025-07-16T17:10:46Z

✔️ Conflict resolved

DOsinga · 2025-07-18T10:23:41Z

ugh, not sure what happened, but it is failing now with a type error. can you have a look?

michaelneale · 2025-07-21T00:50:39Z

nice one @HikaruEgashira this seems the right way

kylecesmat · 2025-07-30T19:32:15Z

Hey, following this PR - very exciting! When will this be released? 🙏

DOsinga · 2025-07-30T19:42:46Z

we're struggling with some stability issues. but soon!

HikaruEgashira force-pushed the litellm-provider branch 7 times, most recently from e734276 to 0f13770 Compare July 12, 2025 15:03

DOsinga approved these changes Jul 16, 2025

DOsinga self-assigned this Jul 16, 2025

HikaruEgashira added 5 commits July 17, 2025 02:05

feat: create litellm provider

65f3fde

Signed-off-by: HikaruEgashira <account@egahika.dev>

feat: add cache_control info

8659d40

Signed-off-by: HikaruEgashira <account@egahika.dev>

feat: use supports_cache_control

2258308

Signed-off-by: HikaruEgashira <account@egahika.dev>

fix: lint

a6627b2

Signed-off-by: HikaruEgashira <account@egahika.dev>

HikaruEgashira force-pushed the litellm-provider branch from a4c85b5 to 535add4 Compare July 16, 2025 17:07

HikaruEgashira force-pushed the litellm-provider branch from a85d53c to 3f0d51c Compare July 18, 2025 14:19

fix: correct get_usage function call in litellm provider

adf1fa8

Signed-off-by: HikaruEgashira <account@egahika.dev>

HikaruEgashira force-pushed the litellm-provider branch from 3f0d51c to adf1fa8 Compare July 18, 2025 14:22

DOsinga merged commit 78b30cc into block:main Jul 19, 2025
7 checks passed

cbruyndoncx pushed a commit to cbruyndoncx/goose that referenced this pull request Jul 20, 2025

feat: Add LiteLLM provider with automatic prompt caching support (blo…

08a0364

…ck#3380) Signed-off-by: HikaruEgashira <account@egahika.dev>

HikaruEgashira mentioned this pull request Jul 23, 2025

Add cache_control support to OpenAI provider for Claude models #3333

Closed

1 task

atarantino pushed a commit to atarantino/goose that referenced this pull request Aug 5, 2025

feat: Add LiteLLM provider with automatic prompt caching support (blo…

7fff703

…ck#3380) Signed-off-by: HikaruEgashira <account@egahika.dev> Signed-off-by: Adam Tarantino <tarantino.adam@hey.com>
