Conversation

sobychacko
Contributor

…en efficiency

  • Adds cache control support in AnthropicApi and AnthropicChatModel
  • Creates AnthropicCacheType enum with EPHEMERAL cache type
  • Extends AbstractMessage and UserMessage to support cache parameters
  • Updates Usage tracking to include cache-related token metrics (cacheCreationInputTokens, cacheReadInputTokens)
  • Adds integration test to verify prompt caching functionality and token usage patterns
  • Updates ContentBlock to support CacheControl parameter for low-level API usage
  • Adds comprehensive reference documentation with usage examples and best practices

This implementation follows Anthropic's prompt caching API, which allows for more efficient token usage by caching frequently used prompts.
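For reference, the underlying Anthropic API marks a content block as cacheable with a `cache_control` field; a minimal request fragment (model name illustrative) looks like this. The response's `usage` then reports `cache_creation_input_tokens` and `cache_read_input_tokens`, which map onto the Usage fields added in this PR:

```json
{
  "model": "claude-3-5-sonnet-latest",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "<large reusable context, e.g. a long document>",
          "cache_control": { "type": "ephemeral" }
        },
        { "type": "text", "text": "Summarize the document above." }
      ]
    }
  ]
}
```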

Original implementation provided by @Claudio-code (Claudio Silva Junior). See 15e5026

Fixes #1403

@markpollack
Member

Why add this to UserMessage/AbstractMessage? I thought a chat option would be sufficient. I'm concerned about having functionality in the message hierarchy that isn't universal.

@sobychacko sobychacko force-pushed the gh-1403 branch 2 times, most recently from e57caa3 to 2c97c14 Compare August 28, 2025 02:30
…tOptions

- Add cacheControl field to AnthropicChatOptions with builder method
- Create AnthropicCacheType enum with EPHEMERAL type for type-safe cache creation
- Update AnthropicChatModel.createRequest() to apply cache control from options to user message ContentBlocks
- Extend ContentBlock record with cacheControl parameter and constructor for API compatibility
- Update Usage record to include cacheCreationInputTokens and cacheReadInputTokens fields
- Update StreamHelper to handle new Usage constructor with cache token parameters
- Add AnthropicApiIT.chatWithPromptCache() test for low-level API validation
- Add AnthropicChatModelIT.chatWithPromptCacheViaOptions() integration test
- Add comprehensive unit tests for AnthropicChatOptions cache control functionality
- Update documentation with cacheControl() method examples and usage patterns

Cache control is configured through AnthropicChatOptions rather than message classes
to maintain provider portability. The cache control gets applied during request creation
in AnthropicChatModel when building ContentBlocks for user messages.
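Based on the commit description above, configuring caching through the options might look like the following fragment. This is a sketch only: the builder method `cacheControl(...)` and the `AnthropicCacheType.EPHEMERAL` enum are taken from this PR's commit message and may differ in the merged API; it also assumes `spring-ai-anthropic` is on the classpath, so it is not runnable standalone.

```java
// Sketch only: method names are assumptions based on this PR's commit message.
AnthropicChatOptions options = AnthropicChatOptions.builder()
    .model("claude-3-5-sonnet-latest")
    // Mark eligible user-message content blocks for ephemeral caching.
    .cacheControl(AnthropicCacheType.EPHEMERAL.cacheControl())
    .build();

ChatResponse response = chatModel.call(
    new Prompt("...large, frequently reused context...", options));
```

Keeping this on `AnthropicChatOptions` (rather than on the message classes) addresses the portability concern raised above: messages stay provider-neutral, and the Anthropic-specific cache control is applied only when `AnthropicChatModel` builds the request.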

Original implementation provided by @Claudio-code (Claudio Silva Junior)
See spring-projects@15e5026

Fixes spring-projects#1403

Signed-off-by: Soby Chacko <soby.chacko@broadcom.com>
@adase11

adase11 commented Sep 1, 2025

According to their request validation, Anthropic's API only supports a maximum of 4 cache blocks. I got this error with more than 4: "A maximum of 4 blocks with cache_control may be provided. Found 5."

System messages can be cached as well, by changing `system` from a String to a List<ContentBlock> and then applying the cache-control logic when building up the system variable for the ChatCompletionRequest. I'm happy to demonstrate what these changes would look like.
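In the underlying Anthropic API, the `system` field accepts either a plain string or an array of content blocks, which is what makes the change above possible; caching a system prompt looks like this fragment:

```json
{
  "system": [
    {
      "type": "text",
      "text": "<long system instructions or reference material>",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    { "role": "user", "content": "First question about the material." }
  ]
}
```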

@adase11

adase11 commented Sep 3, 2025

@sobychacko - I added a pull request to your branch that takes care of the above issue while also making the caching using AnthropicChatOptions more robust.

It allows fine-grained configuration of Anthropic Prompt Caching on outgoing chat requests.

I added:

  • Configure how many prompt segments ("cache blocks") to cache. (Anthropic currently allows a maximum of 4; users can choose fewer if they wish.)
  • Control which message types are eligible for caching, so it can be selected per use case — for example, my use case required SYSTEM and TOOL but not USER or ASSISTANT.
  • Choose a cache type per message type (e.g., EPHEMERAL, EPHEMERAL_1H). Anthropic supports this now, and it also came in handy for my use case.
  • Set a global minimum content length for caching and override it per message type, letting users control which messages compete for the maximum of 4 cache blocks.
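The selection logic these options describe can be modeled in plain Java as follows. This is a hypothetical, self-contained sketch, not the code from the linked PR: the names `CacheCandidate` and `selectCacheable` are illustrative, and only the max-of-4 limit and the eligibility/min-length filters come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

public class CacheBlockSelector {

    /** A prompt segment that could be cached: its message type and text length. */
    record CacheCandidate(String messageType, int contentLength) {}

    static final int MAX_CACHE_BLOCKS = 4; // Anthropic rejects requests with more

    /**
     * Returns the indices of candidates to mark with cache_control:
     * only eligible message types, only segments at or above minLength,
     * and never more than MAX_CACHE_BLOCKS in total.
     */
    static List<Integer> selectCacheable(List<CacheCandidate> candidates,
                                         List<String> eligibleTypes,
                                         int minLength) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < candidates.size() && selected.size() < MAX_CACHE_BLOCKS; i++) {
            CacheCandidate c = candidates.get(i);
            if (eligibleTypes.contains(c.messageType()) && c.contentLength() >= minLength) {
                selected.add(i);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<CacheCandidate> prompt = List.of(
                new CacheCandidate("SYSTEM", 5000),
                new CacheCandidate("USER", 120),
                new CacheCandidate("TOOL", 3000),
                new CacheCandidate("TOOL", 2800),
                new CacheCandidate("TOOL", 2600),
                new CacheCandidate("TOOL", 2400));
        // Cache SYSTEM and TOOL segments of at least 1024 chars; cap at 4 blocks.
        System.out.println(selectCacheable(prompt, List.of("SYSTEM", "TOOL"), 1024));
        // → [0, 2, 3, 4]  (the last TOOL segment is dropped by the 4-block cap)
    }
}
```

Capping selection at request-build time, as sketched here, avoids the server-side error quoted earlier ("A maximum of 4 blocks with cache_control may be provided.").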

Closes: Add support to anthropic prompt caching (#1403)