  * system → messages.
  *
  * @author Mark Pollack
+ * @author Soby Chacko
  * @since 1.1.0
  */
 public enum AnthropicCacheStrategy {

 	/**
-	 * No caching (default behavior).
+	 * No caching (default behavior). All content is processed fresh on each request.
+	 * <p>
+	 * Use this when:
+	 * <ul>
+	 * <li>Requests are one-off or highly variable</li>
+	 * <li>Content doesn't meet the minimum token requirement (1024+ tokens)</li>
+	 * <li>You want to avoid caching overhead</li>
+	 * </ul>
 	 */
 	NONE,

+	/**
+	 * Cache tool definitions only. Places a cache breakpoint on the last tool, while
+	 * system messages and conversation history remain uncached and are processed fresh
+	 * on each request.
+	 * <p>
+	 * Use this when:
+	 * <ul>
+	 * <li>Tool definitions are large and stable (5000+ tokens)</li>
+	 * <li>System prompts change frequently or are small (&lt;500 tokens)</li>
+	 * <li>You want to share cached tools across different system contexts (e.g.,
+	 * multi-tenant applications, A/B testing system prompts)</li>
+	 * <li>Tool definitions rarely change</li>
+	 * </ul>
+	 * <p>
+	 * <strong>Important:</strong> Changing any tool definition will invalidate this
+	 * cache entry. Due to Anthropic's cascade invalidation, tool changes will also
+	 * invalidate any downstream cache breakpoints (system, messages) if used in
+	 * combination with other strategies.
+	 */
+	TOOLS_ONLY,
+
 	/**
 	 * Cache system instructions only. Places a cache breakpoint on the system message
-	 * content.
+	 * content. Tools are cached implicitly via Anthropic's automatic ~20-block lookback
+	 * mechanism (content before the cache breakpoint is included in the cache).
+	 * <p>
+	 * Use this when:
+	 * <ul>
+	 * <li>System prompts are large and stable (1024+ tokens)</li>
+	 * <li>Tool definitions are relatively small (&lt;20 tools)</li>
+	 * <li>You want simple, single-breakpoint caching</li>
+	 * </ul>
+	 * <p>
+	 * <strong>Note:</strong> Changing tools will invalidate the cache, since tools are
+	 * part of the cache prefix (they appear before system in the request hierarchy).
 	 */
 	SYSTEM_ONLY,

 	/**
 	 * Cache system instructions and tool definitions. Places cache breakpoints on the
-	 * last tool and system message content.
+	 * last tool (breakpoint 1) and the system message content (breakpoint 2).
+	 * <p>
+	 * Use this when:
+	 * <ul>
+	 * <li>Both tools and system prompts are large and stable</li>
+	 * <li>You have many tools (20+, beyond the automatic lookback window)</li>
+	 * <li>You want deterministic, explicit caching of both components</li>
+	 * <li>System prompts may change independently of tools</li>
+	 * </ul>
+	 * <p>
+	 * <strong>Behavior:</strong>
+	 * <ul>
+	 * <li>If only tools change: both caches are invalidated (tools + system)</li>
+	 * <li>If only system changes: the tools cache remains valid; the system cache is
+	 * invalidated</li>
+	 * </ul>
+	 * This allows efficient reuse of the tool cache when only system prompts are
+	 * updated.
 	 */
 	SYSTEM_AND_TOOLS,

 	/**
 	 * Cache the entire conversation history up to (but not including) the current user
-	 * question. This is ideal for multi-turn conversations where you want to reuse the
-	 * conversation context while asking new questions.
+	 * question. Places a cache breakpoint on the last user message in the conversation
+	 * history, enabling incremental caching as the conversation grows.
+	 * <p>
+	 * Use this when:
+	 * <ul>
+	 * <li>Building multi-turn conversational applications (chatbots, assistants)</li>
+	 * <li>Conversation history is large and grows over time</li>
+	 * <li>You want to reuse conversation context while asking new questions</li>
+	 * <li>Using chat memory advisors or conversation persistence</li>
+	 * </ul>
+	 * <p>
+	 * <strong>Behavior:</strong> Each turn builds on the previous cached prefix. The
+	 * cache grows incrementally: request 1 caches [Message1], request 2 caches
+	 * [Message1 + Message2], and so on. This provides significant cost savings (90%+)
+	 * and performance improvements for long conversations.
+	 * <p>
+	 * <strong>Important:</strong> Changing tools or system prompts will invalidate the
+	 * entire conversation cache due to cascade invalidation. Tool and system stability
+	 * is critical for this strategy.
 	 */
 	CONVERSATION_HISTORY

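The cascade-invalidation rule the Javadoc describes (the cache prefix follows the request hierarchy tools → system → messages, so changing a component invalidates its own breakpoint and every breakpoint after it) can be sketched as a small standalone example. `CacheCascadeDemo`, `PREFIX_ORDER`, and `invalidated` are illustrative names for this sketch, not Spring AI or Anthropic API:

```java
import java.util.List;

// Minimal sketch of Anthropic's cascade invalidation: cache breakpoints form a
// prefix hierarchy (tools -> system -> messages), so modifying one component
// invalidates its breakpoint plus every breakpoint downstream of it.
public class CacheCascadeDemo {

	// Order of the cache prefix, matching the request hierarchy.
	static final List<String> PREFIX_ORDER = List.of("tools", "system", "messages");

	// Returns the breakpoints invalidated when `changed` is modified:
	// the changed component itself and everything after it in the prefix.
	static List<String> invalidated(String changed) {
		int idx = PREFIX_ORDER.indexOf(changed);
		return PREFIX_ORDER.subList(idx, PREFIX_ORDER.size());
	}

	public static void main(String[] args) {
		// Changing tools invalidates every breakpoint.
		System.out.println(invalidated("tools"));   // [tools, system, messages]
		// Changing only the system prompt leaves the tools cache valid.
		System.out.println(invalidated("system"));  // [system, messages]
	}
}
```

This is why `SYSTEM_AND_TOOLS` can reuse the tool cache when only the system prompt changes, while `CONVERSATION_HISTORY` depends on both tools and system staying stable.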