M21-P1: Anthropic prompt caching for ClaudeProvider #337

@bug-ops

Description

Parent: #336

Summary

Enable Anthropic prompt caching by sending the system prompt as structured content blocks with cache_control markers. On cache hits this is expected to cut billed input tokens by 80-90%.

Requirements

  1. Add anthropic-beta: prompt-caching-2024-07-31 header to all Claude API requests
  2. Convert system field from Option<&str> to Option<Vec<SystemContentBlock>> in request bodies
  3. Split system prompt into cacheable blocks:
    • Block 1 (cached): base prompt + active skills
    • Block 2 (cached): tool catalog + environment context
    • Block 3 (not cached): project configs, repo map, MCP prompt
  4. Inject section markers in rebuild_system_prompt for splitting
  5. Parse usage.cache_read_input_tokens from responses
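A minimal sketch of requirements 2-4. The struct shape, the marker string, and the helper names are assumptions for illustration (the real code in crates/zeph-llm/src/claude.rs would serialize via its existing request types); only the three-way split, the "cache everything except the trailing section" rule, and the cache_control field come from this issue:

```rust
/// Marker injected by rebuild_system_prompt between sections.
/// The exact marker string is an assumption, not specified here.
const SECTION_MARKER: &str = "\n<<<SECTION>>>\n";

/// One entry of the structured `system` array (requirement 2).
struct SystemContentBlock {
    text: String,
    cached: bool,
}

impl SystemContentBlock {
    /// Serialize as an Anthropic text content block, attaching an
    /// "ephemeral" cache_control object when the block is cacheable.
    fn to_json(&self) -> String {
        let text = self
            .text
            .replace('\\', "\\\\")
            .replace('"', "\\\"")
            .replace('\n', "\\n");
        if self.cached {
            format!(r#"{{"type":"text","text":"{text}","cache_control":{{"type":"ephemeral"}}}}"#)
        } else {
            format!(r#"{{"type":"text","text":"{text}"}}"#)
        }
    }
}

/// Requirement 3/4: split the flat system prompt on the injected
/// markers. The first two sections (base prompt + skills, tool catalog
/// + environment) are cached; the trailing section (project configs,
/// repo map, MCP prompt) is not.
fn split_system_prompt(prompt: &str) -> Vec<SystemContentBlock> {
    prompt
        .split(SECTION_MARKER)
        .enumerate()
        .map(|(i, section)| SystemContentBlock {
            text: section.to_string(),
            cached: i < 2,
        })
        .collect()
}
```

Hand-rolling to_json keeps the sketch dependency-free; the real provider would derive Serialize with serde's skip_serializing_if on the cache_control field instead.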

Acceptance Criteria

  • All Claude requests include anthropic-beta header
  • System prompt sent as structured content blocks with cache_control
  • Cache hit rate visible in metrics/logs
  • Existing tests pass, no trait-level breaking changes
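For the "cache hit rate visible in metrics/logs" criterion, a sketch of the usage accounting, assuming (as the Anthropic API documents) that input_tokens counts only the uncached portion of the prompt, so the full prompt size is the sum of both fields; the struct and method names are illustrative:

```rust
/// Token usage from an Anthropic response. With the prompt-caching
/// beta, `usage` also carries cache_read_input_tokens; it is optional
/// because responses without cache activity may omit it.
struct Usage {
    input_tokens: u64,
    cache_read_input_tokens: Option<u64>,
}

impl Usage {
    /// Fraction of prompt tokens served from cache, suitable for a
    /// metrics gauge or a per-request log line.
    fn cache_hit_ratio(&self) -> f64 {
        let cached = self.cache_read_input_tokens.unwrap_or(0);
        let total = self.input_tokens + cached;
        if total == 0 { 0.0 } else { cached as f64 / total as f64 }
    }
}
```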

Files

  • crates/zeph-llm/src/claude.rs
  • crates/zeph-core/src/agent/context.rs

Metadata

Labels

  • enhancement (New feature or request)
  • llm (LLM provider related)
