M21: Token optimization -- prompt caching, local summarization, context pruning #341

Merged
bug-ops merged 7 commits into main from feat/m21/token-optimization
Feb 15, 2026
@bug-ops bug-ops commented Feb 15, 2026

Summary

Estimated input token reduction: 80-90% from prompt caching alone.

Test plan

  • cargo nextest: 1408 tests passed
  • cargo clippy: zero warnings
  • Security audit: PASS (no new attack vectors)
  • Code review: APPROVE
  • Manual verification of cache hit rates via API usage dashboard

Closes #336, closes #337, closes #338, closes #339, closes #340

Split system prompt into structured content blocks with cache_control
markers. Add anthropic-beta: prompt-caching-2024-07-31 header to all
requests. Parse API usage response for cache hit tracking.

Closes #337
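A minimal sketch of the content-block split described in this commit, not the code from crates/zeph-llm/src/claude.rs. `SystemBlock`, `split_system_prompt`, and `system_to_json` are hypothetical names, and caching only the stable prefix is an assumption. The JSON shape (`type: "text"` blocks with `cache_control: {"type": "ephemeral"}`) follows the Anthropic Messages API; serialization is done by hand here only to keep the sketch free of dependencies.

```rust
/// One block of the system prompt (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct SystemBlock {
    pub text: String,
    /// When true, the block is serialized with an ephemeral cache_control marker.
    pub cache: bool,
}

/// Beta header named in the commit message, as a (name, value) pair.
pub const CACHE_BETA_HEADER: (&str, &str) = ("anthropic-beta", "prompt-caching-2024-07-31");

/// Assumption: the stable prefix is marked for caching and the volatile
/// suffix is left unmarked. Empty parts are skipped.
pub fn split_system_prompt(stable: &str, volatile: &str) -> Vec<SystemBlock> {
    let mut blocks = Vec::new();
    if !stable.is_empty() {
        blocks.push(SystemBlock { text: stable.to_string(), cache: true });
    }
    if !volatile.is_empty() {
        blocks.push(SystemBlock { text: volatile.to_string(), cache: false });
    }
    blocks
}

/// Minimal JSON string escaping for quotes, backslashes, and control characters.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serialize the blocks as the `system` array of a Messages API request.
pub fn system_to_json(blocks: &[SystemBlock]) -> String {
    let parts: Vec<String> = blocks
        .iter()
        .map(|b| {
            let cache = if b.cache { r#","cache_control":{"type":"ephemeral"}"# } else { "" };
            format!(r#"{{"type":"text","text":"{}"{}}}"#, escape_json(&b.text), cache)
        })
        .collect();
    format!("[{}]", parts.join(","))
}
```

Because the cache covers the request prefix up to and including a marked block, keeping per-request text (dates, retrieved context) in the unmarked suffix lets the prefix stay byte-identical across calls.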
Route tool output summarization and context compaction through a
configurable local model (summary_model config option). Add inline
pruning of stale tool outputs in the tool loop to reduce context
growth per iteration.

Closes #338, closes #339
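A rough sketch of the two ideas in this commit, under stated assumptions. Only the `summary_model` option name comes from the commit message. `LlmConfig`, `Message`, `prune_stale_tool_outputs`, and the placeholder text are hypothetical. Two behaviours are assumed rather than taken from the PR: treating every tool output other than the newest `keep_recent` as stale, and falling back to the main model when `summary_model` is unset.

```rust
/// Hypothetical config: `summary_model` is the option named in the commit.
pub struct LlmConfig {
    pub model: String,
    pub summary_model: Option<String>,
}

impl LlmConfig {
    /// Assumed fallback: use the main model when no local summary model is set.
    pub fn summarizer(&self) -> &str {
        self.summary_model.as_deref().unwrap_or(&self.model)
    }
}

/// Simplified conversation history entry (illustrative, not the crate's type).
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
    ToolOutput { tool: String, body: String },
}

pub const PRUNED_PLACEHOLDER: &str = "[tool output pruned]";

/// Replace the body of every tool output except the `keep_recent` newest
/// ones with a short placeholder. Returns how many outputs were pruned.
pub fn prune_stale_tool_outputs(history: &mut [Message], keep_recent: usize) -> usize {
    let mut seen = 0;
    let mut pruned = 0;
    // Walk newest to oldest so the first `keep_recent` tool outputs survive.
    for msg in history.iter_mut().rev() {
        if let Message::ToolOutput { body, .. } = msg {
            seen += 1;
            if seen > keep_recent && body.as_str() != PRUNED_PLACEHOLDER {
                *body = PRUNED_PLACEHOLDER.to_string();
                pruned += 1;
            }
        }
    }
    pruned
}
```

Calling the pruning step once per tool-loop iteration keeps it idempotent: outputs that were already replaced are not counted again.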
Add last_cache_usage() to LlmProvider trait for reading cache hit
statistics. ClaudeProvider stores cache_creation and cache_read token
counts from each response. Agent records these into MetricsSnapshot
after each LLM call.

Closes #340
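The flow from provider to metrics could look roughly like the sketch below. Only `LlmProvider`, `last_cache_usage()`, and `MetricsSnapshot` are names from the commit message. The field names, the `chat` signature, `StubProvider` (a network-free stand-in for ClaudeProvider), and `record_cache_usage` are assumptions, as is the choice to accumulate counts rather than overwrite them. In the real API response the counts arrive in the `usage` object as `cache_creation_input_tokens` and `cache_read_input_tokens`.

```rust
/// Cache token counts from one response (hypothetical struct).
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct CacheUsage {
    pub cache_creation_tokens: u64,
    pub cache_read_tokens: u64,
}

/// Reduced stand-in for the provider trait; the real trait has more methods.
pub trait LlmProvider {
    fn chat(&mut self, prompt: &str) -> String;
    /// Cache usage of the most recent call, if the provider reported any.
    fn last_cache_usage(&self) -> Option<CacheUsage>;
}

/// Network-free stub that pretends a fixed-size prompt prefix is cached.
pub struct StubProvider {
    prefix_tokens: u64,
    calls: u64,
    last: Option<CacheUsage>,
}

impl StubProvider {
    pub fn new(prefix_tokens: u64) -> Self {
        Self { prefix_tokens, calls: 0, last: None }
    }
}

impl LlmProvider for StubProvider {
    fn chat(&mut self, prompt: &str) -> String {
        self.calls += 1;
        // The first call writes the cache; later calls read from it.
        self.last = Some(if self.calls == 1 {
            CacheUsage { cache_creation_tokens: self.prefix_tokens, cache_read_tokens: 0 }
        } else {
            CacheUsage { cache_creation_tokens: 0, cache_read_tokens: self.prefix_tokens }
        });
        format!("echo: {prompt}")
    }

    fn last_cache_usage(&self) -> Option<CacheUsage> {
        self.last
    }
}

/// Only the cache-related fields of the snapshot are modelled here.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct MetricsSnapshot {
    pub cache_creation_tokens: u64,
    pub cache_read_tokens: u64,
}

/// Called by the agent after each LLM call (assumed to accumulate totals).
pub fn record_cache_usage(metrics: &mut MetricsSnapshot, provider: &dyn LlmProvider) {
    if let Some(usage) = provider.last_cache_usage() {
        metrics.cache_creation_tokens =
            metrics.cache_creation_tokens.saturating_add(usage.cache_creation_tokens);
        metrics.cache_read_tokens =
            metrics.cache_read_tokens.saturating_add(usage.cache_read_tokens);
    }
}
```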
@github-actions github-actions bot added the documentation, llm, rust, core, config, and size/L labels Feb 15, 2026
codecov-commenter commented Feb 15, 2026

Codecov Report

❌ Patch coverage is 70.29412% with 101 lines in your changes missing coverage. Please review.

Files with missing lines                      Patch %   Lines
crates/zeph-llm/src/claude.rs                 72.54%    42 Missing ⚠️
src/main.rs                                   0.00%     19 Missing ⚠️
crates/zeph-tui/src/widgets/resources.rs      0.00%     11 Missing ⚠️
crates/zeph-core/src/agent/mod.rs             50.00%    8 Missing ⚠️
crates/zeph-llm/src/orchestrator/router.rs    0.00%     7 Missing ⚠️
crates/zeph-llm/src/orchestrator/mod.rs       0.00%     5 Missing ⚠️
crates/zeph-core/src/agent/context.rs         96.55%    4 Missing ⚠️
crates/zeph-llm/src/any.rs                    0.00%     3 Missing ⚠️
crates/zeph-core/src/agent/streaming.rs       66.66%    2 Missing ⚠️

@@            Coverage Diff             @@
##             main     #341      +/-   ##
==========================================
- Coverage   79.35%   79.23%   -0.13%     
==========================================
  Files          99       99              
  Lines       25076    25406     +330     
==========================================
+ Hits        19900    20131     +231     
- Misses       5176     5275      +99     
Files with missing lines                      Coverage   Patch       Δ
crates/zeph-core/src/config/types.rs          96.86%     <100.00%>   (+0.01%) ⬆️
crates/zeph-core/src/metrics.rs               100.00%    <ø>         (ø)
crates/zeph-llm/src/provider.rs               90.58%     <100.00%>   (+0.06%) ⬆️
crates/zeph-core/src/agent/streaming.rs       50.83%     <66.66%>    (+0.11%) ⬆️
crates/zeph-llm/src/any.rs                    90.21%     <0.00%>     (-1.00%) ⬇️
crates/zeph-core/src/agent/context.rs         85.47%     <96.55%>    (+1.15%) ⬆️
crates/zeph-llm/src/orchestrator/mod.rs       90.99%     <0.00%>     (-0.90%) ⬇️
crates/zeph-llm/src/orchestrator/router.rs    54.46%     <0.00%>     (-3.64%) ⬇️
crates/zeph-core/src/agent/mod.rs             81.98%     <50.00%>    (-0.39%) ⬇️
crates/zeph-tui/src/widgets/resources.rs      0.00%      <0.00%>     (ø)
... and 2 more

Show cache write/read token counts conditionally when the active
provider reports non-zero cache usage.
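A small sketch of the conditional display, assuming "non-zero cache usage" means that either counter is non-zero. `cache_line` and the label text are hypothetical and are not taken from the widget code.

```rust
/// Returns the line to render, or `None` when the provider reported no
/// cache activity and the row should be hidden.
pub fn cache_line(write_tokens: u64, read_tokens: u64) -> Option<String> {
    if write_tokens == 0 && read_tokens == 0 {
        None
    } else {
        Some(format!("cache: write {write_tokens} / read {read_tokens}"))
    }
}
```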
…zation

Ensure tool_use blocks are only emitted in assistant messages and
tool_result blocks only in user messages. Misplaced parts are
downgraded to text blocks to prevent Anthropic API 400 errors.
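The role check described in this commit can be sketched as follows. `Role`, `Part`, and `sanitize_parts` are illustrative names, and the text format used for downgraded blocks is an assumption, not the format the PR uses.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
}

/// Simplified message part (hypothetical; inputs are kept as plain strings).
#[derive(Debug, Clone, PartialEq)]
pub enum Part {
    Text(String),
    ToolUse { id: String, name: String, input: String },
    ToolResult { tool_use_id: String, content: String },
}

/// Keep tool_use only in assistant messages and tool_result only in user
/// messages. Anything misplaced is downgraded to a text part instead of
/// being dropped, so the content survives while the request stays valid.
pub fn sanitize_parts(role: Role, parts: Vec<Part>) -> Vec<Part> {
    parts
        .into_iter()
        .map(|part| match (role, part) {
            (Role::User, Part::ToolUse { name, input, .. }) => {
                Part::Text(format!("[tool_use {name}] {input}"))
            }
            (Role::Assistant, Part::ToolResult { content, .. }) => {
                Part::Text(format!("[tool_result] {content}"))
            }
            (_, part) => part,
        })
        .collect()
}
```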
Orchestrator was missing last_cache_usage() delegation, causing cache
metrics to never reach MetricsSnapshot when using provider=orchestrator.
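One way this kind of bug can slip through is sketched below. It assumes `last_cache_usage()` has a default implementation returning `None`, which the PR does not confirm. With such a default, a wrapper that forgets to override the method still compiles and silently reports no cache usage. All types here are hypothetical stand-ins for the real providers.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CacheUsage {
    pub cache_creation_tokens: u64,
    pub cache_read_tokens: u64,
}

pub trait LlmProvider {
    /// Assumed default: providers without caching report nothing.
    fn last_cache_usage(&self) -> Option<CacheUsage> {
        None
    }
}

/// Stand-in for a provider that does report cache usage.
pub struct CachingProvider(pub CacheUsage);

impl LlmProvider for CachingProvider {
    fn last_cache_usage(&self) -> Option<CacheUsage> {
        Some(self.0)
    }
}

/// Wrapper without delegation: it inherits the default and returns `None`.
pub struct ForgetfulOrchestrator {
    pub active: Box<dyn LlmProvider>,
}

impl LlmProvider for ForgetfulOrchestrator {}

/// Wrapper with the fix: forward the call to the active provider.
pub struct Orchestrator {
    pub active: Box<dyn LlmProvider>,
}

impl LlmProvider for Orchestrator {
    fn last_cache_usage(&self) -> Option<CacheUsage> {
        self.active.last_cache_usage()
    }
}
```

A test that drives the wrapper rather than the inner provider would catch the missing delegation.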
@github-actions github-actions bot added size/XL and removed size/L labels Feb 15, 2026
@bug-ops bug-ops merged commit 9f821f1 into main Feb 15, 2026
18 checks passed
@bug-ops bug-ops deleted the feat/m21/token-optimization branch February 15, 2026 23:10