Closed
Labels
- enhancement (New feature or request)
- llm (LLM provider related)
- memory (Persistence and memory)
- size/M
Description
Parent: #740 (P2)
Problem
Identical or near-identical queries re-invoke the LLM, wasting tokens and adding latency.
Solution
- SQLite-backed response cache keyed by content hash (SHA-256 of messages + model + params)
- Configurable TTL (default: 1 hour)
- Config: `llm.response_cache.enabled` (bool), `llm.response_cache.ttl_secs` (u64)
- Periodic cleanup of expired entries
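
A minimal sketch of the two config fields and the key derivation described above, using only the standard library. The names `ResponseCacheConfig` and `cache_key` are hypothetical, the `enabled` default is an assumption (the issue only states the TTL default), and `params` is simplified to a pre-serialized string. Because `std` has no SHA-256, the sketch swaps in `DefaultHasher` as a placeholder; a real implementation would use SHA-256 as proposed, since `DefaultHasher` output is 64-bit and not guaranteed stable across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical mirror of the `llm.response_cache.*` settings.
#[derive(Debug, Clone, PartialEq)]
pub struct ResponseCacheConfig {
    pub enabled: bool,
    pub ttl_secs: u64,
}

impl Default for ResponseCacheConfig {
    fn default() -> Self {
        Self {
            enabled: false, // assumption: the issue does not state a default
            ttl_secs: 3600, // 1 hour, the default TTL named in the issue
        }
    }
}

/// Derives a cache key from messages + model + params.
/// Placeholder hash: DefaultHasher stands in for SHA-256 here.
pub fn cache_key(messages: &[(&str, &str)], model: &str, params: &str) -> String {
    let mut hasher = DefaultHasher::new();
    model.hash(&mut hasher);
    params.hash(&mut hasher);
    // Each message is a (role, content) pair; order matters for the key.
    for (role, content) in messages {
        role.hash(&mut hasher);
        content.hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}
```

Any change to the model, the params, or any message yields a different key, so only requests with exactly matching inputs share a cache entry.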
Affected crates
- `zeph-memory` (cache storage)
- `zeph-llm` (cache lookup/store around provider calls)
Acceptance criteria
- Cache hit returns stored response without LLM call
- TTL-based expiry
- Cache bypass option per request
- Periodic cleanup
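
The four criteria above can be sketched as follows. This is an illustration under stated assumptions, not the crate's actual API: `ResponseCache` and `complete_cached` are hypothetical names, an in-memory `HashMap` stands in for the SQLite table, and time is passed in as seconds so the behavior is deterministic.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the SQLite-backed table (assumption for this sketch).
pub struct ResponseCache {
    ttl_secs: u64,
    /// key -> (response, expires_at in seconds)
    entries: HashMap<String, (String, u64)>,
}

impl ResponseCache {
    pub fn new(ttl_secs: u64) -> Self {
        Self { ttl_secs, entries: HashMap::new() }
    }

    /// Returns the stored response if present and not yet expired.
    pub fn get(&self, key: &str, now: u64) -> Option<&str> {
        match self.entries.get(key) {
            Some((response, expires_at)) if now < *expires_at => Some(response.as_str()),
            _ => None,
        }
    }

    /// Stores a response that expires `ttl_secs` after `now`.
    pub fn put(&mut self, key: &str, response: &str, now: u64) {
        let expires_at = now + self.ttl_secs;
        self.entries.insert(key.to_string(), (response.to_string(), expires_at));
    }

    /// Removes expired entries and returns how many were dropped.
    pub fn cleanup(&mut self, now: u64) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, (_, expires_at)| now < *expires_at);
        before - self.entries.len()
    }
}

/// Wraps a provider call with cache lookup and store.
/// With `bypass` set, the lookup is skipped but the fresh response is still stored.
pub fn complete_cached<F>(
    cache: &mut ResponseCache,
    key: &str,
    bypass: bool,
    now: u64,
    call_llm: F,
) -> String
where
    F: FnOnce() -> String,
{
    if !bypass {
        if let Some(hit) = cache.get(key, now) {
            return hit.to_string(); // cache hit: no LLM call
        }
    }
    let response = call_llm();
    cache.put(key, &response, now);
    response
}
```

Whether a bypassed request should refresh the stored entry is not specified in the issue; the sketch refreshes it so that later non-bypass requests see the newest response. In a real service `cleanup` would run from a periodic task rather than being called by hand.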