Add LLM response cache with SQLite backend #750

@bug-ops

Description

Parent: #740 (P2)

Problem

Identical or near-identical queries re-invoke the LLM, wasting tokens and adding latency.

Solution

  • SQLite-backed response cache keyed by content hash (SHA-256 of messages + model + params)
  • Configurable TTL (default: 1 hour)
  • Config: llm.response_cache.enabled (bool), llm.response_cache.ttl_secs (u64)
  • Periodic cleanup of expired entries
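The key derivation and TTL logic above can be sketched as follows. This is a minimal, dependency-free sketch: the issue specifies SHA-256 and a SQLite backend, but here `DefaultHasher` stands in for SHA-256 and an in-memory `HashMap` stands in for the SQLite table, so the key/TTL flow can be shown without external crates. All names (`ResponseCache`, `Entry`, etc.) are hypothetical, not the actual `zeph-memory` API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Hypothetical cache entry; the real store would be a SQLite row.
struct Entry {
    response: String,
    stored_at: Instant,
}

struct ResponseCache {
    ttl: Duration,
    entries: HashMap<u64, Entry>,
}

impl ResponseCache {
    fn new(ttl_secs: u64) -> Self {
        Self { ttl: Duration::from_secs(ttl_secs), entries: HashMap::new() }
    }

    /// Content hash over messages + model + params. The issue calls for
    /// SHA-256; DefaultHasher stands in to keep the sketch dependency-free.
    fn key(messages: &str, model: &str, params: &str) -> u64 {
        let mut h = DefaultHasher::new();
        messages.hash(&mut h);
        model.hash(&mut h);
        params.hash(&mut h);
        h.finish()
    }

    /// Returns the stored response only if it is within the TTL.
    fn get(&self, key: u64) -> Option<&str> {
        self.entries.get(&key).and_then(|e| {
            if e.stored_at.elapsed() < self.ttl {
                Some(e.response.as_str())
            } else {
                None // expired; caller falls through to the provider
            }
        })
    }

    fn put(&mut self, key: u64, response: String) {
        self.entries.insert(key, Entry { response, stored_at: Instant::now() });
    }
}

fn main() {
    // Default TTL from the issue: 1 hour.
    let mut cache = ResponseCache::new(3600);
    let k = ResponseCache::key("[{\"role\":\"user\",\"content\":\"hi\"}]", "some-model", "temp=0.0");
    assert!(cache.get(k).is_none()); // miss: would invoke the provider
    cache.put(k, "Hello!".to_string());
    assert_eq!(cache.get(k), Some("Hello!")); // hit: no LLM call
    println!("ok");
}
```

In the real implementation the expiry check would be a `WHERE stored_at + ttl > now` predicate in SQL rather than an in-process comparison, and the periodic cleanup a `DELETE` over expired rows.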

Affected crates

  • zeph-memory (cache storage)
  • zeph-llm (cache lookup/store around provider calls)

Acceptance criteria

  • Cache hit returns stored response without LLM call
  • TTL-based expiry
  • Cache bypass option per request
  • Periodic cleanup
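The hit, bypass, and store criteria can be sketched as a wrapper around the provider call. Names here (`RequestOpts`, `bypass_cache`, `complete_with_cache`) are hypothetical illustrations of the per-request bypass option, not the actual `zeph-llm` API; a plain `HashMap` again stands in for the SQLite store.

```rust
use std::collections::HashMap;

/// Hypothetical per-request options; `bypass_cache` mirrors the
/// "cache bypass option per request" acceptance criterion.
struct RequestOpts {
    bypass_cache: bool,
}

/// Wraps a provider call with cache lookup/store. `provider` stands in
/// for the real LLM invocation.
fn complete_with_cache(
    cache: &mut HashMap<u64, String>,
    key: u64,
    opts: &RequestOpts,
    provider: impl Fn() -> String,
) -> String {
    if !opts.bypass_cache {
        if let Some(hit) = cache.get(&key) {
            return hit.clone(); // cache hit: stored response, no LLM call
        }
    }
    let resp = provider();
    cache.insert(key, resp.clone()); // store even on bypass, refreshing the entry
    resp
}

fn main() {
    let mut cache = HashMap::new();
    let normal = RequestOpts { bypass_cache: false };
    let first = complete_with_cache(&mut cache, 42, &normal, || "from-llm".to_string());
    let second = complete_with_cache(&mut cache, 42, &normal, || "should-not-run".to_string());
    assert_eq!(first, second); // second call served from cache
    let bypass = RequestOpts { bypass_cache: true };
    let third = complete_with_cache(&mut cache, 42, &bypass, || "fresh".to_string());
    assert_eq!(third, "fresh"); // bypass skips the lookup
    println!("ok");
}
```

One design question this sketch leaves open is whether a bypassed request should still refresh the stored entry (as above) or leave the cache untouched.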

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request) · llm (LLM provider related) · memory (Persistence and memory) · size/M
