feat(memory): embedded vector search, token estimation, context scrubbing by bug-ops · Pull Request #751 · bug-ops/zeph

bug-ops · 2026-02-22T17:50:02Z

Summary

P0 improvements to memory and context management (epic #740):

#741 (Implement embedded vector search fallback via SQLite BLOB storage): SqliteVectorStore, a zero-dependency embedded vector search backend that stores embeddings as SQLite BLOBs and computes cosine similarity on the Rust side. Configurable via vector_backend: sqlite|qdrant
#742 (Improve token estimation accuracy for multibyte text): token estimation changed from bytes/3 to chars/4 for more accurate budget allocation on multi-byte text (Cyrillic, CJK)
#743 (Add credential scrubbing to LLM context pipeline): scrub_content() redacts credentials in the LLM context pipeline. It reuses the existing redact_secrets + sanitize_paths and is applied after context assembly, before model calls. Configurable via redact_credentials: bool
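
To make the #741 approach concrete, here is a minimal sketch of BLOB-encoded f32 embeddings with Rust-side cosine similarity. It is not the PR's SqliteVectorStore code: the helper names (embedding_to_blob, blob_to_embedding, cosine_similarity, top_k), the little-endian encoding, and the brute-force scan over in-memory rows are all assumptions made for illustration, and no SQLite access is shown.

```rust
/// Serialize an embedding as little-endian f32 bytes, suitable for a BLOB column.
/// (Assumed encoding; the PR does not state the byte order it uses.)
fn embedding_to_blob(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Decode a BLOB back into f32 values.
/// Returns None if the length is not a multiple of 4 bytes.
fn blob_to_embedding(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Cosine similarity; returns 0.0 for empty, mismatched, or zero-norm vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force top-k over (id, blob) rows, as if they had been read from SQLite.
/// Rows whose BLOB cannot be decoded are skipped.
fn top_k(query: &[f32], rows: &[(i64, Vec<u8>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = rows
        .iter()
        .filter_map(|(id, blob)| {
            let emb = blob_to_embedding(blob)?;
            Some((*id, cosine_similarity(query, &emb)))
        })
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan like this is only reasonable for small collections, which is presumably why the backend is described as a fallback alongside Qdrant.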

Closes #741, closes #742, closes #743

Test plan

2431/2431 unit tests pass (8 new tests added)
SqliteVectorStore: all VectorStore trait methods, must_not filter, integer filter, empty collection
estimate_tokens: ASCII, Cyrillic, CJK, empty, short, long
scrub_content: secrets-only, paths-only, combined, no-match passthrough
Verify vector_backend = "sqlite" works end-to-end with semantic memory
Verify redact_credentials = false disables scrubbing
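
The effect of the bytes/3 to chars/4 change on the text classes listed in the test plan can be sketched as follows. The function names are hypothetical, and truncating integer division is an assumption, since the PR does not say how estimate_tokens rounds.

```rust
/// Old heuristic described in the PR: UTF-8 byte length divided by 3.
/// (Hypothetical name; rounding behaviour is assumed.)
fn estimate_tokens_bytes(text: &str) -> usize {
    text.len() / 3
}

/// New heuristic described in the PR: Unicode scalar count divided by 4.
/// (Hypothetical name; rounding behaviour is assumed.)
fn estimate_tokens_chars(text: &str) -> usize {
    text.chars().count() / 4
}
```

For ASCII the two heuristics stay close ("hello world!" is 12 bytes and 12 chars), but for Cyrillic (2 bytes per letter) and CJK (3 bytes per character) the byte-based estimate grows with the encoding width rather than with the amount of text.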

…dd context scrubbing

- Implement SqliteVectorStore as zero-dependency VectorStore backend storing f32 embeddings as BLOBs with Rust-side cosine similarity (#741)
- Replace bytes/3 token estimation with chars/4 for better multi-byte text accuracy (#742)
- Add scrub_content() credential redaction in LLM context pipeline before model calls (#743)
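
The scrubbing step (#743) composes two existing helpers behind a config flag. The sketch below shows that composition only. The bodies of redact_secrets and sanitize_paths are simplified stand-ins written for this example, not the project's implementations, and the signature of scrub_content (including the home parameter) is an assumption.

```rust
/// Stand-in for the project's redact_secrets (real behaviour not shown in the PR):
/// masks space-separated words that start with an assumed credential prefix.
fn redact_secrets(input: &str) -> String {
    const PREFIXES: [&str; 2] = ["sk-", "ghp_"];
    input
        .split(' ')
        .map(|word| {
            if PREFIXES.iter().any(|p| word.starts_with(p)) {
                "[REDACTED]"
            } else {
                word
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Stand-in for the project's sanitize_paths: replaces a home-directory prefix with "~".
fn sanitize_paths(input: &str, home: &str) -> String {
    input.replace(home, "~")
}

/// Apply both passes to assembled context, unless scrubbing is disabled.
/// `redact_credentials` mirrors the config flag named in the PR.
fn scrub_content(input: &str, home: &str, redact_credentials: bool) -> String {
    if !redact_credentials {
        return input.to_string();
    }
    sanitize_paths(&redact_secrets(input), home)
}
```

Running the scrub once over the assembled context, rather than at each source, means every contributor to the prompt passes through the same filter before the model call.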

- Update memory concept, semantic memory guide, configuration reference
- Add vector backend comparison (SQLite vs Qdrant) and setup instructions
- Document credential scrubbing in security reference
- Add token estimation section in architecture docs
- Update root README with embedded vector search feature

- Add 17 tests covering prepare_context scrubbing, SqliteVectorStore edge cases, config round-trips, scrub_content proptests
- Fix token_safety_margin: wire from config to RuntimeConfig, apply in should_compact as token count multiplier
- Add SemanticMemory + SqliteVectorStore e2e round-trip test
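
The token_safety_margin fix applies the margin as a multiplier on the token count inside should_compact. A minimal sketch of that idea follows. The signature, the rounding, and the strict greater-than comparison are assumptions, not taken from the PR.

```rust
/// Decide whether the context should be compacted.
/// The estimated token count is inflated by `token_safety_margin` (e.g. 1.2)
/// before it is compared with the budget, so compaction triggers earlier.
/// (Assumed signature and comparison; the real should_compact may differ.)
fn should_compact(estimated_tokens: usize, budget: usize, token_safety_margin: f64) -> bool {
    let adjusted = (estimated_tokens as f64 * token_safety_margin).ceil() as usize;
    adjusted > budget
}
```

With a budget of 1000 tokens, an estimate of 900 does not trigger compaction at a margin of 1.0 but does at 1.2, which compensates for a heuristic estimator that may undercount.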

Replace has_qdrant(), which only checked whether a client object existed, with is_vector_store_connected(), which performs an actual health check (Qdrant gRPC ping or SQLite query). Diagnostics now report the true connection status instead of assuming connectivity from config. Add health_check() to the VectorStore trait with implementations for QdrantOps, SqliteVectorStore, and InMemoryVectorStore. Retain has_vector_store() for cheap sync checks where connectivity is not required.
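
The difference between the cheap presence check and the real health check can be sketched as below. This is a simplified, synchronous illustration: the actual trait is likely async and its signature is not shown in the PR, and the Memory and UnreachableStore types are hypothetical.

```rust
/// Simplified VectorStore trait with only the health check (assumed signature).
trait VectorStore {
    fn health_check(&self) -> Result<(), String>;
}

/// An in-memory store is always reachable.
struct InMemoryVectorStore;

impl VectorStore for InMemoryVectorStore {
    fn health_check(&self) -> Result<(), String> {
        Ok(())
    }
}

/// Hypothetical store that is configured but cannot be reached.
struct UnreachableStore;

impl VectorStore for UnreachableStore {
    fn health_check(&self) -> Result<(), String> {
        Err("connection refused".to_string())
    }
}

/// Hypothetical owner of an optional vector store.
struct Memory {
    store: Option<Box<dyn VectorStore>>,
}

impl Memory {
    /// Cheap check: is a backend configured at all?
    fn has_vector_store(&self) -> bool {
        self.store.is_some()
    }

    /// Real check: is a backend configured and does its health check succeed?
    fn is_vector_store_connected(&self) -> bool {
        self.store
            .as_ref()
            .map_or(false, |s| s.health_check().is_ok())
    }
}
```

The two methods diverge exactly in the case the old has_qdrant() got wrong: a client object exists, but the backend behind it is down.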

Replace hardcoded "Qdrant" labels in TUI with actual vector_backend name from config. When the backend is not connected, hide it from the status bar entirely. Memory panel shows "Vector: qdrant (connected)" or "Vector: sqlite (connected)" when healthy, "(offline)" when unreachable, and omits the line completely when no backend is configured. Builder no longer sets qdrant_available=true optimistically — the real health check in main.rs determines the actual status.
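
The memory panel rule described above maps naturally onto an Option-returning formatter. The sketch below is an assumption about shape only: vector_status_line is a hypothetical name, and the exact offline wording is inferred from the "(offline)" suffix mentioned in the commit message.

```rust
/// Build the memory-panel line for the vector backend.
/// Returns None when no backend is configured, so the line is omitted entirely.
/// (Hypothetical helper; the TUI code in the PR is not shown.)
fn vector_status_line(backend: Option<&str>, connected: bool) -> Option<String> {
    let name = backend?;
    let state = if connected { "connected" } else { "offline" };
    Some(format!("Vector: {name} ({state})"))
}
```

The `connected` argument is meant to come from the real health check rather than from config, matching the removal of the optimistic qdrant_available=true default.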

- Add SemanticMemory::with_sqlite_backend() constructor
- Bootstrap now selects constructor based on vector_backend config
- Add ZEPH_MEMORY_VECTOR_BACKEND env override (sqlite|qdrant)
- SQLite backend now actually used when configured instead of always creating Qdrant client
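
Backend selection with the ZEPH_MEMORY_VECTOR_BACKEND override could look roughly like this. Only the environment variable name and the two accepted values come from the PR. The VectorBackend enum, the function names, case-insensitive parsing, and treating an unknown value as an error are assumptions.

```rust
use std::str::FromStr;

/// The two backends named in the PR (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VectorBackend {
    Sqlite,
    Qdrant,
}

impl FromStr for VectorBackend {
    type Err = String;

    /// Parse "sqlite" or "qdrant"; case-insensitivity is an assumption.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "sqlite" => Ok(Self::Sqlite),
            "qdrant" => Ok(Self::Qdrant),
            other => Err(format!("unknown vector backend: {other}")),
        }
    }
}

/// Pick the backend: an env override, if present, wins over the config value.
fn resolve_backend(
    config_value: VectorBackend,
    env_value: Option<&str>,
) -> Result<VectorBackend, String> {
    match env_value {
        Some(raw) => raw.parse(),
        None => Ok(config_value),
    }
}

/// Same as resolve_backend, reading the override from the process environment.
fn resolve_backend_from_env(config_value: VectorBackend) -> Result<VectorBackend, String> {
    let env = std::env::var("ZEPH_MEMORY_VECTOR_BACKEND").ok();
    resolve_backend(config_value, env.as_deref())
}
```

Bootstrap would then match on the resolved value to choose between the Qdrant constructor and SemanticMemory::with_sqlite_backend(), instead of always creating a Qdrant client.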

github-actions bot added enhancement (New feature or request), size/XL, documentation (Improvements or additions to documentation), memory (Persistence and memory), rust, core, dependencies and removed size/XL labels Feb 22, 2026

bug-ops added 2 commits February 22, 2026 18:53

github-actions bot added the size/XL label Feb 22, 2026

bug-ops added 3 commits February 22, 2026 19:21

bug-ops merged commit 884f0e1 into main Feb 22, 2026
23 checks passed

bug-ops deleted the epic/memory-context-improvements branch February 22, 2026 19:42

bug-ops merged 6 commits into main from epic/memory-context-improvements
