Merged
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,20 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

+### Added
+- Declarative TOML-based output filter engine with 9 strategy types: `strip_noise`, `truncate`, `keep_matching`, `strip_annotated`, `test_summary`, `group_by_rule`, `git_status`, `git_diff`, `dedup`
+- Embedded `default-filters.toml` with 19 pre-configured rules for CLI tools (cargo, git, docker, npm, pip, make, pytest, go, terraform, kubectl, brew, ls, journalctl)
+- `filters_path` option in `FilterConfig` for user-provided filter rules override
+- ReDoS protection: RegexBuilder with size_limit, 512-char pattern cap, 1 MiB file size limit
+- Dedup strategy with configurable normalization patterns and HashMap pre-allocation
+- NormalizeEntry replacement validation (rejects unescaped `$` capture group refs)
+
+### Changed
+- Migrated all 6 hardcoded filters (cargo_build, test_output, clippy, git, dir_listing, log_dedup) into the declarative TOML engine
+
+### Removed
+- `FilterConfig` per-filter config structs (`TestFilterConfig`, `GitFilterConfig`, `ClippyFilterConfig`, `CargoBuildFilterConfig`, `DirListingFilterConfig`, `LogDedupFilterConfig`) — filter params now in TOML strategy fields
+
## [0.11.4] - 2026-02-21

### Added
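The input guards listed in the changelog (512-char pattern cap, 1 MiB file size limit, rejection of unescaped `$` in replacements) could be sketched roughly as below. This is a minimal, std-only illustration and not the PR's actual code: the constant names, the `RuleError` enum, and the three function names are hypothetical, and treating `$$` as an escaped literal dollar is an assumption borrowed from the `regex` crate's replacement syntax. The `size_limit` setting on the regex builder is not shown, since it needs the external `regex` crate.

```rust
// Limits quoted from the changelog entry; the constant names are hypothetical.
const MAX_PATTERN_CHARS: usize = 512;
const MAX_FILTER_FILE_BYTES: usize = 1024 * 1024; // 1 MiB

#[derive(Debug, PartialEq)]
enum RuleError {
    FileTooLarge,
    PatternTooLong,
    UnescapedDollar,
}

/// Reject an oversized filter file before any TOML parsing happens.
fn check_file_size(contents: &str) -> Result<(), RuleError> {
    if contents.len() > MAX_FILTER_FILE_BYTES {
        return Err(RuleError::FileTooLarge);
    }
    Ok(())
}

/// Reject a pattern longer than the cap before it reaches the regex compiler.
fn check_pattern(pattern: &str) -> Result<(), RuleError> {
    if pattern.chars().count() > MAX_PATTERN_CHARS {
        return Err(RuleError::PatternTooLong);
    }
    Ok(())
}

/// Reject a replacement string containing a `$` that is not written as `$$`.
fn check_replacement(replacement: &str) -> Result<(), RuleError> {
    let mut chars = replacement.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '$' {
            // Assumption: `$$` is an escaped literal dollar; any other `$` is rejected.
            if chars.peek() == Some(&'$') {
                chars.next();
            } else {
                return Err(RuleError::UnescapedDollar);
            }
        }
    }
    Ok(())
}
```

Running the cheap length checks first means an oversized or hostile rule file is rejected before any regex is compiled.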
22 changes: 13 additions & 9 deletions README.md
@@ -133,15 +133,19 @@ When two candidates score within a configurable threshold of each other, structu

### Smart Output Filtering — 70-99% Token Savings

-Raw tool output is the #1 context window polluter. A `cargo test` run produces 300+ lines; the model needs 3. Zeph applies command-aware filters **before** context injection:
-
-| Filter | What It Does | Typical Savings |
-|--------|-------------|-----------------|
-| **Test** | Cargo test/nextest — failures-only mode | 94-99% |
-| **Git** | Compact status/diff/log/push | 80-99% |
-| **Clippy** | Group warnings by lint rule | 70-90% |
-| **Directory** | Hide noise dirs (target, node_modules, .git) | 60-80% |
-| **Log dedup** | Normalize timestamps/UUIDs, count repeats | 70-85% |
+Raw tool output is the #1 context window polluter. A `cargo test` run produces 300+ lines; the model needs 3. Zeph applies command-aware filters **before** context injection via a unified declarative TOML engine with 9 strategy types:
+
+| Strategy | What It Does | Typical Savings |
+|----------|-------------|-----------------|
+| `test_summary` | Cargo test/nextest/pytest/Go test — failures-only mode | 94-99% |
+| `git_status` / `git_diff` | Compact status and bounded diff/log output | 80-99% |
+| `group_by_rule` | Group Clippy warnings by lint rule | 70-90% |
+| `dedup` | Normalize timestamps/UUIDs, count repeats | 70-85% |
+| `strip_noise` / `keep_matching` | Remove or retain lines by regex pattern | varies |
+| `truncate` | Head+tail window with configurable limits | varies |
+| `strip_annotated` | Drop annotated diagnostic lines (e.g. `help:`) | varies |
+
+19 built-in rules ship embedded, covering Cargo test/nextest, Clippy, git, directory listings, Docker, npm/yarn/pnpm, pip, Make, pytest, Go test, Terraform, kubectl, and Homebrew. Drop a custom `filters.toml` next to your config to add or override rules without code changes.

Per-command stats shown inline, so you see exactly what was saved:

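The `truncate` row in the table above describes a head+tail window. A minimal sketch of that idea follows, assuming line-based limits; the function name, parameter names, and the wording of the omission marker are hypothetical and not taken from the PR.

```rust
/// Keep the first `head` and last `tail` lines, replacing the middle with a marker line.
/// Output that already fits inside the window is returned with its lines unchanged.
fn truncate_head_tail(output: &str, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = output.lines().collect();
    if lines.len() <= head.saturating_add(tail) {
        return lines.join("\n");
    }
    let omitted = lines.len() - head - tail;
    let mut kept: Vec<String> = Vec::with_capacity(head + tail + 1);
    kept.extend(lines[..head].iter().map(|l| l.to_string()));
    // Hypothetical marker format; it tells the reader how much was dropped.
    kept.push(format!("... {} lines omitted ...", omitted));
    kept.extend(lines[lines.len() - tail..].iter().map(|l| l.to_string()));
    kept.join("\n")
}
```

Keeping both ends is a reasonable default for build and test output, where the invoked command tends to appear at the top and the summary at the bottom.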
@@ -83,28 +83,6 @@ destination = "stdout"
[tools.filters]
enabled = true

-[tools.filters.test]
-enabled = true
-max_failures = 10
-truncate_stack_trace = 50
-
-[tools.filters.git]
-enabled = true
-max_log_entries = 20
-max_diff_lines = 500
-
-[tools.filters.clippy]
-enabled = true
-
-[tools.filters.cargo_build]
-enabled = true
-
-[tools.filters.dir_listing]
-enabled = true
-
-[tools.filters.log_dedup]
-enabled = true
-
[tools.filters.security]
enabled = true
extra_patterns = []
1 change: 1 addition & 0 deletions crates/zeph-tools/Cargo.toml
@@ -24,6 +24,7 @@ serde = { workspace = true, features = ["derive"] }
serde_json.workspace = true
thiserror.workspace = true
tokio = { workspace = true, features = ["fs", "io-util", "macros", "process", "rt", "sync", "time"] }
+toml.workspace = true
tokio-util.workspace = true
tracing.workspace = true
url.workspace = true
2 changes: 1 addition & 1 deletion crates/zeph-tools/README.md
@@ -20,7 +20,7 @@ Defines the `ToolExecutor` trait for sandboxed tool invocation and ships concret
| `file` | File operation executor |
| `scrape` | Web scraping executor with SSRF protection (post-DNS private IP validation, pinned address client) |
| `composite` | `CompositeExecutor` — chains executors with middleware |
-| `filter` | Output filtering pipeline |
+| `filter` | Output filtering pipeline — unified declarative TOML engine with 9 strategy types (`strip_noise`, `truncate`, `keep_matching`, `strip_annotated`, `test_summary`, `group_by_rule`, `git_status`, `git_diff`, `dedup`) and 19 embedded built-in rules; user-configurable via `filters.toml` |
| `permissions` | Permission checks for tool invocation |
| `audit` | `AuditLogger` — tool execution audit trail |
| `registry` | Tool registry and discovery |
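The `dedup` strategy is described in this PR as normalizing volatile tokens, counting repeats, and pre-allocating its `HashMap`. The sketch below illustrates that shape with the standard library only. It is not the PR's implementation: the real strategy uses configurable regex normalization patterns, whereas this stand-in simply collapses every digit run to `<N>`, and the names `normalize` and `dedup` are hypothetical.

```rust
use std::collections::HashMap;

/// Stand-in normalizer (an assumption): replace each run of ASCII digits with `<N>`
/// so that lines differing only in timestamps or counters collapse to one key.
fn normalize(line: &str) -> String {
    let mut out = String::with_capacity(line.len());
    let mut in_digits = false;
    for c in line.chars() {
        if c.is_ascii_digit() {
            if !in_digits {
                out.push_str("<N>");
                in_digits = true;
            }
        } else {
            in_digits = false;
            out.push(c);
        }
    }
    out
}

/// Collapse repeated normalized lines, keeping first-seen order and a repeat count.
fn dedup(output: &str) -> Vec<(String, usize)> {
    // Pre-allocate for the worst case, in which every line is unique.
    let line_count = output.lines().count();
    let mut index: HashMap<String, usize> = HashMap::with_capacity(line_count);
    let mut result: Vec<(String, usize)> = Vec::new();
    for line in output.lines() {
        let key = normalize(line);
        let existing = index.get(&key).copied();
        match existing {
            Some(i) => result[i].1 += 1,
            None => {
                index.insert(key.clone(), result.len());
                result.push((key, 1));
            }
        }
    }
    result
}
```

Sizing the map from the line count up front avoids rehashing while the output is scanned, at the cost of some over-allocation when most lines repeat.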
289 changes: 0 additions & 289 deletions crates/zeph-tools/src/filter/cargo_build.rs

This file was deleted.
