Merged
14 changes: 10 additions & 4 deletions CHANGELOG.md
@@ -7,17 +7,23 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]

### Added
- `ToolRegistry` with typed `ToolDef` definitions for 7 built-in tools (bash, read, edit, write, glob, grep, web_scrape) (#239)
- `FileExecutor` for sandboxed file operations: read, write, edit, glob, grep (#242)
- `ToolCall` struct and `execute_tool_call()` on `ToolExecutor` trait for structured tool invocation (#241)
- `CompositeExecutor` routes structured tool calls to correct sub-executor by tool_id (#243)
- Tool catalog section in system prompt via `ToolRegistry::format_for_prompt()` (#244)
- Configurable `max_tool_iterations` (default 10, previously hardcoded 3) via TOML and `ZEPH_AGENT_MAX_TOOL_ITERATIONS` env var (#245)
- Doom-loop detection: breaks the agent loop after 3 consecutive identical tool outputs
- Context budget check: stops tool iteration at 80% of the context budget, before the context window overflows
- `IndexWatcher` for incremental code index updates on file changes via `notify` file watcher (#233)
- `watch` config field in `[index]` section (default `true`) to enable/disable file watching
- Repo map cache with configurable TTL (`repo_map_ttl_secs`, default 300s) to avoid per-message filesystem traversal (#231)
- Cross-session memory score threshold (`cross_session_score_threshold`, default 0.35) to filter low-relevance results (#232)
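The doom-loop entry above amounts to a sliding-window equality check over recent tool outputs; a minimal sketch (`is_doom_loop` is an illustrative name, not Zeph's actual API):

```rust
// Minimal sketch of the doom-loop check: stop when the last `window`
// tool outputs are all identical strings.
fn is_doom_loop(history: &[String], window: usize) -> bool {
    if history.len() < window {
        return false;
    }
    let recent = &history[history.len() - window..];
    // Adjacent pairs all equal => every output in the window is identical.
    recent.windows(2).all(|w| w[0] == w[1])
}

fn main() {
    let stuck = vec!["ls: no such file".to_owned(); 3];
    let healthy = vec!["a".to_owned(), "b".to_owned(), "b".to_owned()];
    assert!(is_doom_loop(&stuck, 3));
    assert!(!is_doom_loop(&healthy, 3));
    println!("ok");
}
```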

### Fixed
- Persist `MessagePart` data to SQLite via `remember_with_parts()` — pruning state now survives session restarts (#229)
- Clear tool output body from memory after Tier 1 pruning to reclaim heap (#230)

## [0.9.4] - 2026-02-14

### Added
2 changes: 2 additions & 0 deletions Cargo.lock
2 changes: 2 additions & 0 deletions Cargo.toml
@@ -20,6 +20,7 @@ crossterm = "0.29"
axum = "0.8"
blake3 = "1.8"
criterion = "0.8"
glob = "0.3.3"
futures = "0.3"
ignore = "0.4"
hf-hub = { version = "0.4", default-features = false, features = ["tokio", "rustls-tls", "ureq"] }
@@ -32,6 +33,7 @@ ollama-rs = { version = "0.3", default-features = false, features = ["rustls", "
pulldown-cmark = "0.13"
qdrant-client = { version = "1.16", default-features = false }
ratatui = "0.30"
regex = "1.12"
reqwest = { version = "0.13", default-features = false }
rmcp = "0.14"
scrape-core = "0.2.2"
8 changes: 4 additions & 4 deletions README.md
@@ -17,11 +17,11 @@ Lightweight AI agent that routes tasks across **Ollama, Claude, OpenAI, and Hugg

**Token-efficient by design.** Most agent frameworks inject every tool and instruction into every prompt. Zeph embeds skills and MCP tools as vectors, then selects only the top-K relevant ones per query via cosine similarity. Prompt size stays O(K) — not O(N) — regardless of how many capabilities are installed.

**Intelligent context management.** Two-tier context pruning: Tier 1 selectively removes old tool outputs (clearing bodies from memory after persisting to SQLite) before falling back to Tier 2 LLM-based compaction, reducing unnecessary LLM calls. A token-based protection zone preserves recent context from pruning. Cross-session memory transfers knowledge between conversations with relevance filtering. Proportional budget allocation (8% summaries, 8% semantic recall, 4% cross-session, 30% code context, 50% recent history) keeps conversations efficient. Tool outputs are truncated at 30K chars with optional LLM-based summarization for large outputs. ZEPH.md project config discovery walks up the directory tree and injects project-specific context when available. Config hot-reload applies runtime-safe fields (timeouts, security, memory limits) on file change without restart.
**Intelligent context management.** Two-tier context pruning: Tier 1 selectively removes old tool outputs (clearing bodies from memory after persisting to SQLite) before falling back to Tier 2 LLM-based compaction, reducing unnecessary LLM calls. A token-based protection zone preserves recent context from pruning. Cross-session memory transfers knowledge between conversations with relevance filtering. Proportional budget allocation (8% summaries, 8% semantic recall, 4% cross-session, 30% code context, 50% recent history) keeps conversations efficient. Tool outputs are truncated at 30K chars with optional LLM-based summarization for large outputs. Doom-loop detection breaks runaway tool cycles after 3 identical consecutive outputs, with configurable iteration limits (default 10). ZEPH.md project config discovery walks up the directory tree and injects project-specific context when available. Config hot-reload applies runtime-safe fields (timeouts, security, memory limits) on file change without restart.
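The proportional budget split quoted above can be sketched with integer arithmetic (struct and function names here are illustrative, not Zeph's actual API):

```rust
// Illustrative split of a context-token budget into the proportions named
// in the README: 8% summaries, 8% semantic recall, 4% cross-session,
// 30% code context, 50% recent history.
struct BudgetSplit {
    summaries: usize,
    semantic_recall: usize,
    cross_session: usize,
    code_context: usize,
    recent_history: usize,
}

fn split_budget(max_tokens: usize) -> BudgetSplit {
    BudgetSplit {
        summaries: max_tokens * 8 / 100,
        semantic_recall: max_tokens * 8 / 100,
        cross_session: max_tokens * 4 / 100,
        code_context: max_tokens * 30 / 100,
        recent_history: max_tokens * 50 / 100,
    }
}

fn main() {
    let b = split_budget(100_000);
    // The named percentages sum to 100, so nothing is left unallocated.
    assert_eq!(
        b.summaries + b.semantic_recall + b.cross_session + b.code_context + b.recent_history,
        100_000
    );
    assert_eq!(b.recent_history, 50_000);
    println!("ok");
}
```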

**Run anywhere.** Local models via Ollama or Candle (GGUF with Metal/CUDA), cloud APIs (Claude, OpenAI, GPT-compatible endpoints like Together AI and Groq), or all of them at once through the multi-model orchestrator with automatic fallback chains.

**Production-ready security.** Shell sandboxing with path restrictions, command filtering (12 blocked patterns), destructive command confirmation, secret redaction, audit logging, SSRF protection, and Trivy-scanned container images with 0 HIGH/CRITICAL CVEs.
**Production-ready security.** Shell sandboxing with path restrictions, command filtering (12 blocked patterns), destructive command confirmation, file operation sandbox with path traversal protection, secret redaction, audit logging, SSRF protection, and Trivy-scanned container images with 0 HIGH/CRITICAL CVEs.
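The path-traversal protection mentioned above is typically built on canonicalization; a minimal sketch of the idea (assumed shape — Zeph's actual file sandbox may differ):

```rust
use std::path::{Path, PathBuf};

// Resolve `requested` inside `root`, rejecting anything that escapes the
// sandbox. Canonicalization collapses `..` segments and symlinks before
// the prefix check, so traversal tricks cannot slip past it.
fn resolve_sandboxed(root: &Path, requested: &str) -> Option<PathBuf> {
    let canonical = root.join(requested).canonicalize().ok()?;
    let root = root.canonicalize().ok()?;
    canonical.starts_with(&root).then_some(canonical)
}

fn main() {
    let root = std::env::temp_dir().join("zeph_sandbox_demo");
    std::fs::create_dir_all(&root).unwrap();
    std::fs::write(root.join("ok.txt"), "hi").unwrap();
    assert!(resolve_sandboxed(&root, "ok.txt").is_some());
    // A traversal attempt resolves outside the root (or fails to
    // canonicalize) and is rejected either way.
    assert!(resolve_sandboxed(&root, "../no_such_escape_target").is_none());
    println!("ok");
}
```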

**Self-improving.** Skills evolve through failure detection, self-reflection, and LLM-generated improvements — with optional manual approval before activation.

@@ -99,7 +99,7 @@ cargo build --release --features tui
| **Self-Learning** | Skills evolve via failure detection and LLM-generated improvements | [Self-Learning](https://bug-ops.github.io/zeph/guide/self-learning.html) |
| **TUI Dashboard** | ratatui terminal UI with markdown rendering, deferred model warmup, scrollbar, mouse scroll, thinking blocks, conversation history, splash screen, live metrics, message queueing (max 10, FIFO with Ctrl+K clear) | [TUI](https://bug-ops.github.io/zeph/guide/tui.html) |
| **Multi-Channel I/O** | CLI, Telegram, and TUI with streaming support | [Channels](https://bug-ops.github.io/zeph/guide/channels.html) |
| **Defense-in-Depth** | Shell sandbox, command filter, secret redaction, audit log, SSRF protection | [Security](https://bug-ops.github.io/zeph/security.html) |
| **Defense-in-Depth** | Shell sandbox, file sandbox with path traversal protection, command filter, secret redaction, audit log, SSRF protection, doom-loop detection | [Security](https://bug-ops.github.io/zeph/security.html) |

## Architecture

@@ -111,7 +111,7 @@ zeph (binary)
├── zeph-memory — SQLite + Qdrant, semantic recall, summarization
├── zeph-index — AST-based code indexing, semantic retrieval, repo map (optional)
├── zeph-channels — Telegram adapter (teloxide) with streaming
├── zeph-tools — shell executor, web scraper, composite tool dispatch
├── zeph-tools — 7 built-in tools (bash, read, edit, write, glob, grep, web_scrape), tool registry, composite dispatch
├── zeph-mcp — MCP client, multi-server lifecycle, unified tool matching
├── zeph-a2a — A2A client + server, agent discovery, JSON-RPC 2.0
└── zeph-tui — ratatui TUI dashboard with live agent metrics (optional)
2 changes: 2 additions & 0 deletions config/default.toml
@@ -1,6 +1,8 @@
[agent]
# Agent display name
name = "Zeph"
# Maximum tool execution iterations per user message (doom-loop protection)
max_tool_iterations = 10

[llm]
# LLM provider: "ollama" for local models or "claude" for Claude API
105 changes: 98 additions & 7 deletions crates/zeph-core/src/agent.rs
@@ -26,8 +26,7 @@ use crate::context::{ContextBudget, EnvironmentContext, build_system_prompt};
use crate::redact::redact_secrets;
use zeph_memory::semantic::estimate_tokens;

// TODO(M14): Make configurable via AgentConfig (currently hardcoded for MVP)
const MAX_SHELL_ITERATIONS: usize = 3;
const DOOM_LOOP_WINDOW: usize = 3;
const MAX_QUEUE_SIZE: usize = 10;
const MESSAGE_MERGE_WINDOW: Duration = Duration::from_millis(500);
const RECALL_PREFIX: &str = "[semantic recall]\n";
@@ -100,6 +99,8 @@ pub struct Agent<P: LlmProvider + Clone + 'static, C: Channel, T: ToolExecutor>
#[cfg(feature = "index")]
repo_map_ttl: std::time::Duration,
warmup_ready: Option<watch::Receiver<bool>>,
max_tool_iterations: usize,
doom_loop_history: Vec<String>,
}

impl<P: LlmProvider + Clone + 'static, C: Channel, T: ToolExecutor> Agent<P, C, T> {
@@ -118,7 +119,7 @@ impl<P: LlmProvider + Clone + 'static, C: Channel, T: ToolExecutor> Agent<P, C,
.filter_map(|m| registry.get_skill(&m.name).ok())
.collect();
let skills_prompt = format_skills_prompt(&all_skills, std::env::consts::OS);
let system_prompt = build_system_prompt(&skills_prompt, None);
let system_prompt = build_system_prompt(&skills_prompt, None, None);
tracing::debug!(len = system_prompt.len(), "initial system prompt built");
tracing::trace!(prompt = %system_prompt, "full system prompt");

@@ -182,9 +183,17 @@
#[cfg(feature = "index")]
repo_map_ttl: std::time::Duration::from_secs(300),
warmup_ready: None,
max_tool_iterations: 10,
doom_loop_history: Vec::new(),
}
}

#[must_use]
pub fn with_max_tool_iterations(mut self, max: usize) -> Self {
self.max_tool_iterations = max;
self
}

#[must_use]
pub fn with_memory(
mut self,
@@ -1605,7 +1614,7 @@
.collect();
let skills_prompt = format_skills_prompt(&all_skills, std::env::consts::OS);
self.last_skills_prompt.clone_from(&skills_prompt);
let system_prompt = build_system_prompt(&skills_prompt, None);
let system_prompt = build_system_prompt(&skills_prompt, None, None);
if let Some(msg) = self.messages.first_mut() {
msg.content = system_prompt;
}
@@ -1653,6 +1662,7 @@
tracing::info!("config reloaded");
}

#[allow(clippy::too_many_lines)]
async fn rebuild_system_prompt(&mut self, query: &str) {
let all_meta = self.registry.all_meta();
let matched_indices: Vec<usize> = if let Some(matcher) = &self.matcher {
@@ -1710,8 +1720,18 @@
let catalog_prompt = format_skills_catalog(&remaining_skills);
self.last_skills_prompt.clone_from(&skills_prompt);
let env = EnvironmentContext::gather(&self.model_name);
let tool_catalog = {
let defs = self.tool_executor.tool_definitions();
if defs.is_empty() {
None
} else {
let reg = zeph_tools::ToolRegistry::new();
Some(reg.format_for_prompt())
}
};
#[allow(unused_mut)]
let mut system_prompt = build_system_prompt(&skills_prompt, Some(&env));
let mut system_prompt =
build_system_prompt(&skills_prompt, Some(&env), tool_catalog.as_deref());

if !catalog_prompt.is_empty() {
system_prompt.push_str("\n\n");
@@ -1832,9 +1852,33 @@
}

async fn process_response(&mut self) -> anyhow::Result<()> {
for _ in 0..MAX_SHELL_ITERATIONS {
self.doom_loop_history.clear();

for iteration in 0..self.max_tool_iterations {
self.channel.send_typing().await?;

// Context budget check at 80% threshold
if let Some(ref budget) = self.context_budget {
let used: usize = self
.messages
.iter()
.map(|m| estimate_tokens(&m.content))
.sum();
let threshold = budget.max_tokens() * 4 / 5;
if used >= threshold {
tracing::warn!(
iteration,
used,
threshold,
"stopping tool loop: context budget nearing limit"
);
self.channel
.send("Stopping: context window is nearly full.")
.await?;
break;
}
}

let Some(response) = self.call_llm_with_timeout().await? else {
return Ok(());
};
@@ -1869,6 +1913,25 @@
if !self.handle_tool_result(&response, result).await? {
return Ok(());
}

// Doom-loop detection: compare last N outputs by string equality
if let Some(last_msg) = self.messages.last() {
self.doom_loop_history.push(last_msg.content.clone());
if self.doom_loop_history.len() >= DOOM_LOOP_WINDOW {
let recent =
&self.doom_loop_history[self.doom_loop_history.len() - DOOM_LOOP_WINDOW..];
if recent.windows(2).all(|w| w[0] == w[1]) {
tracing::warn!(
iteration,
"doom-loop detected: {DOOM_LOOP_WINDOW} consecutive identical outputs"
);
self.channel
.send("Stopping: detected repeated identical tool outputs.")
.await?;
break;
}
}
}
}

Ok(())
@@ -3382,7 +3445,7 @@ mod agent_tests {
.iter()
.filter(|m| m.role == Role::Assistant)
.count();
assert!(assistant_count <= MAX_SHELL_ITERATIONS);
assert!(assistant_count <= 10);
}

#[test]
@@ -4560,4 +4623,32 @@
assert_eq!(filtered[0].summary_text, "high score");
assert_eq!(filtered[1].summary_text, "at threshold");
}

#[test]
fn doom_loop_detection_triggers_on_identical_outputs() {
let s = "same output".to_owned();
let history = vec![s.clone(), s.clone(), s];
let recent = &history[history.len() - DOOM_LOOP_WINDOW..];
assert!(recent.windows(2).all(|w| w[0] == w[1]));
}

#[test]
fn doom_loop_detection_no_trigger_on_different_outputs() {
let history = vec![
"output a".to_owned(),
"output b".to_owned(),
"output c".to_owned(),
];
let recent = &history[history.len() - DOOM_LOOP_WINDOW..];
assert!(!recent.windows(2).all(|w| w[0] == w[1]));
}

#[test]
fn context_budget_80_percent_threshold() {
let budget = ContextBudget::new(1000, 0.20);
let threshold = budget.max_tokens() * 4 / 5;
assert_eq!(threshold, 800);
assert!(800 >= threshold); // at threshold → should stop
assert!(799 < threshold); // below threshold → should continue
}
}
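The `with_max_tool_iterations` setter added in this diff follows the consuming-builder pattern (`#[must_use]`, takes and returns `self`); a self-contained mirror of that pattern — `MiniAgent` is illustrative only, not Zeph's actual type:

```rust
// Stand-in for the Agent type, showing the builder-style override of the
// tool-iteration limit (default 10, matching config/default.toml).
struct MiniAgent {
    max_tool_iterations: usize,
}

impl MiniAgent {
    fn new() -> Self {
        Self {
            max_tool_iterations: 10,
        }
    }

    // Consuming setter: takes ownership, mutates, returns self for chaining.
    #[must_use]
    fn with_max_tool_iterations(mut self, max: usize) -> Self {
        self.max_tool_iterations = max;
        self
    }
}

fn main() {
    let agent = MiniAgent::new().with_max_tool_iterations(25);
    assert_eq!(agent.max_tool_iterations, 25);
    assert_eq!(MiniAgent::new().max_tool_iterations, 10);
    println!("ok");
}
```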
7 changes: 7 additions & 0 deletions crates/zeph-core/src/config.rs
@@ -32,9 +32,15 @@ pub struct Config {
pub secrets: ResolvedSecrets,
}

fn default_max_tool_iterations() -> usize {
10
}

#[derive(Debug, Deserialize)]
pub struct AgentConfig {
pub name: String,
#[serde(default = "default_max_tool_iterations")]
pub max_tool_iterations: usize,
}

#[derive(Debug, Deserialize)]
@@ -864,6 +870,7 @@ impl Config {
Self {
agent: AgentConfig {
name: "Zeph".into(),
max_tool_iterations: 10,
},
llm: LlmConfig {
provider: "ollama".into(),
23 changes: 17 additions & 6 deletions crates/zeph-core/src/context.rs
@@ -37,14 +37,25 @@ the user explicitly asks about a skill by name.\n\
- Do not execute commands that could cause data loss without confirmation.";

#[must_use]
pub fn build_system_prompt(skills_prompt: &str, env: Option<&EnvironmentContext>) -> String {
pub fn build_system_prompt(
skills_prompt: &str,
env: Option<&EnvironmentContext>,
tool_catalog: Option<&str>,
) -> String {
let mut prompt = BASE_PROMPT.to_string();

if let Some(env) = env {
prompt.push_str("\n\n");
prompt.push_str(&env.format());
}

if let Some(catalog) = tool_catalog
&& !catalog.is_empty()
{
prompt.push_str("\n\n");
prompt.push_str(catalog);
}

if !skills_prompt.is_empty() {
prompt.push_str("\n\n");
prompt.push_str(skills_prompt);
@@ -187,14 +198,14 @@ mod tests {

#[test]
fn without_skills() {
let prompt = build_system_prompt("", None);
let prompt = build_system_prompt("", None, None);
assert!(prompt.starts_with("You are Zeph"));
assert!(!prompt.contains("available_skills"));
}

#[test]
fn with_skills() {
let prompt = build_system_prompt("<available_skills>test</available_skills>", None);
let prompt = build_system_prompt("<available_skills>test</available_skills>", None, None);
assert!(prompt.contains("You are Zeph"));
assert!(prompt.contains("<available_skills>"));
}
@@ -308,23 +319,23 @@ mod tests {
os: "linux".into(),
model_name: "test".into(),
};
let prompt = build_system_prompt("skills here", Some(&env));
let prompt = build_system_prompt("skills here", Some(&env), None);
assert!(prompt.contains("You are Zeph"));
assert!(prompt.contains("<environment>"));
assert!(prompt.contains("skills here"));
}

#[test]
fn build_system_prompt_without_env() {
let prompt = build_system_prompt("skills here", None);
let prompt = build_system_prompt("skills here", None, None);
assert!(prompt.contains("You are Zeph"));
assert!(!prompt.contains("<environment>"));
assert!(prompt.contains("skills here"));
}

#[test]
fn base_prompt_contains_guidelines() {
let prompt = build_system_prompt("", None);
let prompt = build_system_prompt("", None, None);
assert!(prompt.contains("## Tool Use"));
assert!(prompt.contains("## Guidelines"));
assert!(prompt.contains("## Security"));
2 changes: 2 additions & 0 deletions crates/zeph-tools/Cargo.toml
@@ -7,6 +7,8 @@ license.workspace = true
repository.workspace = true

[dependencies]
glob.workspace = true
regex.workspace = true
reqwest = { workspace = true, features = ["rustls"] }
scrape-core.workspace = true
serde = { workspace = true, features = ["derive"] }