Recursive Logic Failure: Model-Induced Data Truncation via replace Tool Heuristics #18917

@schygge

Description

What happened?

During modification of a large (>500 lines), highly repetitive project ledger, the agent's use of the replace tool
resulted in catastrophic data truncation. The model provided an old_string anchor that was unique within its immediate
reasoning context but physically ambiguous within the file's global scope. When the tool failed to find an exact match (due to minor
whitespace or neighbor-line discrepancies), the model entered a "Survival Loop": it prioritized satisfying the tool's
parameters (by supplying shorter, increasingly imprecise strings) over preserving the surrounding data. This resulted in
the unintentional deletion of several hundred lines of historical context.
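The failure mode can be reproduced in miniature. The sketch below is illustrative only (it is not the Gemini CLI implementation): in a repetitive file, shortening the anchor to force a match can bind to the wrong occurrence.

```python
# Illustrative only: a naive replace in a repetitive ledger. When the
# anchor "drifts" to an ambiguous substring, the replace binds to the
# FIRST occurrence, not the one the model intended.
ledger = (
    "## Entry 1\nstatus: done\n\n"
    "## Entry 2\nstatus: done\n\n"
    "## Entry 3\nstatus: pending\n"
)

precise_anchor = "## Entry 3\nstatus: pending"  # unique in the file: safe
drifted_anchor = "status: done"                 # ambiguous: occurs twice

result = ledger.replace(drifted_anchor, "status: archived", 1)
# The wrong entry was modified, with no error raised.
assert "## Entry 1\nstatus: archived" in result
assert "## Entry 3\nstatus: pending" in result
```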

gemini-conversation-1770912120189.json

What did you expect to happen?

I expected the agent to recognize the replace tool's fragility in large, repetitive files and pivot to a safe read_file +
write_file strategy. I also expected the tool to error out if the matching string was ambiguous, rather than allowing a partial
match to destroy context.

I expected the replace tool to either:

  1. Emit a "Fuzzy Match" warning when the provided string was nearly, but not exactly, identical to file content.
  2. Emit a "Multiple Matches" error to prevent ambiguous overwrites.

I also expected the model's heuristics to recognize the file size as a "High-Risk Buffer" and pivot automatically to a
read_file + write_file (Full Rewrite) strategy instead of surgical string replacement.
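Both tool-side guards above can be sketched in a few lines. The function below is a hypothetical guard layer, not an existing Gemini CLI API: it refuses ambiguous replaces outright and, on a miss, reports the nearest fuzzy candidate instead of failing silently.

```python
import difflib

def safe_replace(text: str, old: str, new: str) -> str:
    """Hypothetical guarded replace: error on ambiguity, hint on near-miss."""
    count = text.count(old)
    if count > 1:
        # "Multiple Matches" error: never guess among ambiguous anchors.
        raise ValueError(f"Multiple Matches: anchor occurs {count} times")
    if count == 0:
        # "Fuzzy Match" path: surface the closest line-window as a hint
        # rather than letting the caller degrade its anchor blindly.
        lines = text.splitlines()
        window = max(len(old.splitlines()), 1)
        candidates = [
            "\n".join(lines[i:i + window])
            for i in range(len(lines) - window + 1)
        ]
        close = difflib.get_close_matches(old, candidates, n=1, cutoff=0.8)
        hint = f"; nearest candidate:\n{close[0]}" if close else ""
        raise ValueError("No exact match" + hint)
    return text.replace(old, new, 1)
```

With this shape, the "Survival Loop" described above becomes impossible: shortening the anchor only makes it more likely to trip the Multiple Matches error.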

Technical Abstraction:

  • The "Anchor-Drift" Problem: In long files with repetitive headers (e.g., Markdown logs), the model's internal representation of
    the "unique anchor" drifts from the physical reality of the file. The current replace tool lacks a "Global Uniqueness" check
    before execution.
  • Model Regression under Tool Failure: When a model faces repeated "replacement failed" errors, its logic tends to regress toward
    "Over-Simplification": it strips context from the old_string to force a match, which, if successful, can overwrite
    large, unrelated blocks of text.
  • Environmental Blindness: The model lacks a "Pre-Execution File Audit" (e.g., checking line counts before and after a replace) to
    detect and self-correct silent truncations.
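The "Pre-Execution File Audit" idea in the last bullet can be sketched as a simple invariant check. The helper below is hypothetical (its name and signature are assumptions, not part of any existing tool): it compares line counts before and after an edit and rejects results that shrink the file more than the edit itself can explain.

```python
def audited_edit(original: str, edited: str,
                 expected_line_delta: int, slack: int = 3) -> str:
    """Hypothetical post-edit audit: reject suspiciously large shrinkage.

    expected_line_delta: net line change the edit should produce;
    slack: tolerance for incidental whitespace differences.
    """
    actual = edited.count("\n") - original.count("\n")
    if actual < expected_line_delta - slack:
        raise RuntimeError(
            f"Possible silent truncation: line delta {actual}, "
            f"expected ~{expected_line_delta}"
        )
    return edited
```

A replace that swaps one line for another has an expected delta of zero; a result that drops hundreds of lines would fail the audit and could trigger a rollback instead of a silent write.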

Client information

  • CLI Version: 0.28.2
  • Git Commit: da5e47a
  • Session ID: f2ee0383-07cb-4e50-9780-d8f291998a50
  • Operating System: win32 v24.11.0
  • Sandbox Environment: no sandbox
  • Model Version: auto-gemini-3
  • Auth Type: oauth-personal
  • Memory Usage: 1.24 GB
  • Terminal Name: Unknown
  • Terminal Background: #0c0c0c
  • Kitty Keyboard Protocol: Unsupported

Login information

No response

Anything else we need to know?

This represents a fundamental failure of "Deterministic Inspection." The model's internal representation of the file drifted from
the physical reality, and its error-correction heuristic regressed into "Over-Simplification," which prioritized matching at any
cost over data safety. This is a system-level flaw in how LLMs interact with surgical file-editing tools in large-context
environments.

Metadata

Labels

area/agent — Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
