Summary
The session watchdog reminder logic causes excessive input token consumption by repeatedly wrapping queued user messages in <system-reminder> tags and inlining the full user text on every loop iteration.
When a user sends multiple messages while the agent is busy, the same user content is duplicated into the model input repeatedly, leading to token amplification and rapid context window exhaustion.
Impact
- Input token usage grows rapidly during long-running sessions.
- Context window is exhausted much sooner than expected.
- Inference costs increase significantly.
- Agent behavior can degrade due to repeated context bloat.
Root cause
In the session loop (watchdog/interrupt handling path), queued user messages (messages sent after lastFinished) are ephemerally rewritten:
- The user text is wrapped in <system-reminder> XML tags.
- The full user text is duplicated into the reminder payload.
- This happens repeatedly on each loop iteration when step > 1.
Because the full text is re-sent on every iteration, the extra input tokens grow as O(steps × message_length) for each queued message in busy-loop scenarios.
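The growth described above can be illustrated with a small model. This is a hypothetical sketch, not the actual session code: `wrapAsReminder`, `amplifiedChars`, and the wording inside the reminder are assumptions made for illustration, and character counts stand in for tokens.

```typescript
// Hypothetical model of the rewrite described in this issue.
// Names and reminder wording are illustrative, not taken from the source.
function wrapAsReminder(userText: string): string {
  return [
    "<system-reminder>",
    "The user sent the following message:",
    userText, // the full user text is inlined into the reminder
    "</system-reminder>",
  ].join("\n");
}

// Total characters of queued-message content sent to the model across
// all steps, assuming every queued message is re-wrapped and re-sent on
// each iteration where step > 1.
function amplifiedChars(queued: string[], steps: number): number {
  let total = 0;
  for (let step = 1; step <= steps; step++) {
    for (const text of queued) {
      total += step > 1 ? wrapAsReminder(text).length : text.length;
    }
  }
  return total;
}
```

Under these assumptions, each additional step adds the full wrapped length of every queued message, so the total is linear in both the step count and the message length.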
Reproduction steps
- Start a session with any agent that performs a long-running task (tool execution, multi-step reasoning, etc.).
- While the agent is busy, send multiple user messages in short intervals.
- Observe that each loop iteration rewrites queued user messages with <system-reminder> and repeats the user’s full text.
- Monitor input token count / context size: it grows much faster than the actual conversation content.
Expected behavior
- User messages should remain intact (no rewriting/expanding user text).
- The system should remind the agent about queued messages without duplicating user content.
- Reminders should be deduplicated/throttled to avoid repeated injection during busy loops.
Suggested fix
- Stop modifying user message parts directly.
- Inject a concise system reminder (e.g., “There are X queued user messages…”) without inlining user text.
- Add deduplication and throttling (e.g., exponential backoff) to prevent reminder spam.
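One way the suggested fix could look is sketched below. It is a minimal illustration under stated assumptions, not the implementation from the related PR: `ReminderState`, `createReminderState`, `nextReminder`, the `maxInterval` default, and the reminder wording are all hypothetical. The reminder carries only a count, and repeated reminders for an unchanged queue are spaced out with exponential backoff measured in loop steps.

```typescript
// Hypothetical throttling state; all names here are illustrative.
interface ReminderState {
  lastCount: number; // queued-message count at the last emitted reminder
  nextStep: number;  // earliest step at which an unchanged queue is reminded again
  interval: number;  // current backoff interval, in loop steps
}

function createReminderState(): ReminderState {
  return { lastCount: 0, nextStep: 0, interval: 1 };
}

// Returns a concise reminder string, or null when the reminder is suppressed.
// The user's text is never inlined; only the number of queued messages is.
function nextReminder(
  state: ReminderState,
  queuedCount: number,
  step: number,
  maxInterval = 16, // assumed cap on the backoff interval
): string | null {
  if (queuedCount === 0) {
    // Nothing is queued: reset so the next queued message is announced promptly.
    state.lastCount = 0;
    state.nextStep = 0;
    state.interval = 1;
    return null;
  }
  const changed = queuedCount !== state.lastCount;
  // Deduplicate: same queue and still inside the backoff window.
  if (!changed && step < state.nextStep) return null;
  // A changed queue resets the backoff; an unchanged one doubles it.
  state.interval = changed ? 1 : Math.min(state.interval * 2, maxInterval);
  state.lastCount = queuedCount;
  state.nextStep = step + state.interval;
  return `<system-reminder>There are ${queuedCount} queued user message(s) awaiting a response.</system-reminder>`;
}
```

In this sketch the reminder size does not depend on message length, and an unchanged queue produces reminders at increasingly wide step intervals rather than on every iteration. The original user messages are left untouched.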
Related PR
PR: fix(session): optimize system reminder to reduce token usage (#11136)
Plugins
No response
OpenCode version
No response
Steps to reproduce
No response
Screenshot and/or share link
No response
Operating System
No response
Terminal
No response