fix: handle truncated tool calls that break conversation alternation #7424
tlongwell-block merged 1 commit into main
Conversation
Force-pushed from b3821bc to 1d8ab66, then from 1d8ab66 to b452098.
Testing with my local goose

/goose
PR #7424 Review: fix: handle truncated tool calls that break conversation alternation

Summary: This PR correctly fixes a conversation alternation issue where truncated tool calls (from max_tokens limits) create trailing assistant messages that crash newer providers like Claude 4.6. The fix adds a user message when tool parsing fails, maintaining proper alternation. The approach is sound and minimal.

✅ Highlights

🟢 Suggestions

Verification Notes

Traced the data flow.

Mixed OK/ERR scenario verified: with 3 tool requests (OK, ERR, OK), the consecutive user messages (from the #1 response + the #2 error) are handled.

Review generated by goose
this is good. we could recover from this though if we would do the pre-fill thing, yeah? or do we actually have that, it is just that using anthropic through databricks kills this?
The agent should recover automatically after getting the error message and continue working with smaller tool calls, without user intervention.
no, I understand. it is just that if I remember correctly, anthropic has this thing where you can start a request with the assistant pre-filled, and so if you run out of output tokens, you can repost what you had so far as prefill and create arbitrarily large replies. I wonder if that is already supported, just not through databricks
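For reference — and as an assumption about the provider API shape rather than anything in this PR — Anthropic-style providers that support prefill let the `messages` array end with an assistant turn, which the model then continues rather than restarts. A reply truncated by `max_tokens` could in principle be re-posted like this (model name and content are illustrative):

```json
{
  "model": "claude-example",
  "max_tokens": 1024,
  "messages": [
    { "role": "user", "content": "Write a long report on X." },
    { "role": "assistant", "content": "Here is the report so far, ending mid-sent" }
  ]
}
```

Providers that removed prefill support reject requests shaped like this, which is the same failure mode this PR works around; whether the path survives going through Databricks is the open question in the comment above.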
Summary
When a model response hits `max_tokens` mid-tool-call, the streaming handler creates a `tool_request` with a parse error (truncated JSON). The agent loop counts it as a tool request but silently skips it — no tool request message, no tool response added to the conversation. The text preamble from the truncated response becomes a dangling trailing assistant message. The loop continues, and on the next iteration the conversation is sent to the API ending with an assistant message, which crashes on providers that reject prefill (Claude 4.6+, Snowflake, others).

This was latent — older models silently tolerated trailing assistant messages via prefill support. Claude Opus 4.6 (Feb 5) removed prefill as a documented breaking change, exposing it. Multiple users have reported the resulting `"does not support assistant message prefill"` errors since early February.

The fix
When a tool call can't be parsed, append a plain user text message telling the model the call was likely truncated and to try breaking the task into smaller steps. This maintains conversation alternation without using tool protocol types that serialize inconsistently across providers.
Type of Change
Testing
All existing tests pass. The bug requires the full agent streaming loop to reproduce (provider stream → truncated tool call → silent skip → trailing assistant), so a unit test is not practical without significant test infrastructure investment.