fix: handle truncated tool calls that break conversation alternation#7424

Merged
tlongwell-block merged 1 commit into main from fix/trailing-assistant-root-cause
Feb 23, 2026
Conversation

@tlongwell-block (Collaborator) commented Feb 22, 2026

Summary

When a model response hits max_tokens mid-tool-call, the streaming handler creates a tool_request with a parse error (truncated JSON). The agent loop counts it as a tool request but silently skips it — no tool request message, no tool response added to the conversation. The text preamble from the truncated response becomes a dangling trailing assistant message. The loop continues, and on the next iteration the conversation is sent to the API ending with an assistant message, which crashes on providers that reject prefill (Claude 4.6+, Snowflake, others).

This was latent — older models silently tolerated trailing assistant messages via prefill support. Claude Opus 4.6 (Feb 5) removed prefill as a documented breaking change, exposing it. Multiple users have reported the resulting "does not support assistant message prefill" errors since early February.

The fix

When a tool call can't be parsed, append a plain user text message telling the model the call was likely truncated and to try breaking the task into smaller steps. This maintains conversation alternation without using tool protocol types that serialize inconsistently across providers.
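The approach can be sketched with simplified stand-in types (the `Message`, `Role`, and `handle_tool_call` below are illustrative only; goose's real agent-loop and conversation types differ):

```rust
// Minimal sketch of the fix, using simplified stand-in types.
#[derive(Debug, PartialEq)]
enum Role { User, Assistant }

#[derive(Debug)]
struct Message { role: Role, text: String }

// When a tool call fails to parse (e.g. truncated JSON from hitting
// max_tokens), push a plain user text message so the conversation keeps
// alternating assistant -> user instead of ending on a dangling
// assistant message.
fn handle_tool_call(messages: &mut Vec<Message>, tool_call: Result<String, String>) {
    match tool_call {
        Ok(args) => {
            // Normal path: record the tool request and its response.
            messages.push(Message { role: Role::Assistant, text: format!("tool request: {args}") });
            messages.push(Message { role: Role::User, text: "tool response".into() });
        }
        Err(e) => {
            messages.push(Message {
                role: Role::User,
                text: format!(
                    "[system: Tool call could not be parsed: {e}. The response may have \
                     been truncated. Try breaking the task into smaller steps.]"
                ),
            });
        }
    }
}

fn main() {
    let mut messages = vec![Message { role: Role::Assistant, text: "preamble".into() }];
    handle_tool_call(&mut messages, Err("EOF while parsing an object".into()));
    // The conversation now ends with a user message, so the next API call
    // does not rely on assistant-prefill support.
    assert_eq!(messages.last().unwrap().role, Role::User);
}
```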

Type of Change

  • Bug fix

Testing

All existing tests pass. The bug requires the full agent streaming loop to reproduce (provider stream → truncated tool call → silent skip → trailing assistant), so a unit test is not practical without significant test infrastructure investment.

@tlongwell-block force-pushed the fix/trailing-assistant-root-cause branch from 1d8ab66 to b452098 on February 23, 2026 00:09
@tlongwell-block tlongwell-block changed the title fix: clear stale messages after retry resets conversation fix: handle truncated tool calls that break conversation alternation Feb 23, 2026
@tlongwell-block tlongwell-block marked this pull request as ready for review February 23, 2026 13:49
@tlongwell-block (Collaborator, Author)

Testing with my local goose

  ━━╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌ 9% 18k/200k
🪿 Now I want you to help me test my fix. YOU are built with this change included in YOUR code. Do you understand that? I want you to acknowledge that, then I want you to attempt to directly write a giant file to ~/scratch. Ideally, you should see that error from the PR and it should tell you to try doing it in multiple tool calls. Don't do that, just let me know what error you get and if the PR fix is working as intended.



Yes, I understand — I'm running on a build of goose that includes PR #7424. So if my response gets truncated mid-tool-call due to hitting `max_tokens`, I should see that new error message telling me the tool call couldn't be parsed and to break the task into smaller steps.

Let me try to write a massive file in a single tool call. I'll generate something absurdly large to try to trigger the truncation.

> `-32602: Could not interpret tool use parameters for id toolu_018gedtVovzfTUzZo4KThmnm: EOF while parsing an object at line 1 column 75`
> `[system: Tool call could not be parsed: -32602: Could not interpret tool use parameters for id toolu_018gedtVovzfTUzZo4KThmnm: EOF while parsing an object at line 1 column 75. The response may have been truncated. Try breaking the task into smaller steps.]`

There it is! The PR fix is working exactly as intended. Here's what happened:

1. **I attempted** to write a massive file in a single tool call
2. **My response hit `max_tokens`** mid-tool-call, truncating the JSON payload
3. **Before the fix**: This would have been silently skipped, leaving a dangling assistant message, and the next API call would crash with `"does not support assistant message prefill"`
4. **With the fix**: I got the new error message:

> `[system: Tool call could not be parsed: -32602: Could not interpret tool use parameters for id toolu_018gedtVovzfTUzZo4KThmnm: EOF while parsing an object at line 1 column 75. The response may have been truncated. Try breaking the task into smaller steps.]`

Key observations:
- **Conversation alternation is maintained** — the error was injected as a user message, so the conversation stays valid (assistant → user)
- **I didn't crash** — the agent loop continued normally and I'm able to respond to you right now
- **The error is actionable** — it tells me the parse failed, why (truncated JSON / EOF), and what to do about it (break into smaller steps)
- **The error code `-32602`** (invalid params) confirms it was indeed truncated JSON that couldn't be deserialized

@tlongwell-block (Collaborator, Author)

/goose

@github-actions (Contributor)

PR #7424 Review: fix: handle truncated tool calls that break conversation alternation

Summary: This PR correctly fixes a conversation alternation issue where truncated tool calls (from max_tokens limits) create trailing assistant messages that crash newer providers like Claude 4.6. The fix adds a user message when tool parsing fails, maintaining proper alternation. The approach is sound and minimal.

✅ Highlights

  1. Correct Root Cause Fix: The fix addresses the actual problem - when tool_call.is_err(), the original filtered_response is still yielded (line 1202), creating an assistant message. Previously nothing was added for error cases, leaving the conversation with a trailing assistant message. Now a user message is added to maintain alternation.

  2. Works with Existing Infrastructure: The merge_consecutive_messages function (conversation/mod.rs:401) handles the case where multiple error tool calls could create consecutive user messages - they get merged automatically.

  3. Informative Error Message: The error includes both the original error detail and actionable guidance ("Try breaking the task into smaller steps").

🟢 Suggestions

  1. Consider Using with_system_notification (line 1447-1455):

    The nearby ContextLengthExceeded handling (line 1468-1472) uses with_system_notification(SystemNotificationType::InlineMessage, ...) for similar error messaging. However, that sets agent_visible: false, which would defeat the purpose here since the LLM needs to see this error to adjust. The current approach using Message::user().with_text(...) is actually correct because the LLM needs visibility into the error.

    That said, the "[system: ...]" prefix is a new pattern. Consider if there's a more structured way to indicate this is a system-generated message, though the current approach works fine.

  2. Minor: Error Message Clarity:

    The error includes e (the ErrorData) which may contain technical details. The user-facing suggestion is good. Could consider:

    format!(
        "[system: The model's response was truncated while generating a tool call. Try breaking the task into smaller steps.]\n\nTechnical details: {}",
        e
    )

    This separates the actionable advice from the technical details. This is very minor and the current format is acceptable.

Verification Notes

Traced the data flow:

  • Line 1202: filtered_response (with error tool request) is YIELDED → goes to UI/server's all_messages
  • Line 1206-1213: Response is NOT added to messages_to_add when num_tool_requests > 0
  • Line 1422-1441: For OK requests, creates assistant+user message pairs
  • Line 1447-1455 (NEW): For ERR requests, creates user error message

Mixed OK/ERR scenario verified: If there are 3 tool requests (OK, ERR, OK):

  1. OK → assistant msg + user tool response
  2. ERR → user error message
  3. OK → assistant msg + user tool response

The consecutive user messages (from #1 response + #2 error) are handled by merge_consecutive_messages.
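The merging step in this scenario can be illustrated with a toy version (the simplified `Msg` type and this implementation are stand-ins; the real `merge_consecutive_messages` in conversation/mod.rs operates on goose's richer Message type):

```rust
// Toy sketch of consecutive-message merging.
#[derive(Debug, Clone, PartialEq)]
struct Msg { role: &'static str, text: String }

fn merge_consecutive_messages(messages: Vec<Msg>) -> Vec<Msg> {
    let mut merged: Vec<Msg> = Vec::new();
    for msg in messages {
        // Same role as the previous message: fold the text in so the
        // conversation keeps strict user/assistant alternation.
        let same_role = merged.last().map_or(false, |prev| prev.role == msg.role);
        if same_role {
            let prev = merged.last_mut().unwrap();
            prev.text.push('\n');
            prev.text.push_str(&msg.text);
        } else {
            merged.push(msg);
        }
    }
    merged
}

fn main() {
    // A tool response followed by a parse-error notice yields two user
    // messages in a row; they merge into one.
    let msgs = vec![
        Msg { role: "assistant", text: "tool request".into() },
        Msg { role: "user", text: "tool response".into() },
        Msg { role: "user", text: "[system: Tool call could not be parsed ...]".into() },
    ];
    let merged = merge_consecutive_messages(msgs);
    assert_eq!(merged.len(), 2);
    assert_eq!(merged[1].role, "user");
}
```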


Review generated by goose

@tlongwell-block tlongwell-block added this pull request to the merge queue Feb 23, 2026
@DOsinga (Collaborator) commented Feb 23, 2026

this is good. we could recover from this though if we would do the pre-fill thing, yeah? or do we actually have that, it is just that using anthropic through databricks kills this?

@tlongwell-block (Collaborator, Author)

> this is good. we could recover from this though if we would do the pre-fill thing, yeah? or do we actually have that, it is just that using anthropic through databricks kills this?

The agent should recover automatically after getting the error message and continue working with smaller tool calls without user intervention

Merged via the queue into main with commit 33af644 Feb 23, 2026
35 of 41 checks passed
@tlongwell-block tlongwell-block deleted the fix/trailing-assistant-root-cause branch February 23, 2026 14:46
@DOsinga (Collaborator) commented Feb 23, 2026

> The agent should recover automatically after getting the error message and continue working with smaller tool calls without user intervention

no, I understand. it is just that if I remember correctly, anthropic has this thing where you can start a request with the assistant pre-filled, and so if you run out of output tokens, you can repost what you had so far as prefill and create arbitrarily large replies. I wonder if that is already supported, just not through databricks
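The prefill-continuation idea described above can be sketched roughly as follows (a toy loop with a stub standing in for a real provider call; the actual Anthropic API, its prefill semantics, and `stop_reason` handling differ):

```rust
// Hedged sketch: resume a truncated reply by resending what we have so
// far as assistant prefill, looping until the reply completes.
struct Reply { text: String, hit_max_tokens: bool }

// Stub standing in for a real provider call; returns at most `limit`
// characters of the remaining content per call (ASCII-only for the demo).
fn call_model(prefill: &str, full_answer: &str, limit: usize) -> Reply {
    let remaining = &full_answer[prefill.len()..];
    let take = remaining.len().min(limit);
    Reply {
        text: remaining[..take].to_string(),
        hit_max_tokens: take < remaining.len(),
    }
}

fn generate_with_prefill(full_answer: &str, limit: usize) -> String {
    let mut accumulated = String::new();
    loop {
        let reply = call_model(&accumulated, full_answer, limit);
        accumulated.push_str(&reply.text);
        if !reply.hit_max_tokens {
            return accumulated; // finished within the token budget
        }
        // Otherwise loop: resend everything so far as assistant prefill.
    }
}

fn main() {
    let answer = "a".repeat(10);
    // A per-call limit of 3 forces several continuation rounds.
    assert_eq!(generate_with_prefill(&answer, 3), answer);
}
```

A provider that rejects assistant prefill (as described for Claude 4.6+ above) cannot use this recovery path, which is why the PR falls back to a user-visible error message instead.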

zanesq added a commit that referenced this pull request Feb 23, 2026
…oviders

* 'main' of github.com:block/goose:
  New navigation settings layout options and styling (#6645)
  refactor: MCP-compliant theme tokens and CSS class rename (#7275)
  Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434)
  refactor: change open recipe in new window to pass recipe id (#7392)
  fix: handle truncated tool calls that break conversation alternation (#7424)
  streamline some github actions (#7430)
  Enable bedrock prompt cache (#6710)
  fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429)
  Display working dir (#7419)
  dev: add cmake to hermitized env (#7399)
  refactor: remove allows_unlisted_models flag, always allow custom model entry (#7255)
  feat: expose context window utilization to agent via MOIM (#7418)
  Small model naming (#7394)
  chore(deps): bump ajv in /documentation (#7416)
  doc: groq models (#7404)
  Client settings (#7381)
  Fix settings tabs getting cut off in narrow windows (#7379)

# Conflicts:
#	ui/desktop/src/components/settings/dictation/DictationSettings.tsx
michaelneale added a commit that referenced this pull request Feb 23, 2026
lifeizhou-ap added a commit that referenced this pull request Feb 24, 2026
lifeizhou-ap added a commit that referenced this pull request Feb 24, 2026
aharvard added a commit that referenced this pull request Feb 24, 2026