Filter preserved user messages to be text only.#5391
Conversation
crates/goose/src/context_mgmt/mod.rs
Outdated
| .iter() | ||
| .rev() | ||
| .find(|msg| matches!(msg.role, rmcp::model::Role::User)) | ||
| .find(|msg| matches!(msg.role, rmcp::model::Role::User) && has_text_content(msg)) |
There was a problem hiding this comment.
we should check that the message does not contain a tool result, not that it contains text (I'm not sure how likely in practice, but you could have both)
There was a problem hiding this comment.
Makes sense. Maybe check for both, or something very specific? contains exactly 1 text and no tool result?
There was a problem hiding this comment.
the most important thing is that we don't keep the tool result around without a tool request, because that will trigger 400s from providers. So, yeah, anything that includes that check is probably sufficient for a quick fix
There was a problem hiding this comment.
I would say, don't preserve messages at all. extract the text you want to keep and construct a new message. that way we avoid all the weirdness
crates/goose/src/context_mgmt/mod.rs
Outdated
| let messages = conversation.messages(); | ||
|
|
||
| // Check if the most recent message is a user message | ||
| // Helper to check if a message has text content ONLY (no tool requests/responses) |
crates/goose/src/context_mgmt/mod.rs
Outdated
| } | ||
| } | ||
|
|
||
| #[cfg(test)] |
There was a problem hiding this comment.
These tests read like LLM generated tests and verify your implementation, not the actual breakage. Can we replace this with what Jack had? What we want to test here is a series of scenarios where we call compaction and then verify that after compaction the assistant visible messages form a conversation that is valid according to the conversation fixer. And those scenarios should reflect the actual breakagaes that we've seen over the last few months and be extendable any time we see a new breakage.
| .find(|msg| matches!(msg.role, rmcp::model::Role::User)) | ||
| .cloned(); | ||
| (messages.as_slice(), most_recent_user_message) | ||
| .find(|msg| matches!(msg.role, rmcp::model::Role::User) && has_text_only(msg)) |
There was a problem hiding this comment.
is this definitely what we want though? that user message might be pretty far back in a tool calling loop...
for a patch, should we maybe just include the last user message iff it is a text, and do nothing if it isn't? that feels more conservative, and we can explore other ways to improve the quality in main
There was a problem hiding this comment.
I think that might have been better, but I'd rather get something out sooner than later. we can then make a plan of what we really want
There was a problem hiding this comment.
merging as is since this mirrors the intended previous behavior, but lets keep the discussion open as to what exactly we want.
The other thing we might need to worry about if we continue the loop is with smaller models it could be possible to get stuck in an infinite loop.
* main: Auto-compact Threshold UI improvements (#5354) Filter preserved user messages to be text only. (#5391) include sessionId in tool request (#5394) feat: add PR Impact Analyzer prompt (#5375) docs: add blog post on configuring goose for team environments (#5380) migrating back with new chatrecall non underscore name (#5223) fixing typo in blog metadata (#5382) feat: add new blog entry on adopting Goose in the enterprise (#5381) Blog/acp intro oct 2024 (#5379) feat: add A/B test framework generator recipe (#5378) Doc: (blog) - Deep Dive into goose's Extension System and Model Context Protocol (MCP) (#5291) Some system prompt tidying (#5313) Fix scheduler jobs dates formatting (#5368) Use Instructions as Prompt in Scheduler (#5359) feat(snowflake): add support for newer Claude 4.5 and 4 models (#5350)
* main: Feat/add mermaid chart rendering (#5377) Set up Datadog metrics for prompt injection detection (#5385) fix: restore --resume functionality for most recent session (#5401) Gemini again (#5390) docs(prompt-library): add github-issue-labeler intermediate prompt (#5374) docs: add Linux and Windows paths to uninstall section (#5371) fix: --session-id shouldn't work without --resume, but --name should (#5360) Auto-compact Threshold UI improvements (#5354) Filter preserved user messages to be text only. (#5391) include sessionId in tool request (#5394) feat: add PR Impact Analyzer prompt (#5375) docs: add blog post on configuring goose for team environments (#5380) migrating back with new chatrecall non underscore name (#5223)
* 'main' of github.com:block/goose: (132 commits) Fix/icon ii (#5413) Enable runtime access to provider name (#5399) fix: ensure trailing newline in files created by `text_editor` tool (#5336) docs: September 2025 Community All-Stars (#5411) make supports_cache_control async to avoid block in place (#5362) Send all the logs we output (#5363) Recipe variables (#5365) Feat/add mermaid chart rendering (#5377) Set up Datadog metrics for prompt injection detection (#5385) fix: restore --resume functionality for most recent session (#5401) Gemini again (#5390) docs(prompt-library): add github-issue-labeler intermediate prompt (#5374) docs: add Linux and Windows paths to uninstall section (#5371) fix: --session-id shouldn't work without --resume, but --name should (#5360) Auto-compact Threshold UI improvements (#5354) Filter preserved user messages to be text only. (#5391) include sessionId in tool request (#5394) feat: add PR Impact Analyzer prompt (#5375) docs: add blog post on configuring goose for team environments (#5380) migrating back with new chatrecall non underscore name (#5223) ...
Signed-off-by: Blair Allan <Blairallan@icloud.com>
Avoids orphaning a tool call.
In the case of out of context error we keep the last user request.
Been testing with the proxy + this query
run ls, get the 3rd biggest file. Count the words. Mod the number of words with the number of files. Get another file and do the same. Run this loop 7 times.