Conversation
Pull request overview
This PR implements audience-based content filtering for conversation compaction. When compacting conversations, content tagged for specific audiences (e.g., user-only) is now filtered out before sending to the LLM, reducing token usage by approximately 44% (from 229,887 to 128,516 tokens in the test case).
Changes:
- Implements content filtering by audience role for compaction
- Removes verbose LLM request debug logging
- Optimizes compaction formatting by using compact JSON and simplified thinking representation
- Increases session import body limit to 25MB
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| crates/goose/src/conversation/message.rs | Adds filter_for_audience() and agent_visible_content() methods to filter message content by audience role |
| crates/goose/src/context_mgmt/mod.rs | Updates compaction to use audience filtering, changes to compact JSON formatting, simplifies thinking representation |
| crates/goose/src/providers/api_client.rs | Removes debug logging of LLM request payloads |
| crates/goose-server/src/routes/session.rs | Increases body limit for session import endpoint to 25MB |
```rust
pub fn filter_for_audience(&self, audience: Role) -> Option<MessageContent> {
    match self {
        MessageContent::Text(text) => {
            if text.audience().map(|roles| roles.contains(&audience)).unwrap_or(true) {
                Some(self.clone())
            } else {
                None
            }
        }
        MessageContent::Image(img) => {
            if img.audience().map(|roles| roles.contains(&audience)).unwrap_or(true) {
                Some(self.clone())
            } else {
                None
            }
        }
        MessageContent::ToolResponse(res) => {
            let Ok(result) = &res.tool_result else {
                return Some(self.clone());
            };

            let filtered_content: Vec<Content> = result
                .content
                .iter()
                .filter(|c| {
                    c.audience()
                        .map(|roles| roles.contains(&audience))
                        .unwrap_or(true)
                })
                .cloned()
                .collect();

            if filtered_content.is_empty() {
                return None;
            }

            Some(MessageContent::ToolResponse(ToolResponse {
                id: res.id.clone(),
                tool_result: Ok(CallToolResult {
                    content: filtered_content,
                    ..result.clone()
                }),
                metadata: res.metadata.clone(),
            }))
        }
        MessageContent::Thinking(_) | MessageContent::RedactedThinking(_) => None,
        _ => Some(self.clone()),
    }
}
```
The new public methods filter_for_audience and agent_visible_content lack test coverage. Consider adding tests to verify the filtering logic, especially for edge cases like empty content, tool responses with mixed audience tags, and content with no audience specified.
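A test along these lines could be sketched with a simplified, self-contained model of the filtering rule (the `Role` and `Content` types below are stand-ins for the real goose/rmcp types, not the actual API):

```rust
// Hypothetical, simplified model of the audience-filtering rule:
// content tagged with an audience is kept only if the requested role
// is in that list; untagged content is kept unconditionally.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Role {
    User,
    Assistant,
}

#[derive(Clone, Debug, PartialEq)]
struct Content {
    text: String,
    audience: Option<Vec<Role>>, // None = visible to everyone
}

// Mirrors the `.map(...).unwrap_or(true)` logic from the PR.
fn filter_for_audience(items: &[Content], audience: Role) -> Vec<Content> {
    items
        .iter()
        .filter(|c| {
            c.audience
                .as_ref()
                .map(|roles| roles.contains(&audience))
                .unwrap_or(true)
        })
        .cloned()
        .collect()
}

fn main() {
    let items = vec![
        Content { text: "untagged".into(), audience: None },
        Content { text: "user-only".into(), audience: Some(vec![Role::User]) },
        Content { text: "assistant-only".into(), audience: Some(vec![Role::Assistant]) },
    ];
    let kept = filter_for_audience(&items, Role::Assistant);
    // Untagged and assistant-tagged content survive; user-only is dropped.
    assert_eq!(kept.len(), 2);
    assert_eq!(kept[0].text, "untagged");
    assert_eq!(kept[1].text, "assistant-only");
    println!("ok");
}
```

The edge cases Copilot lists (empty content, mixed audience tags, no audience specified) map directly onto assertions of this shape against the real `MessageContent` variants.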
Nice find. It seems like a lot of the provider format code is doing the audience stripping, which explains the difference between the compaction context and the LLM call context.
Maybe we should follow up, push filter_for_audience more generically into the flow of provider.complete, and get rid of the provider-format-specific filters?
I think the format code is actually wrong; the LLMs are supposed to ignore this already, so it should be fine to send it to them; we just don't want to compact it.
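The follow-up idea discussed above, doing the audience pass once in the provider entry point instead of per-format, could look roughly like this (everything here, including `agent_visible` and `complete`, is a hypothetical sketch, not the actual goose API):

```rust
// Hypothetical sketch: one shared audience filter applied before the
// provider request is built, so individual provider formats no longer
// need their own stripping logic.
#[derive(Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Clone)]
struct Message {
    audience: Option<Vec<Role>>, // None = visible to all roles
    text: String,
}

// Keep only content the agent (Assistant role) is allowed to see.
fn agent_visible(messages: &[Message]) -> Vec<Message> {
    messages
        .iter()
        .filter(|m| {
            m.audience
                .as_ref()
                .map(|roles| roles.contains(&Role::Assistant))
                .unwrap_or(true)
        })
        .cloned()
        .collect()
}

// A provider-agnostic complete() would run the filter once up front.
fn complete(messages: &[Message]) -> String {
    let visible = agent_visible(messages);
    visible.iter().map(|m| m.text.as_str()).collect::<Vec<_>>().join("\n")
}

fn main() {
    let msgs = vec![
        Message { audience: None, text: "shared".into() },
        Message { audience: Some(vec![Role::User]), text: "user-only".into() },
    ];
    assert_eq!(complete(&msgs), "shared");
    println!("ok");
}
```

Centralizing the filter this way would make compaction and the live LLM call see the same context by construction, which is the mismatch this PR works around.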
Summary
We weren't stripping `audience = user` tags during compaction.
My conversation to be compacted came in at 229,887 tokens before; now it is 128,516. To be honest, that seems like almost too steep a change.