Conversation
Pull request overview
This PR fixes MCP resource handling by decoding base64-encoded BlobResourceContents into usable text so goose can correctly surface downloaded resource content (e.g., from the GitHub MCP server).
Changes:
- Added `extract_text_from_resource` to decode base64 blobs and fall back to a binary placeholder when content isn't valid UTF-8.
- Updated OpenAI message formatting to use the new extraction helper for `RawContent::Resource`.
- Updated conversation message conversion to use the same helper and added focused unit tests.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| crates/goose/src/providers/utils.rs | Introduces base64-decoding helper for MCP resources and adds unit tests. |
| crates/goose/src/providers/formats/openai.rs | Uses the helper to convert MCP resource content into text for tool responses. |
| crates/goose/src/conversation/message.rs | Uses the helper when converting resource content into MessageContent. |
crates/goose/src/providers/utils.rs
Outdated
The byte count in the binary placeholder uses blob.len() (base64 string length) rather than the decoded byte length, so the reported size will be misleading; capture bytes.len() before String::from_utf8 and use that instead.
```diff
-            Ok(bytes) => match String::from_utf8(bytes) {
-                Ok(text) => text,
-                Err(_) => {
-                    let mime = mime_type
-                        .as_ref()
-                        .map(|m| m.as_str())
-                        .unwrap_or("application/octet-stream");
-                    format!("[Binary content ({}) - {} bytes]", mime, blob.len())
-                }
-            },
+            Ok(bytes) => {
+                let byte_len = bytes.len();
+                match String::from_utf8(bytes) {
+                    Ok(text) => text,
+                    Err(_) => {
+                        let mime = mime_type
+                            .as_ref()
+                            .map(|m| m.as_str())
+                            .unwrap_or("application/octet-stream");
+                        format!("[Binary content ({}) - {} bytes]", mime, byte_len)
+                    }
+                }
+            }
```
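For concreteness, base64 expands data by roughly 4/3, so the two lengths diverge even for tiny payloads. A self-contained illustration, assuming the same `base64` crate the helper already uses:

```rust
use base64::{engine::general_purpose::STANDARD, Engine as _};

#[test]
fn base64_length_differs_from_decoded_length() {
    let raw: &[u8] = &[0xFF, 0xFE, 0xFD, 0xFC, 0xFB]; // 5 bytes, not valid UTF-8
    let blob = STANDARD.encode(raw);
    assert_eq!(raw.len(), 5); // what the placeholder should report
    assert_eq!(blob.len(), 8); // what `blob.len()` reports instead
}
```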
```diff
 use crate::conversation::tool_result_serde;
 use crate::mcp_utils::ToolResult;
+use crate::providers::utils::extract_text_from_resource;
 use crate::utils::sanitize_unicode_tags;
```
conversation/message.rs now imports crate::providers::utils, creating a bidirectional coupling (providers already depend on conversation::message); consider moving extract_text_from_resource to a neutral module (e.g., crate::utils or crate::mcp_utils) to keep the core message model provider-agnostic.
This seems like a good idea to me
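For reference, a minimal sketch of what the relocated helper might look like in crates/goose/src/mcp_utils.rs, consolidating the logic already shown in this PR's diffs. The `rmcp::model` import path and the decode-failure branch are assumptions, not code taken from the PR:

```rust
use base64::{engine::general_purpose::STANDARD, Engine as _};
use rmcp::model::ResourceContents;

/// Convert MCP resource contents into display text: pass text through,
/// decode base64 blobs, and fall back to a placeholder for non-UTF-8 data.
pub fn extract_text_from_resource(resource: &ResourceContents) -> String {
    match resource {
        ResourceContents::TextResourceContents { text, .. } => text.clone(),
        ResourceContents::BlobResourceContents { blob, mime_type, .. } => {
            match STANDARD.decode(blob) {
                Ok(bytes) => {
                    let byte_len = bytes.len();
                    match String::from_utf8(bytes) {
                        Ok(text) => text,
                        Err(_) => {
                            let mime = mime_type.as_deref().unwrap_or("application/octet-stream");
                            format!("[Binary content ({}) - {} bytes]", mime, byte_len)
                        }
                    }
                }
                // Wording for a failed decode is an assumption; the PR's exact branch isn't shown here.
                Err(_) => "[Binary content (failed to decode base64)]".to_string(),
            }
        }
    }
}
```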
Force-pushed from 462c31e to 2ce9841
Force-pushed from 2ce9841 to 61e7976
```diff
         RawContent::Resource(resource) => {
-            let text = match &resource.resource {
-                ResourceContents::TextResourceContents {
-                    text, ..
-                } => text.clone(),
-                _ => String::new(),
-            };
+            let text = extract_text_from_resource(&resource.resource);
             tool_content.push(Content::text(text));
         }
```
This change alters OpenAI tool-response formatting for RawContent::Resource (now including decoded blob text), but there’s no test covering this path; add a unit test that builds a tool result containing a blob resource and asserts the formatted payload includes the decoded text.
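A rough sketch of such a test, under assumptions: the BlobResourceContents field names (uri, mime_type, blob) and the rmcp::model paths are inferred from the diffs above, the sample URI and payload are invented, and asserting through the module's full request-formatting entry point is left to the existing test setup in this file:

```rust
#[test]
fn blob_resource_in_tool_response_yields_decoded_text() {
    use base64::{engine::general_purpose::STANDARD, Engine as _};
    use rmcp::model::{Content, ResourceContents};

    // Simulate a downloaded file surfaced by an MCP server as a base64 blob.
    let contents = ResourceContents::BlobResourceContents {
        uri: "repo://octocat/hello-world/README.md".to_string(), // hypothetical URI
        mime_type: Some("text/markdown".to_string()),
        blob: STANDARD.encode("# Hello World\nDecoded readme body"),
    };

    // The new branch decodes the blob before pushing it into the tool content.
    let text = extract_text_from_resource(&contents);
    assert_eq!(text, "# Hello World\nDecoded readme body");

    let tool_content = vec![Content::text(text)];
    assert_eq!(tool_content.len(), 1);
}
```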
```diff
             PromptMessageContent::Resource { resource } => {
                 // For resources, convert to text content with the resource text
-                match &resource.resource {
-                    ResourceContents::TextResourceContents { text, .. } => {
-                        MessageContent::text(text.clone())
-                    }
-                    ResourceContents::BlobResourceContents { blob, .. } => {
-                        MessageContent::text(format!("[Binary content: {}]", blob.clone()))
-                    }
-                }
+                MessageContent::text(extract_text_from_resource(&resource.resource))
             }
```
The previous test coverage for blob resources in From<PromptMessage> was removed, and the new behavior (base64-decoding blob resources into text) isn’t asserted anywhere in this module; add/restore a test that constructs a BlobResourceContents with base64-encoded UTF-8 and verifies the resulting MessageContent::Text is the decoded text.
crates/goose/src/mcp_utils.rs
Outdated
The byte count in the binary placeholder is computed with blob.len(), which is the base64 string length rather than the decoded byte length; use the decoded bytes.len() (and ideally assert the exact byte count in the unit test) to avoid misleading output.
codefromthecrypt left a comment
cool I like centralized code
crates/goose/src/mcp_utils.rs
Outdated
since this is predictable and also doesn't include the actual raw bytes, just the type and length, I think we can assert the whole result, right? That would also let us catch any incorrect length
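A hedged example of asserting the whole placeholder string; the raw bytes and URI are arbitrary, and the BlobResourceContents field names are assumed from the diffs above:

```rust
use base64::{engine::general_purpose::STANDARD, Engine as _};
use rmcp::model::ResourceContents;

#[test]
fn binary_blob_placeholder_reports_decoded_byte_count() {
    let contents = ResourceContents::BlobResourceContents {
        uri: "file:///opaque.bin".to_string(),
        mime_type: None, // exercises the application/octet-stream fallback
        blob: STANDARD.encode([0xF0u8, 0x28, 0x8C]), // 3 bytes, not valid UTF-8
    };

    assert_eq!(
        extract_text_from_resource(&contents),
        "[Binary content (application/octet-stream) - 3 bytes]"
    );
}
```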
crates/goose/src/mcp_utils.rs
Outdated
can we do a #[test_case] for this? doesn't help much, but it's easier to read
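A hedged sketch of the parameterized form, assuming the test_case crate is (or can be made) available as a dev-dependency of this crate; the `blob` helper below is hypothetical glue for readability, not something from the PR:

```rust
use base64::{engine::general_purpose::STANDARD, Engine as _};
use rmcp::model::ResourceContents;
use test_case::test_case;

// Hypothetical helper just for building the cases in the table below.
fn blob(bytes: &[u8], mime: Option<&str>) -> ResourceContents {
    ResourceContents::BlobResourceContents {
        uri: "file:///example".to_string(),
        mime_type: mime.map(str::to_string),
        blob: STANDARD.encode(bytes),
    }
}

#[test_case(blob(b"plain utf-8", Some("text/plain")), "plain utf-8" ; "utf8 blob decodes to text")]
#[test_case(blob(&[0xF0, 0x28, 0x8C], Some("image/png")), "[Binary content (image/png) - 3 bytes]" ; "binary blob becomes placeholder")]
#[test_case(blob(&[0xFF], None), "[Binary content (application/octet-stream) - 1 bytes]" ; "missing mime falls back")]
fn extract_text_cases(input: ResourceContents, expected: &str) {
    assert_eq!(extract_text_from_resource(&input), expected);
}
```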
Force-pushed from 61e7976 to 5592eee
…provenance

* origin/main: (68 commits)
  Upgraded npm packages for latest security updates (#7183)
  docs: reasoning effort levels for Codex provider (#6798)
  Fix speech local (#7181)
  chore: add .gooseignore to .gitignore (#6826)
  Improve error message logging from electron (#7130)
  chore(deps): bump jsonwebtoken from 9.3.1 to 10.3.0 (#6924)
  docs: standalone mcp apps and apps extension (#6791)
  workflow: auto-update cli-commands on release (#6755)
  feat(apps): Integrate AppRenderer from @mcp-ui/client SDK (#7013)
  fix(MCP): decode resource content (#7155)
  feat: reasoning_content in API for reasoning models (#6322)
  Fix/configure add provider custom headers (#7157)
  fix: handle keyring fallback as success (#7177)
  Update process-wrap to 9.0.3 (9.0.2 is yanked) (#7176)
  feat: support extra field in chatcompletion tool_calls for gemini openai compat (#6184)
  fix: replace panic with proper error handling in get_tokenizer (#7175)
  Lifei/smoke test for developer (#7174)
  fix text editor view broken (#7167)
  docs: White label guide (#6857)
  Add PATH detection back to developer extension (#7161)
  ...

# Conflicts:
#   .github/workflows/nightly.yml
* origin/main: (21 commits)
  nit: show dir in title, and less... jank (#7138)
  feat(gemini-cli): use stream-json output and re-use session (#7118)
  chore(deps): bump qs from 6.14.1 to 6.14.2 in /documentation (#7191)
  Switch jsonwebtoken to use aws-lc-rs (already used by rustls) (#7189)
  chore(deps): bump qs from 6.14.1 to 6.14.2 in /evals/open-model-gym/mcp-harness (#7184)
  Add SLSA build provenance attestations to release workflows (#7097)
  fix save and run recipe not working (#7186)
  Upgraded npm packages for latest security updates (#7183)
  docs: reasoning effort levels for Codex provider (#6798)
  Fix speech local (#7181)
  chore: add .gooseignore to .gitignore (#6826)
  Improve error message logging from electron (#7130)
  chore(deps): bump jsonwebtoken from 9.3.1 to 10.3.0 (#6924)
  docs: standalone mcp apps and apps extension (#6791)
  workflow: auto-update cli-commands on release (#6755)
  feat(apps): Integrate AppRenderer from @mcp-ui/client SDK (#7013)
  fix(MCP): decode resource content (#7155)
  feat: reasoning_content in API for reasoning models (#6322)
  Fix/configure add provider custom headers (#7157)
  fix: handle keyring fallback as success (#7177)
  ...
Fixes #6298
Decodes resource content when base64-encoded blobs are included so goose can make sense of the content.