Add automatic file writing for large MCP tool outputs exceeding 16,000 tokens with compact schema descriptions #2060
Conversation
…ture Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Implement token counting using a 4 chars = 1 token estimate (see the sketch after this list)
- Detect content exceeding 16000 tokens in tool outputs
- Write large content to files in /tmp/gh-aw/safe-outputs/ with SHA256 hash filenames
- Automatically detect file extension (.json, .md, .txt) based on content
- Return JSON format: { filename: "hash.ext", description: "generated content large!" }
- Replace large field in safe output with file reference
- Add comprehensive test coverage for all scenarios
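A minimal sketch of the counting, naming, and extension logic from this commit; the function names and the markdown heuristic are assumptions, not the actual implementation:

```typescript
import { createHash } from "node:crypto";

// 16,000-token threshold, using the 4 chars = 1 token estimate
const MAX_TOKENS = 16000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// SHA256 hash of the content becomes the filename
function hashFilename(text: string, ext: string): string {
  return createHash("sha256").update(text).digest("hex") + ext;
}

// Pick .json, .md, or .txt from the content; the markdown check is a guess
function detectExtension(text: string): string {
  try {
    JSON.parse(text);
    return ".json";
  } catch {
    return /^#{1,6}\s/m.test(text) ? ".md" : ".txt";
  }
}

// Only content over the threshold is redirected to /tmp/gh-aw/safe-outputs/
function needsFileWrite(text: string): boolean {
  return estimateTokens(text) > MAX_TOKENS;
}
```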
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
All tests passing and manual verification complete Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@copilot the MCP tools return JSON, so the generated file extension should be .json. The description field should be replaced by a compact schema of the data that "jq" / the agent can understand.
…h compact schema
- Always use .json extension for all large content (MCP tools return JSON)
- Replace static "generated content large!" with compact schema description
- Schema describes structure: arrays show keys and item count, objects show keys
- For non-JSON content, description is "text content"
- Updated all tests to verify new behavior
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Updated in commit b63c8f8:
This makes it easier for jq and agents to understand the data structure.
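A minimal sketch of a schema generator matching the behavior described above: arrays show the first item's keys plus a count, objects list up to 10 keys, and non-JSON content falls back to "text content". The name `describeContent` is hypothetical:

```typescript
function describeContent(text: string): string {
  let value: unknown;
  try {
    value = JSON.parse(text);
  } catch {
    return "text content"; // non-JSON content
  }
  if (Array.isArray(value)) {
    const first = value[0];
    const keys =
      first !== null && typeof first === "object"
        ? Object.keys(first).join(", ")
        : "";
    return `[{${keys}}] (${value.length} items)`;
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value);
    const shown = keys.slice(0, 10).join(", "); // truncated at 10 keys
    return keys.length > 10
      ? `{${shown}, ...} (${keys.length} keys)`
      : `{${shown}}`;
  }
  return "text content"; // scalar JSON; behavior not spelled out in the PR
}
```

For example, `describeContent('[{"id":1},{"id":2}]')` yields `[{id}] (2 items)` under this sketch.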
Agentic Changeset Generator triggered by this pull request.
Update CLI and MCP server documentation to reflect recent feature additions:
- Add --timeout option documentation for logs command with caching details
- Add --parse option documentation for audit command
- Add URL support documentation for audit command (cross-repo, GitHub Enterprise)
- Document continuation field in MCP server logs tool for pagination
- Document large output automatic file handling in MCP server (16K token threshold)

These changes document features from PRs #2066, #2064, #2060, #2058, #2052, and #2051.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Problem
When MCP tools return very large outputs (e.g., generated documentation, data exports, or analysis reports), the content could overflow token limits and break response handling. Previously, all tool outputs were returned as-is, regardless of size.
Solution
Implemented automatic detection and file writing for large tool outputs in the safe_outputs_mcp_server. When any string field in a tool response exceeds 16,000 tokens (~64,000 characters), the system now:
- Writes the content to /tmp/gh-aw/safe-outputs/ using a SHA256 hash + .json extension as the filename
- Always uses the .json extension, since MCP tools return JSON data
- Replaces the field with a reference of the form { "filename": "hash.json", "description": "[{keys}] (N items)" }
- Describes arrays as [{id, name, data}] (2000 items); objects as {key1, key2, ...} or {key1, ..., keyN, ...} (N keys) (truncated at 10 keys); non-JSON content as text content

Example
Before (large content would be returned in full):
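An illustrative tool response with an oversized field (the field name and values are made up; the real payload would run to thousands of items):

```json
{
  "content": [
    { "id": 1, "name": "item-1", "data": { "payload": "..." } },
    { "id": 2, "name": "item-2", "data": { "payload": "..." } }
  ]
}
```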
After (large content automatically handled):
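The same field once processed; the hash shown is a placeholder:

```json
{
  "content": {
    "filename": "3f9a...c21e.json",
    "description": "[{id, name, data}] (2000 items)"
  }
}
```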
Additional examples:
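Following the rules listed under Solution (inputs here are illustrative):
- A large object such as `{"user": {...}, "count": 3}` gets a description like `{user, count}`
- An object with more than 10 keys is truncated: `{key1, ..., key10, ...} (14 keys)`
- Non-JSON output is described simply as `text content`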
Implementation Details
- Always uses .json since MCP tools return JSON
- Writes files to the same /tmp/gh-aw/safe-outputs/ directory used for other artifacts

Testing
Added comprehensive test coverage with 4 new test scenarios:
- Writes a .json file with a schema description
- Writes a .json file with an array schema showing keys and count
- Writes a .json file with an object schema showing keys

All 568 JavaScript tests and all Go unit tests passing.
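A sketch of what one of these scenarios might look like, assuming a vitest-style harness and a hypothetical `processToolOutput` helper (the actual suite and helper names live in the repository):

```typescript
import { describe, expect, it } from "vitest";

// Hypothetical helper under test: replaces oversized string fields
// with { filename, description } references.
declare function processToolOutput(output: Record<string, string>): {
  content: { filename: string; description: string };
};

describe("large tool output handling", () => {
  it("replaces a large array field with a .json file reference", () => {
    // ~170k chars, well past the 16,000-token (~64k char) threshold
    const big = JSON.stringify(
      Array.from({ length: 2000 }, (_, i) => ({
        id: i,
        name: `n${i}`,
        data: { payload: "x".repeat(40) },
      }))
    );
    const result = processToolOutput({ content: big });
    expect(result.content.filename).toMatch(/^[0-9a-f]{64}\.json$/);
    expect(result.content.description).toBe("[{id, name, data}] (2000 items)");
  });
});
```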
Fixes the issue where large tool outputs needed automatic file handling and meaningful schema descriptions that agents can understand.