Description
🔍 Smoke Test Investigation - Run #18768028687
Summary
The Smoke Copilot workflow failed immediately when the Copilot CLI attempted to parse the --additional-mcp-config argument. The CLI received a base64-encoded string but expected raw JSON, causing it to exit with an error before any agent execution could begin.
Failure Details
- Run: #18768028687
- Commit: e2f31b3
- Branch: copilot/update-copilot-engine-cli-config
- Trigger: workflow_dispatch
- Duration: 1.1 minutes
- Failed Job: agent (25s duration)
Root Cause Analysis
Primary Error
Invalid JSON in --additional-mcp-config: Unexpected token 'e', "ewogICJtY3"... is not valid JSON
The error occurred at pkg/workflow/copilot_engine.go:238 where the Copilot CLI is executed with:
copilot --add-dir /tmp/ --add-dir /tmp/gh-aw/ --add-dir /tmp/gh-aw/agent/ \
--log-level all --log-dir /tmp/gh-aw/.copilot/logs/ \
--disable-builtin-mcps \
--additional-mcp-config ewogICJtY3BTZXJ2ZXJzIjogewogICAgImdpdGh1YiI6IHsK...
Investigation Findings
The Problem: Workflow-Code Mismatch
The commit e2f31b3 made these changes:
- Removed base64 encoding/decoding from copilot_engine.go
- Changed to pass JSON directly via shellEscapeArg for proper shell quoting
- Updated tests to expect JSON instead of base64
However, the compiled workflow YAML on the branch still contains the base64-encoded configuration from an earlier compilation, creating a mismatch:
| Component | Format | Status |
|---|---|---|
| Go Code (copilot_engine.go) | JSON string with shell escaping | ✅ Updated |
| Compiled Workflow (.github/workflows/*.yml) | Base64-encoded string | ❌ Outdated |
What the Base64 Decodes To
The base64 string ewogICJtY3BTZXJ2ZXJzIjogewog... decodes to valid JSON:
{
"mcpServers": {
"github": {
"type": "local",
"command": "docker",
"args": ["run", "-i", "--rm", "-e", "GITHUB_PERSONAL_ACCESS_TOKEN", ...],
"tools": ["*"],
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "\\${GITHUB_MCP_SERVER_TOKEN}"
}
},
"safe_outputs": {
"type": "local",
"command": "node",
"args": ["/tmp/gh-aw/safe-outputs/mcp-server.cjs"],
"tools": ["*"],
"env": { ... }
}
}
}
So the content is correct, but the encoding is wrong for the updated code.
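To inspect what a stale workflow actually carries, the base64 argument can be decoded and unmarshalled into Go types. The following is a minimal sketch that assumes only the field names visible in the decoded JSON above. The names mcpServer, mcpConfig, decodeConfig, and encodeForDemo are hypothetical and are not taken from gh-aw.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// mcpServer mirrors the per-server fields visible in the decoded JSON.
// Hypothetical type for illustration; not the gh-aw definition.
type mcpServer struct {
	Type    string            `json:"type"`
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	Tools   []string          `json:"tools"`
	Env     map[string]string `json:"env"`
}

// mcpConfig is the top-level object with the "mcpServers" key.
type mcpConfig struct {
	MCPServers map[string]mcpServer `json:"mcpServers"`
}

// decodeConfig base64-decodes the argument and parses the result as JSON.
func decodeConfig(encoded string) (mcpConfig, error) {
	var cfg mcpConfig
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return cfg, err
	}
	err = json.Unmarshal(raw, &cfg)
	return cfg, err
}

// encodeForDemo produces a base64 argument from JSON text, standing in for
// the output of the older compilation step.
func encodeForDemo(jsonText string) string {
	return base64.StdEncoding.EncodeToString([]byte(jsonText))
}

func main() {
	// Made-up sample config, much smaller than the one in the failing run.
	sample := `{"mcpServers":{"github":{"type":"local","command":"docker","tools":["*"]}}}`
	cfg, err := decodeConfig(encodeForDemo(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for name, server := range cfg.MCPServers {
		fmt.Println(name, server.Type, server.Command)
	}
}
```

The sample config in main is invented for the demonstration; the truncated string from the run log is not a complete document and would fail at the JSON parsing step.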
Failed Jobs and Errors
Job Sequence
- ✅ activation - succeeded (4s)
- ❌ agent - failed (25s) - Copilot CLI couldn't parse base64 as JSON
- ❌ create_issue - failed (7s) - Dependent on agent job
- ⏭️ missing_tool - skipped
- ⏭️ detection - skipped
Error Details
Location: /tmp/gh-aw/agent-stdio.log:1
Invalid JSON in --additional-mcp-config: Unexpected token 'e', "ewogICJtY3"... is not valid JSON
Context: The Copilot CLI validates the --additional-mcp-config argument immediately upon startup. When it receives ewogICJtY3..., it attempts to parse it as JSON (not as base64), fails, and exits with code 1.
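The failure mode can be reproduced outside the CLI. The sketch below is a hypothetical helper, not part of the Copilot CLI or gh-aw, that reports whether an argument is valid JSON, base64 that decodes to JSON-looking text, or neither. It uses only the Go standard library; the function name and the three labels are assumptions made for illustration.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// classifyConfigArg reports how a value passed to --additional-mcp-config
// would look to a JSON parser. Hypothetical helper for illustration only.
func classifyConfigArg(arg string) string {
	if json.Valid([]byte(arg)) {
		return "json"
	}
	// Not JSON. Check whether it is base64 whose decoded form starts like
	// a JSON object.
	decoded, err := base64.StdEncoding.DecodeString(arg)
	if err == nil && len(decoded) > 0 && decoded[0] == '{' {
		return "base64-encoded JSON"
	}
	return "unknown"
}

func main() {
	// Prefix of the argument from the failing run (truncated in the report).
	fmt.Println(classifyConfigArg("ewogICJtY3BTZXJ2ZXJzIjogewog"))
	// The format the updated Go code emits.
	fmt.Println(classifyConfigArg(`{"mcpServers": {}}`))
}
```

Because the base64 alphabet starts the encoded form of "{" with the letter e, a JSON parser rejects the string at its first character, which matches the "Unexpected token 'e'" message in the log.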
Commit Analysis
Commit: e2f31b3
Author: copilot-swe-agent[bot]
Message: "Replace base64 encoding with quoted JSON for --additional-mcp-config"
Files Changed (245 lines total):
- pkg/workflow/copilot_engine.go (+4/-8)
- pkg/workflow/copilot_engine_test.go (+15/-34)
- pkg/workflow/copilot_mcp_http_integration_test.go (+24/-48)
- pkg/workflow/imports_test.go (+19/-93)
Intent: Remove base64 complexity and pass JSON directly, relying on shellEscapeArg() function to properly quote the JSON string for shell safety.
The Code Changes Are Correct: The Go code correctly builds JSON and passes it through shellEscapeArg for proper quoting. The issue is that the workflow YAML wasn't recompiled after these changes.
Recommended Actions
Critical Priority
- Recompile the workflow on the branch
  # On branch copilot/update-copilot-engine-cli-config
  gh aw compile .github/workflows/smoke-copilot.md
  git add .github/workflows/smoke-copilot.yml
  git commit -m "Recompile workflow after MCP config format change"
  Rationale: This will regenerate the workflow YAML with JSON instead of base64
- Verify the compiled workflow contains JSON, not base64
  # Check that --additional-mcp-config is followed by JSON, not base64
  grep "additional-mcp-config" .github/workflows/smoke-copilot.yml
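The grep in the verification step still requires a person to read the output. A rough automated version could scan the compiled workflow text and flag any line where the flag is followed by something other than a JSON object. This is a sketch under stated assumptions: the function name suspiciousConfigLines is hypothetical, and the check assumes the value sits on the same line as the flag, optionally wrapped in quotes.

```go
package main

import (
	"fmt"
	"strings"
)

const configFlag = "--additional-mcp-config"

// suspiciousConfigLines returns the 1-based numbers of lines where the flag
// is followed by a value that does not start with "{" once a leading quote
// is stripped. Hypothetical helper; it does not follow shell line
// continuations, so a value on the next line is reported as suspicious.
func suspiciousConfigLines(workflow string) []int {
	var hits []int
	for i, line := range strings.Split(workflow, "\n") {
		idx := strings.Index(line, configFlag)
		if idx < 0 {
			continue
		}
		value := strings.TrimSpace(line[idx+len(configFlag):])
		value = strings.TrimLeft(value, `'"`)
		if !strings.HasPrefix(value, "{") {
			hits = append(hits, i+1)
		}
	}
	return hits
}

func main() {
	// Made-up two-line workflow fragment with one stale argument.
	workflow := "copilot --additional-mcp-config '{\"mcpServers\": {}}'\n" +
		"copilot --additional-mcp-config ewogICJtY3BTZXJ2ZXJzIjogewog"
	fmt.Println(suspiciousConfigLines(workflow))
}
```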
High Priority
- Add CI check to detect workflow compilation drift
  - Create a CI job that runs gh aw compile --check to verify workflows are up-to-date
  - Fail the build if compiled workflows differ from their source .md files
  - Rationale: Prevent code/workflow mismatches in the future
- Document workflow recompilation requirements
  - Add section to CONTRIBUTING.md explaining when to recompile workflows
  - Specifically: "When modifying engine code (copilot_engine.go, claude_engine.go, etc.), recompile all workflows"
  - Rationale: Make the requirement visible to contributors
Medium Priority
- Consider pre-commit hook for workflow compilation
  # Hook that detects engine code changes and prompts for recompilation
  if git diff --cached --name-only | grep -q "pkg/workflow/.*_engine\.go$"; then
    echo "⚠️ Engine code changed - consider running: gh aw compile"
  fi
- Automate workflow compilation in PR process
  - Add GitHub Action that runs gh aw compile and commits changes
  - Trigger on changes to pkg/workflow/*_engine.go files
  - Rationale: Remove manual step from workflow
Prevention Strategies
- Compilation Check in CI: Add a job that verifies compiled workflows match their sources
    - name: Check workflow compilation
      run: |
        gh aw compile
        if ! git diff --quiet .github/workflows/*.yml; then
          echo "❌ Compiled workflows are out of date"
          git diff .github/workflows/
          exit 1
        fi
- Pre-commit Hook: Warn developers when engine code changes without workflow recompilation
- Documentation: Make workflow compilation requirements explicit in contributor docs
- Automated PR Updates: Bot automatically recompiles workflows when engine code changes
Historical Context
This is a new failure pattern - first occurrence. The pattern database shows no similar failures with MCP config format mismatches.
Similar Historical Issues:
- Various smoke test failures (permission issues, API errors, detection failures)
- But none related to workflow compilation drift
Pattern Classification:
- Category: Configuration Error
- Severity: High
- Is Flaky: No - deterministic failure
- Is Recurring: No - first occurrence
- Pattern ID: COPILOT_BASE64_MCP_CONFIG
Technical Details
Expected Behavior (After Fix)
copilot --additional-mcp-config '{
"mcpServers": {
"github": { ... },
"safe_outputs": { ... }
}
}'
With proper shell quoting applied by shellEscapeArg()
Actual Behavior (Current)
copilot --additional-mcp-config ewogICJtY3BTZXJ2ZXJzIjogewog...
Base64 string passed directly, which Copilot CLI cannot parse
shellEscapeArg Function
From pkg/workflow/shell.go:17-35, this function:
- Wraps arguments containing special characters in single quotes
- Escapes single quotes within the argument
- Leaves already-quoted strings unchanged
This is the correct approach for passing JSON safely through shell.
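The core of single-quote escaping can be sketched in a few lines. The code below is not the implementation from pkg/workflow/shell.go; quoteForShell is a hypothetical stand-in that always wraps its argument, whereas the description above says shellEscapeArg quotes only when special characters are present and leaves already-quoted strings unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteForShell wraps arg in single quotes so a POSIX shell passes it
// through literally. An embedded single quote is written as '\'' : close
// the quoted string, emit an escaped quote, then reopen the quoted string.
// Hypothetical helper for illustration; not the gh-aw shellEscapeArg.
func quoteForShell(arg string) string {
	return "'" + strings.ReplaceAll(arg, "'", `'\''`) + "'"
}

func main() {
	// JSON contains double quotes, braces, and spaces, none of which are
	// special inside single quotes.
	fmt.Println("copilot --additional-mcp-config " + quoteForShell(`{"mcpServers": {}}`))
	// A value with an embedded single quote.
	fmt.Println(quoteForShell("it's"))
}
```

Single quotes suit JSON because the only character that needs escaping inside them is the single quote itself, and JSON string syntax relies on double quotes.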
Related Information
- Workflow Source: .github/workflows/smoke-copilot.md
- Engine Code: pkg/workflow/copilot_engine.go
- Compilation Tool: gh aw compile
- Tests: All unit tests pass (changes were properly tested)
Pattern Storage
Investigation saved to: /tmp/gh-aw/cache-memory/investigations/2025-10-24-18768028687.json
Pattern saved to: /tmp/gh-aw/cache-memory/patterns/copilot_base64_mcp_config.json
Investigation Metadata:
- Investigator: Smoke Detector
- Investigation Run: #18768045624
- Pattern ID: COPILOT_BASE64_MCP_CONFIG
- Severity: High
- Is Flaky: No
Labels: smoke-test, investigation, copilot, configuration, workflow-compilation
AI generated by Smoke Detector - Smoke Test Failure Investigator