Skip to content

[smoke-detector] Comment on #2323 #2341

@github-actions

Description

@github-actions

🔄 REGRESSION ALERT - Run #38: Fix Did Not Work

Summary

The fix in commit f9dece6 did NOT resolve the issue. Run #38 failed with the exact same error as run #36, despite the commit claiming to "Fix JSON quoting in job-level env for MCP config" and referencing "Fixes #2323".

Failure Details

The Problem

Expected Result After Fix

✅ Valid JSON: "GH_AW_SAFE_OUTPUTS_CONFIG":"{\"create_issue\":{...}}"

Actual Result After Fix

❌ Still Invalid: "GH_AW_SAFE_OUTPUTS_CONFIG":""{\"create_issue\":{...}}""

The doubled quotes ("") are STILL present!

Error Details

Invalid JSON in --additional-mcp-config: Expected ',' or '}' after property value in JSON at position 557 (line 1 column 558)

Agent stdio.log shows:

Invalid JSON in --additional-mcp-config: Expected ',' or '}' after property value in JSON at position 557 (line 1 column 558)

Same position (557), same error, same root cause.

What the Fix Attempted

Commit f9dece6 claimed to:

  1. ✅ Modified jobs.go to quote env values that start with { or [ using single quotes
  2. ✅ Modified compiler.go to store JSON config without Go's %q formatting
  3. ✅ Single quotes in JSON are escaped by doubling them (YAML standard)
  4. ✅ Step-level env continues to use toJSON() for proper JSON encoding

BUT: These changes appear to have addressed YAML-level quoting only, not the actual JSON string construction for the --additional-mcp-config CLI argument.

Why The Fix Failed

Root Cause Analysis

The fix addressed the wrong layer of the problem:

  1. What was fixed: YAML compilation (jobs.go, compiler.go)

    • This ensures env vars are properly quoted in the .github/workflows/*.lock.yml files
    • ✅ This likely works correctly now
  2. What was NOT fixed: Runtime JSON construction

    • The --additional-mcp-config argument is built at runtime when executing the workflow
    • The code that constructs this JSON string (likely in copilot_engine.go or MCP-related files) was NOT modified
    • ❌ This still has the double-quoting bug

The Two Levels of JSON Generation

┌─────────────────────────────────────────────────────────┐
│ Level 1: Workflow YAML Generation (FIXED)              │
│ ─────────────────────────────────────────────────────── │
│ Go code: jobs.go, compiler.go                           │
│ Output: .github/workflows/*.lock.yml                    │
│ Purpose: Generate workflow file with env vars           │
│ Status: ✅ Likely fixed by f9dece69                     │
└─────────────────────────────────────────────────────────┘
                          ↓
                    Workflow runs
                          ↓
┌─────────────────────────────────────────────────────────┐
│ Level 2: CLI Argument Construction (NOT FIXED)         │
│ ─────────────────────────────────────────────────────── │
│ Go code: copilot_engine.go, mcp.go (?)                  │
│ Output: --additional-mcp-config '{"mcpServers":{...}}'  │
│ Purpose: Build JSON string passed to Copilot CLI        │
│ Status: ❌ STILL BROKEN - double-quotes remain          │
└─────────────────────────────────────────────────────────┘

Evidence

Looking at the actual command executed in run #38:

copilot --additional-mcp-config '{
  "mcpServers": {
    "safe_outputs": {
      "env": {
        "GH_AW_SAFE_OUTPUTS_CONFIG": ""{\"create_issue\":{...}}"" 
        #                             ↑↑                      ↑↑
        #                             Double quotes still present!
      }
    }
  }
}'

The env var at the YAML level might be correct now, but when it's read and used to build the --additional-mcp-config JSON at runtime, something is adding the extra quotes again.

Where to Look Next

Priority 1: Find the CLI Command Construction Code

The code that actually builds the copilot --additional-mcp-config '...' command line was not modified by f9dece6. We need to find and fix:

# Likely locations:
grep -rn "additional-mcp-config" pkg/
grep -rn "mcpServers" pkg/
grep -rn "buildCopilotCommand" pkg/
grep -rn "buildMCPConfig" pkg/

Most likely files:

  • pkg/workflow/copilot_engine.go - Copilot execution logic
  • pkg/workflow/mcp.go or pkg/workflow/mcp_*.go - MCP configuration
  • pkg/workflow/run_agent.go - Agent execution

Priority 2: Check How Env Vars Are Consumed

Even if the YAML is correct, we need to trace:

  1. How is GH_AW_SAFE_OUTPUTS_CONFIG read from the environment at runtime?
  2. How is that value embedded into the MCP config JSON?
  3. Is it being re-escaped or re-quoted somewhere?
// Hypothetical bug pattern:
envValue := os.Getenv("GH_AW_SAFE_OUTPUTS_CONFIG")  
// envValue = `{"create_issue":{...}}`  - correct

// WRONG: This adds extra quotes:
mcpConfigJSON := fmt.Sprintf(`"GH_AW_SAFE_OUTPUTS_CONFIG":"%s"`, envValue)
// Result: "GH_AW_SAFE_OUTPUTS_CONFIG":"{"create_issue":{...}}"  
//         Missing escaping!

// ALSO WRONG: Over-escaping:
escapedValue := fmt.Sprintf(`"%s"`, envValue)  // Now: ""{"create_issue":{...}}""
mcpConfigJSON := fmt.Sprintf(`"GH_AW_SAFE_OUTPUTS_CONFIG":%s`, escapedValue)
// Result: "GH_AW_SAFE_OUTPUTS_CONFIG":""{"create_issue":{...}}""
//         This is what we're seeing!

Priority 3: Add Debug Logging

Before the copilot command is executed, log:

log.Printf("DEBUG: Raw env var GH_AW_SAFE_OUTPUTS_CONFIG = %q", os.Getenv("GH_AW_SAFE_OUTPUTS_CONFIG"))
log.Printf("DEBUG: Generated MCP config JSON = %s", mcpConfigJSON)

// Validate before passing to CLI:
var test map[string]interface{}
if err := json.Unmarshal([]byte(mcpConfigJSON), &test); err != nil {
    log.Printf("ERROR: Generated invalid MCP config JSON: %v", err)
    log.Printf("ERROR: Problematic JSON: %s", mcpConfigJSON)
    return fmt.Errorf("generated MCP config is not valid JSON: %w", err)
}

Recommended Fix Strategy

Step 1: Locate the Bug

cd pkg/workflow
rg "additional-mcp-config" -A 10 -B 5
rg "buildMCPConfig" -A 20
rg "mcpServers" -A 10

Step 2: Understand the Code Path

  1. Find where the copilot CLI command is constructed
  2. Trace back to where GH_AW_SAFE_OUTPUTS_CONFIG is read
  3. Identify where it's embedded into the JSON
  4. Find the line that adds the extra quotes

Step 3: Apply the Correct Fix

The fix needs to handle JSON-valued env vars correctly:

// Detect if value is JSON:
func isJSONValue(val string) bool {
    trimmed := strings.TrimSpace(val)
    return (strings.HasPrefix(trimmed, "{") && strings.HasSuffix(trimmed, "}")) ||
           (strings.HasPrefix(trimmed, "[") && strings.HasSuffix(trimmed, "]"))
}

// When building env section of MCP config:
envJSON := make(map[string]interface{})
for key, value := range envVars {
    if isJSONValue(value) {
        // Parse as JSON to preserve structure:
        var parsed interface{}
        if err := json.Unmarshal([]byte(value), &parsed); err == nil {
            envJSON[key] = parsed  // Store as object/array, not string
        } else {
            envJSON[key] = value  // Fall back to string if not valid JSON
        }
    } else {
        envJSON[key] = value  // Store as string
    }
}

// Then use json.Marshal to build the final JSON:
mcpConfig := map[string]interface{}{
    "mcpServers": map[string]interface{}{
        "safe_outputs": map[string]interface{}{
            "env": envJSON,
            // ...
        },
    },
}

mcpConfigBytes, _ := json.Marshal(mcpConfig)
mcpConfigJSON := string(mcpConfigBytes)

Step 4: Add Validation

// Before executing the command:
if err := validateMCPConfigJSON(mcpConfigJSON); err != nil {
    return fmt.Errorf("generated invalid MCP config: %w", err)
}

func validateMCPConfigJSON(jsonStr string) error {
    var config map[string]interface{}
    if err := json.Unmarshal([]byte(jsonStr), &config); err != nil {
        // Log the problematic JSON for debugging:
        lines := strings.Split(jsonStr, "\n")
        for i, line := range lines {
            log.Printf("Line %d: %s", i+1, line)
        }
        return fmt.Errorf("invalid JSON: %w", err)
    }
    return nil
}

Step 5: Add Test

func TestMCPConfigWithJSONEnvVar(t *testing.T) {
    envVars := map[string]string{
        "PLAIN_VAR": "simple value",
        "JSON_VAR": `{"key":"value"}`,
    }
    
    configJSON := buildMCPConfigJSON(envVars)
    
    // Must be valid JSON:
    var parsed map[string]interface{}
    err := json.Unmarshal([]byte(configJSON), &parsed)
    require.NoError(t, err, "Generated MCP config must be valid JSON")
    
    // Check that JSON env var is parsed as object, not string:
    servers := parsed["mcpServers"].(map[string]interface{})
    safeOutputs := servers["safe_outputs"].(map[string]interface{})
    env := safeOutputs["env"].(map[string]interface{})
    
    // JSON_VAR should be an object, not a string:
    jsonVar := env["JSON_VAR"]
    assert.IsType(t, map[string]interface{}{}, jsonVar)
}

Timeline of Failure

Run # Date/Time Status Notes
36 Oct 24 21:10 ❌ Failed Original report (this issue)
37 Oct 24 22:43 ❌ Failed Marked as duplicate
- Oct 24 23:13 - Commit f9dece6 with attempted fix
38 Oct 24 23:22 ❌ Failed REGRESSION - Fix didn't work

Time between fix commit and regression: 9 minutes

Conclusion

The fix attempted in f9dece6 was incomplete. It addressed YAML generation but not runtime JSON construction. The bug lives in the code that builds the --additional-mcp-config argument, which was not modified by the fix.

Next Steps:

  1. ❌ Close this issue? NO - it's still broken
  2. ✅ Reopen investigation into runtime JSON construction
  3. ✅ Find and fix the actual code that builds the CLI argument
  4. ✅ Add validation and tests at the right level

Investigation stored: /tmp/gh-aw/cache-memory/investigations/2025-10-24-18794534174.json
Pattern ID: COPILOT_DOUBLE_QUOTED_ENV_VAR_IN_MCP_CONFIG
Status: REGRESSION AFTER ATTEMPTED FIX

AI generated by Smoke Detector - Smoke Test Failure Investigator

AI generated by Smoke Detector - Smoke Test Failure Investigator

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions