-
Notifications
You must be signed in to change notification settings - Fork 37
Closed
Labels
Description
🚨 High Priority: Metrics Collector Workflow Failing
Summary
The Metrics Collector workflow has degraded to 50% success rate (5/10 runs) starting January 9, 2026. This is blocking all workflow health monitoring and metrics collection.
Error Details
- Error Type: MCP Gateway configuration validation error
- MCP Gateway Version: v0.0.47
- Success Rate: 50% (was 100% before 2026-01-09)
- Last Success: Run Add HTTP transport capability validation and time MCP server for agentic engines #19 (2026-01-08)
- Impact: No workflow metrics collected for 5+ days
Root Cause
MCP Gateway schema breaking change in v0.0.47 requires:
- New
containerproperty for stdio server config - Removal of
commandproperty (now invalid) - Updated configuration structure
Error Messages
Error: failed to load config: Configuration validation error (MCP Gateway version: v0.0.47):
Error: doesn't validate with mcp-gateway-config.schema.json#
Error: doesn't validate with '/definitions/mcpServerConfig'
Error: oneOf failed
Error: doesn't validate with '/definitions/stdioServerConfig'
Error: missing properties: 'container'
Error: additionalProperties 'command' not allowed
Sample Failed Run
- Run: https://github.com/githubnext/gh-aw/actions/runs/20960733354
- Date: 2026-01-13
- Run Number: #24
Required Fix
Update MCP Server Configuration
The workflow needs to migrate from old schema:
tools:
server_name:
command: "/path/to/server"
args: ["--arg1"]
env:
VAR: "value"To new schema:
tools:
server_name:
container:
image: "server-image:latest"
# or for stdio mode
# command: "/path/to/server"
# args: ["--arg1"]
env:
VAR: "value"Recommended Actions
Immediate (P1)
- Review Workflow File:
.github/workflows/metrics-collector.md - Update MCP Configuration: Migrate to new schema format
- Test with MCP Gateway v0.0.47: Validate configuration
- Recompile Workflow: Run
make recompileafter changes
Follow-up
- Audit all workflows for MCP Gateway usage
- Document schema migration guide
- Add schema validation to CI/CD
- Consider pinning MCP Gateway version
Impact Assessment
Direct Impact
- No workflow metrics since 2026-01-09
- Workflow Health Manager lacks execution data
- Agent Performance Analyzer missing metrics
- Unable to track success rates, failures, performance
Cascade Impact
- Meta-orchestrator monitoring degraded
- System health visibility lost
- Other orchestrators (Campaign Manager) affected
- Cannot detect new failures or trends
Related Issues
- Part of MCP Gateway breaking change affecting multiple workflows
- Meta-orchestrator cascade failure
- Highest priority for system visibility restoration
Detection
Identified by Workflow Health Manager on 2026-01-14
Health Score Impact: -10 points (75/100 overall)
Labels: workflow-health, priority-p1, type-failure, mcp-gateway
Related Workflows: Agent Performance Analyzer, Workflow Health Manager
AI generated by Workflow Health Manager - Meta-Orchestrator
Copilot