Description
Critical Blocker for Nightly Stress Test Workflow
The nightly MCP server stress test workflow cannot execute due to a fundamental environment constraint: Docker-in-Docker support is not available in the AWF firewall container.
Test Session Details
- Test Session: stress-test-20260204-033819
- Test Date: 2026-02-04T03:42:00Z
- Workflow: .github/workflows/nightly-mcp-stress-test.md
- Status: ❌ BLOCKED - Cannot Execute
Problem Summary
The stress test attempts to launch 20 MCP servers as Docker containers, but all 20 servers fail immediately because Docker commands are blocked by AWF.
Error Message from MCP Gateway:
ERROR: Docker-in-Docker support was removed in AWF v0.9.1
Docker commands are no longer available inside the firewall container.
If you need to:
- Use MCP servers: Migrate to stdio-based MCP servers (see docs)
- Run Docker: Execute Docker commands outside AWF wrapper
- Build images: Run Docker build before invoking AWF
See PR #205: https://github.com/github/gh-aw-firewall/pull/205
Root Cause
- AWF Security Policy: Docker-in-Docker explicitly disabled in AWF v0.9.1 (PR #205)
- Test Design: All 20 MCP servers configured as container: "mcp/*" or container: "ghcr.io/*"
- Gateway Behavior: Gateway uses docker run to launch container-based servers
- Environment: Workflow runs inside AWF firewall container with no Docker access
- Result: Zero servers can launch → zero servers can be tested
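The failure chain above could be detected before any server launch is attempted. The following is a minimal sketch in Python, not the gateway's actual code. It assumes each server entry is a dict in which a container key marks a Docker-launched server. The function names and the dict shape are hypothetical.

```python
import shutil
import subprocess


def docker_available(timeout_s: float = 5.0) -> bool:
    """Return True only if a docker CLI is on PATH and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def blocked_servers(servers: dict, has_docker: bool) -> list:
    """Names of servers that cannot launch in the current environment.

    Assumption: an entry with a "container" key needs Docker to start.
    """
    if has_docker:
        return []
    return sorted(name for name, cfg in servers.items() if "container" in cfg)
```

Calling blocked_servers(config, docker_available()) at the start of the run would let the workflow report one environment error instead of 20 identical per-server failures.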
Impact
Test Coverage: 0/20 servers tested (0%)
All 20 attempted servers failed with identical Docker availability errors:
- github (ghcr.io/github/github-mcp-server:v0.30.2)
- filesystem (mcp/filesystem)
- memory (mcp/memory)
- sqlite (mcp/sqlite)
- postgres (mcp/postgres)
- brave-search (mcp/brave-search)
- fetch (mcp/fetch)
- puppeteer (mcp/puppeteer)
- slack (mcp/slack)
- gdrive (mcp/gdrive)
- google-maps (mcp/google-maps)
- everart (mcp/everart)
- sequential-thinking (mcp/sequential-thinking)
- aws-kb-retrieval (mcp/aws-kb-retrieval)
- linear (mcp/linear)
- sentry (mcp/sentry)
- raygun (mcp/raygun)
- git (mcp/git)
- time (mcp/time)
- axiom (mcp/axiom)
What Actually Worked ✅
The MCP Gateway behaved correctly:
- Binary compiled successfully
- Configuration parsed correctly (20 servers loaded)
- Server started and bound to port 3000
- Detected AWF environment correctly
- Provided clear, actionable error messages
This is not a gateway bug - it's an environment incompatibility between the test design and AWF constraints.
Resolution Options
Option 1: Run Workflow Outside AWF (Recommended)
Pros:
- No code changes needed
- Tests gateway as designed (with container launching)
- Quick to implement
Cons:
- Less security isolation
- May require different workflow runner
Implementation:
- Modify workflow to run on standard GitHub runner (not AWF container)
- OR: Run workflow on self-hosted runner with Docker access
Option 2: Use HTTP-Based MCP Servers
Pros:
- Servers run outside workflow (no Docker needed)
- Tests gateway's HTTP proxy capabilities
- Maintains security boundary
Cons:
- Requires pre-deployed MCP servers
- Doesn't test gateway's container launching
- Complex infrastructure setup
Implementation:
- Deploy MCP servers externally (e.g., cloud instances)
- Configure stress test with type: "http" and url instead of container
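The configuration change for this option could be sketched as a small rewrite step. The type and url keys come from the description above. Everything else is an assumption for illustration: the one-URL-per-server path layout, the placeholder host, and the function names.

```python
def to_http_entry(name: str, base_url: str) -> dict:
    """Build an HTTP-type entry for a server deployed outside the workflow.

    Assumption: each server is reachable at <base_url>/<name>.
    """
    return {"type": "http", "url": f"{base_url.rstrip('/')}/{name}"}


def rewrite_config(servers: dict, base_url: str) -> dict:
    """Replace every container-based entry with an HTTP entry of the same name."""
    return {name: to_http_entry(name, base_url) for name in servers}


if __name__ == "__main__":
    # Placeholder host; the real deployment address is not known.
    original = {"fetch": {"container": "mcp/fetch"}, "time": {"container": "mcp/time"}}
    print(rewrite_config(original, "https://mcp.example.invalid"))
```

Keeping the server names unchanged means test reports from before and after the migration stay comparable.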
Option 3: Use Stdio-Based Non-Container Servers
Pros:
- Can run inside AWF
- Tests gateway stdio capabilities
- No Docker dependency
Cons:
- Requires rewriting/rebuilding MCP servers as binaries
- Most MCP servers distributed as containers only
- Significant development effort
Implementation:
- Build or find stdio-compatible MCP server binaries
- Deploy binaries into workflow environment
- Configure with command instead of container
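Because this option depends on binaries being deployed into the workflow environment, a quick check that every configured command resolves on PATH may save a failed run. This is a sketch under the assumption that a stdio entry is a dict with a command key, as the last step suggests. The function name is hypothetical.

```python
import shutil


def missing_binaries(servers: dict) -> list:
    """Names of command-based servers whose executable is not found on PATH.

    Entries without a "command" key (for example container-based ones) are ignored.
    """
    return sorted(
        name
        for name, cfg in servers.items()
        if "command" in cfg and shutil.which(cfg["command"]) is None
    )
```

An empty result means every command-based server has an executable available. A non-empty result lists the servers whose binaries still need to be built or deployed.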
Option 4: Hybrid Approach
Pros:
- Partial test coverage better than none
- Incremental improvement possible
- Flexible
Cons:
- Incomplete coverage
- Maintains complexity
Implementation:
- Identify which servers can run as stdio processes
- Test subset (e.g., 5-10 servers)
- Document remaining servers as "requires Docker"
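The split described in these steps can be expressed as a small planning function. This is an illustrative sketch only. Which servers can actually run as stdio processes has not been determined, so the stdio_capable set in the example is a placeholder, and the function name is hypothetical.

```python
def plan_hybrid(all_servers: list, stdio_capable: set) -> tuple:
    """Split servers into a testable subset and a 'requires Docker' remainder.

    Returns (testable, requires_docker, coverage), where coverage is the
    fraction of servers that can be tested without Docker.
    """
    testable = [s for s in all_servers if s in stdio_capable]
    requires_docker = [s for s in all_servers if s not in stdio_capable]
    coverage = len(testable) / len(all_servers) if all_servers else 0.0
    return testable, requires_docker, coverage


if __name__ == "__main__":
    servers = ["git", "time", "slack", "fetch"]
    # Placeholder set: not a claim about which servers support stdio.
    testable, needs_docker, coverage = plan_hybrid(servers, {"git", "time"})
    print(testable, needs_docker, f"{coverage:.0%}")
```

Reporting the coverage fraction alongside the skipped list keeps the partial result honest about what was not exercised.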
Option 5: Disable Stress Test
Pros:
- Acknowledges limitation clearly
- Frees up workflow resources
- Simple
Cons:
- Zero multi-server test coverage
- No regression detection for scaling issues
Implementation:
- Disable .github/workflows/nightly-mcp-stress-test.md workflow
- Document as known limitation in README
Recommendations
Immediate Actions
- ✅ Document blocker (this issue)
- 🔲 Disable workflow until resolved (prevents nightly failures)
- 🔲 Evaluate Option 1 (run outside AWF for nightly tests)
Short-Term (1-2 weeks)
- Investigate feasibility of running stress test on non-AWF runner
- If feasible: implement Option 1
- If not: implement Option 4 (hybrid with available stdio servers)
Long-Term (1-2 months)
- Consider Option 2 (pre-deployed HTTP servers) for comprehensive testing
- Evaluate if stress testing is valuable enough to warrant infrastructure
Next Steps
Decision Required: Which resolution option should we pursue?
Once decided, I can:
- Update workflow configuration
- Modify test design
- Create follow-up implementation tasks
Technical Context
AWF PR #205: github/gh-aw-firewall#205
MCP Gateway Config: .github/agentics/nightly-mcp-stress-test.md
Test Session Logs: Available in workflow run artifacts
Labels Suggested:
- bug (blocks intended functionality)
- infrastructure (environment/workflow issue)
- nightly-tests (affects nightly testing)
- decision-needed (requires team decision on approach)
AI generated by Nightly MCP Server Stress Test