
Fix Research workflow - Critical failure (90% failure rate) #11434

@github-actions

Problem

The Research workflow is failing critically: 9 of the 10 most recent runs failed (90% failure rate). It has been essentially non-operational since 2026-01-08 (15 days ago).
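For reference, the failure-rate figure can be reproduced from a list of run conclusions. This is a minimal sketch: the `failure_rate` helper is hypothetical, the sample data is invented to match the 9/10 figure, and treating every non-`success` conclusion (including cancelled runs) as a failure is an assumption.

```python
from typing import Iterable


def failure_rate(conclusions: Iterable[str]) -> float:
    """Return the fraction of runs whose conclusion is not 'success'.

    Assumption: anything other than 'success' counts as a failure.
    """
    runs = list(conclusions)
    if not runs:
        raise ValueError("no runs to evaluate")
    failed = sum(1 for c in runs if c != "success")
    return failed / len(runs)


# Hypothetical sample matching the figure reported in this issue.
recent = ["failure"] * 9 + ["success"]
print(f"{failure_rate(recent):.0%}")  # prints "90%"
```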

Error Details

Last successful run: 2026-01-08
Recent failures: Multiple failures throughout January 2026
Latest failed run: §21078189533

Impact

  • Research and knowledge work capabilities are severely limited
  • Cannot perform automated research tasks
  • Blocks workflows that depend on research functionality

Root Cause Analysis

Based on pattern analysis with similar failing workflows:

  1. Likely Tavily MCP server issue - Similar to MCP Inspector and the recovered Daily News workflow
  2. MCP Gateway startup failure - Expected to fail at "Start MCP gateway" step
  3. Secret configuration - May need TAVILY_API_KEY or related credentials
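If the secret hypothesis holds, a preflight check run before the gateway starts would surface it early. The sketch below is illustrative only: `missing_secrets` and `REQUIRED_SECRETS` are hypothetical names, and `TAVILY_API_KEY` is the only secret this issue names, so the real list may be longer. Empty values are treated as missing because an unset GitHub Actions secret is typically passed to a step as an empty string.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Only TAVILY_API_KEY is named in this issue; other entries would be guesses.
REQUIRED_SECRETS = ("TAVILY_API_KEY",)


def missing_secrets(
    required: Sequence[str] = REQUIRED_SECRETS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit("Missing secrets: " + ", ".join(missing))
    print("All required secrets are present")
```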

Related Context

Pattern detected across Tavily-dependent workflows:

  • ✅ Daily News: RECOVERED after TAVILY_API_KEY added (2026-01-22)
  • ❌ MCP Inspector: FAILING (80% rate) - same MCP Gateway issue
  • ❌ Research: FAILING (90% rate) - likely same root cause
  • ⚠️ Scout: Status unknown (also uses Tavily)

Recommended Investigation Steps

  1. Check workflow configuration:

    • Review .github/workflows/research.md frontmatter
    • Verify MCP server configuration (particularly Tavily)
    • Compare with Daily News (now working) and MCP Inspector (failing)
  2. Verify secrets and environment:

    • Confirm TAVILY_API_KEY is set and accessible
    • Check if any other secrets are required
    • Review repository secrets configuration
  3. Analyze recent failed runs:

    • Download artifacts from run 21078189533
    • Check MCP Gateway logs for specific error messages
    • Identify which step is failing (likely "Start MCP gateway")
  4. Test workflow manually:

    • Trigger workflow run manually to reproduce issue
    • Add debug logging if needed
    • Compare behavior with working workflows (Smoke tests)
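The steps above could be driven from the GitHub CLI. The sketch below only builds the command lines and does not execute them. The `investigation_commands` helper is hypothetical, and the workflow file name passed in the usage example is an assumption, since this issue names only the `research.md` source file.

```python
import shlex
from typing import List


def investigation_commands(
    run_id: str, workflow: str, out_dir: str = "artifacts"
) -> List[List[str]]:
    """Build gh CLI invocations for the investigation steps (not executed)."""
    return [
        # Step 2: list repository secrets (names only, values are never shown).
        ["gh", "secret", "list"],
        # Step 3: download artifacts from the failed run.
        ["gh", "run", "download", run_id, "--dir", out_dir],
        # Step 3: show logs for the failed steps only.
        ["gh", "run", "view", run_id, "--log-failed"],
        # Step 4: trigger the workflow manually.
        ["gh", "workflow", "run", workflow],
    ]


if __name__ == "__main__":
    # "research.lock.yml" is an assumed file name, not confirmed by this issue.
    for cmd in investigation_commands("21078189533", "research.lock.yml"):
        print(shlex.join(cmd))
```

Running these requires an authenticated `gh` session in the repository, and `gh workflow run` only works if the workflow declares a `workflow_dispatch` trigger.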

Success Criteria

  • Research workflow runs successfully
  • Success rate exceeds 80% over the next 5 runs (in practice, all 5 must pass)
  • Research functionality fully operational
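The success-rate criterion can be checked mechanically. This is a rough sketch, assuming run conclusions are ordered oldest first and that only `success` counts as a pass. `meets_success_criteria` is a hypothetical helper, not part of any existing tooling.

```python
from typing import Sequence


def meets_success_criteria(
    conclusions: Sequence[str], window: int = 5, threshold: float = 0.8
) -> bool:
    """True if the success rate over the last `window` runs exceeds `threshold`.

    Returns False when fewer than `window` runs are available.
    """
    recent = list(conclusions)[-window:]
    if len(recent) < window:
        return False
    successes = sum(1 for c in recent if c == "success")
    # Strict comparison: the criterion says ">80%", so 4 of 5 is not enough.
    return successes / window > threshold


print(meets_success_criteria(["success"] * 5))                # True
print(meets_success_criteria(["failure"] + ["success"] * 4))  # False
```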

Priority: P1 (High)

Research capabilities are critical for knowledge work and automated analysis. This workflow needs urgent attention to restore functionality.

AI generated by Workflow Health Manager - Meta-Orchestrator

  • expires on Jan 24, 2026, 2:59 AM UTC
