Add automated Claude workflow for test failure analysis (#200)

Copilot · strawgate · web-flow · commit cacb180ae8a6 · 2025-11-01T21:19:21.000-05:00
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>
Co-authored-by: William Easton <williamseaston@gmail.com>
diff --git a/.github/workflows/claude-on-test-failure.yml b/.github/workflows/claude-on-test-failure.yml
@@ -0,0 +1,163 @@
+name: Claude Test Failure Analysis
+
+on:
+  workflow_run:
+    workflows: ["Run Tests"]
+    types:
+      - completed
+
+concurrency:
+  group: claude-test-failure-${{ github.event.workflow_run.head_branch }}
+  cancel-in-progress: true
+
+jobs:
+  claude-analyze-failure:
+    # Only run if the test workflow failed
+    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      pull-requests: write
+      issues: write
+      id-token: write
+      actions: read # Required for Claude to read CI results
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Set up Python 3.10
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.10'
+
+      # Install UV package manager
+      - name: Install UV
+        uses: astral-sh/setup-uv@v7
+
+      - name: Set analysis prompt
+        id: analysis-prompt
+        run: |
+          cat >> "$GITHUB_OUTPUT" << 'EOF'
+          PROMPT<<PROMPT_END
+          You're a test failure analysis assistant for Py-Key-Value, a Python framework for interacting with Key-Value stores.
+
+          # Your Task
+          A GitHub Actions workflow has failed. Your job is to:
+          1. Analyze the test failure(s) to understand what went wrong
+          2. Identify the root cause of the failure(s)
+          3. Suggest a clear, actionable solution to fix the failure(s)
+
+          # Getting Started
+          1. Call the generate_agents_md tool to get a high-level summary of the project
+          2. Get the pull request associated with this workflow run from the GitHub repository: ${{ github.repository }}
+             - The workflow run ID is: ${{ github.event.workflow_run.id }}
+             - The workflow run was triggered by: ${{ github.event.workflow_run.event }}
+             - Use GitHub MCP tools to get PR details and workflow run information
+          3. Use the GitHub MCP tools to fetch job logs and failure information:
+             - Use get_workflow_run to get details about the failed workflow
+             - Use list_workflow_jobs to see which jobs failed
+             - Use get_job_logs with failed_only=true to get logs for failed jobs
+             - Use summarize_run_log_failures to get an AI summary of what failed
+          4. Analyze the failures to understand the root cause
+          5. Search the codebase for relevant files, tests, and implementations
+
+          # Your Response
+          Post a comment on the pull request with your analysis. Your comment should include:
+
+          ## Test Failure Analysis
+
+          **Summary**: A brief 1-2 sentence summary of what failed.
+
+          **Root Cause**: A clear explanation of why the tests failed, based on your analysis of the logs and code.
+
+          **Suggested Solution**: Specific, actionable steps to fix the failure(s). Include:
+          - Which files need to be modified
+          - What changes are needed
+          - Why these changes will fix the issue
+
+          <details>
+          <summary>Detailed Analysis</summary>
+
+          Include here:
+          - Relevant log excerpts showing the failure
+          - Code snippets that are causing the issue
+          - Any related issues or PRs that might be relevant
+          </details>
+
+          <details>
+          <summary>Related Files</summary>
+
+          List files that are relevant to the failure with brief explanations of their relevance.
+          </details>
+
+          # Important Guidelines
+          - Be concise and actionable - developers want to quickly understand and fix the issue
+          - Focus on facts from the logs and code, not speculation
+          - If you can't determine the root cause, say so clearly
+          - Provide specific file names, line numbers, and code references when possible
+          - You can run make commands (e.g., `make lint`, `make typecheck`, `make sync`) to build, test, or lint the code
+          - You can also run git commands (e.g., `git status`, `git log`, `git diff`) to inspect the repository
+          - You can use WebSearch and WebFetch to research errors, stack traces, or related issues
+          - Bash access is limited to make and git commands
+
+          # CRITICAL: Loop Detection
+          **IMPORTANT**: Before posting your analysis, check the PR comments to detect if there's a loop where:
+          - CodeRabbit or another bot triggered this workflow
+          - Your previous analysis triggered CodeRabbit or another bot
+          - This created a repeating cycle of bot comments
+
+          If you detect such a loop (e.g., you see multiple similar bot comments or your own previous analysis comments):
+          1. **DO NOT** post another analysis comment
+          2. Instead, post a brief comment stating: "Loop detected: Multiple automated analysis comments found. Stopping to prevent further automated comments. Please review the existing analysis comments."
+          3. Exit immediately without further action
+
+          # Problems Encountered
+          If you encounter any problems during your analysis (e.g., unable to fetch logs, tools not working), document them clearly so the team knows what limitations you faced.
+          PROMPT_END
+          EOF
+
+      - name: Set up MCP server config
+        run: |
+          mkdir -p /tmp/mcp-config
+          cat > /tmp/mcp-config/mcp-servers.json << 'EOF'
+          {
+            "mcpServers": {
+              "repository-summary": {
+                "type": "http",
+                "url": "https://agents-md-generator.fastmcp.app/mcp"
+              },
+              "code-search": {
+                "type": "http",
+                "url": "https://public-code-search.fastmcp.app/mcp"
+              },
+              "github-research": {
+                "type": "stdio",
+                "command": "uvx",
+                "args": [
+                  "github-research-mcp"
+                ],
+                "env": {
+                  "DISABLE_SUMMARIES": "true",
+                  "GITHUB_PERSONAL_ACCESS_TOKEN": "${{ secrets.GITHUB_TOKEN }}"
+                }
+              }
+            }
+          }
+          EOF
+
+      - name: Run Claude Code
+        id: claude
+        uses: anthropics/claude-code-action@v1
+        with:
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
+
+          additional_permissions: |
+            actions: read
+
+          prompt: ${{ steps.analysis-prompt.outputs.PROMPT }}
+          track_progress: true
+          claude_args: |
+            --allowed-tools mcp__repository-summary,mcp__code-search,mcp__github-research,WebSearch,WebFetch,Bash(make:*),Bash(git:*)
+            --mcp-config /tmp/mcp-config/mcp-servers.json
diff --git a/AGENTS.md b/AGENTS.md
@@ -224,6 +224,8 @@ GitHub Actions workflows are in `.github/workflows/`:
 - `publish.yml` - Publish packages to PyPI
 - `claude-on-mention.yml` - Claude Code assistant (can make PRs)
 - `claude-on-open-label.yml` - Claude triage assistant (read-only analysis)
+- `claude-on-test-failure.yml` - Claude test failure analysis (automatically
+  analyzes failed tests and suggests solutions)
 
 ## Version Management