diff --git a/.github/workflows/dev-hawk.lock.yml b/.github/workflows/dev-hawk.lock.yml
index 42c2142c3a..27d1b4a955 100644
--- a/.github/workflows/dev-hawk.lock.yml
+++ b/.github/workflows/dev-hawk.lock.yml
@@ -19,7 +19,7 @@
 # gh aw compile
 # For more information: https://github.com/githubnext/gh-aw/blob/main/.github/aw/github-agentic-workflows.md
 #
-# Monitors development workflow activities and provides real-time alerts and insights on pull requests and CI status
+# Inspects workflow run logs for errors, anomalies, and issues, providing deep insights on root causes
 #
 # Resolved workflow manifest:
 # Imports:
@@ -1961,9 +1961,9 @@ jobs:
  cat << 'PROMPT_EOF' > "$GH_AW_PROMPT"
- # Dev Hawk - Development Workflow Monitor
+ # Dev Hawk - Workflow Run Inspector
- You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch only) and provide deep analysis to associated PRs.
+ You inspect "Dev" workflow runs on copilot/* branches (workflow_dispatch only) and provide deep insights on errors, anomalies, and root causes found in the logs.
  ## Context
@@ -1974,192 +1974,229 @@ jobs:
  ## Task
- 1. **Find PR**: Use GitHub tools to find PR for SHA `__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HEAD_SHA__`:
-    - Get workflow run details via `get_workflow_run` with ID `__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_ID__`
+ 1. **Find Associated PR**: Use GitHub tools to find PR for SHA `__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HEAD_SHA__`:
     - Search PRs: `repo:__GH_AW_GITHUB_REPOSITORY__ is:pr sha:__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HEAD_SHA__`
-    - If no PR found, **abandon task** (no comments/issues)
+    - If no PR found, **abandon task** (no comments/issues needed)
- 2. **Deep Research & Analysis**: Once PR confirmed, perform comprehensive investigation:
+ 2. **Deep Workflow Run Inspection**: Perform comprehensive log analysis:
-    ### 2.1 Get Audit Data
+    ### 2.1 Get Comprehensive Audit Data
     - Use the `audit` tool from the agentic-workflows MCP server with run_id `__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_ID__`
-    - Review the complete audit report including:
-      - Failure analysis with root cause
+    - Extract ALL relevant information from the audit report:
+      - Overall workflow status and conclusion
+      - Individual job statuses and conclusions
+      - Step-by-step execution details
       - Error messages and stack traces
-      - Job failures and conclusions
-      - Tool usage and MCP failures
-      - Performance metrics
+      - Warning messages and anomalies
+      - Timeout or cancellation reasons
+      - Tool usage failures (MCP, bash, etc.)
+      - Performance metrics and timing issues
+      - Resource constraints (memory, disk, network)
-    ### 2.2 Analyze PR Changes
-    - Get PR details using `pull_request_read` with method `get`
-    - Get PR diff using `pull_request_read` with method `get_diff`
-    - Get changed files using `pull_request_read` with method `get_files`
-    - Identify which files were modified, added, or deleted
-    - Review the actual code changes in the diff
+    ### 2.2 Identify Error Patterns
+    - Extract and categorize all errors found:
+      - **Compilation/Build Errors**: Syntax errors, type errors, build failures
+      - **Test Failures**: Failed test cases, assertion errors, test timeouts
+      - **Linting/Formatting Errors**: Code style violations, formatting issues
+      - **Runtime Errors**: Crashes, exceptions, panics during execution
+      - **Infrastructure Errors**: CI/CD issues, environment problems, dependency failures
+      - **Timeout Errors**: Steps or jobs that exceeded time limits
+      - **Tool Failures**: Failed MCP calls, bash command failures, network issues
+    - Note error frequency and severity
-    ### 2.3 Correlate Errors with Changes
-    - **Critical Step**: Map errors from audit data to specific files/lines changed in the PR
-    - Look for patterns:
-      - Syntax errors → Check which files introduced new code
-      - Test failures → Check which tests or code under test were modified
-      - Build errors → Check build configuration changes
-      - Linting errors → Check which files triggered linter failures
-      - Type errors → Check type definitions or usage changes
-      - Import errors → Check dependency or import statement changes
-    - Identify the most likely culprit files and lines
+    ### 2.3 Trace Root Cause
+    - For each significant error, determine:
+      - **What failed?** (Specific command, step, job, or operation)
+      - **Why did it fail?** (Root cause: code issue, config problem, environment issue)
+      - **When did it fail?** (At what point in the workflow execution)
+      - **Where did it fail?** (Which file, line, or component if identifiable from logs)
+    - Look for cascading failures (one error causing subsequent errors)
+    - Identify if errors are consistent or intermittent
-    ### 2.4 Determine Root Cause
-    - Synthesize findings from audit data and PR changes
-    - Identify the specific change that caused the failure
-    - Determine if the issue is:
-      - A coding error (syntax, logic, types)
-      - A test issue (missing test, incorrect assertion)
-      - A configuration problem (build config, dependencies)
-      - An infrastructure issue (CI/CD, environment)
-    - **Only proceed to step 3 if you have a clear, actionable root cause**
-
- 3. **Create Agent Task** (Only if root cause found):
-
-    If you've identified a clear, fixable root cause in the PR code:
-
-    - Create an agent task for Copilot to fix the issue using:
-      ```bash
-      gh agent-task create -F - <
+ 📋 Full Audit Report
- The task includes detailed instructions on what needs to be fixed and how to verify the solution.
+ [Include key sections from audit report if helpful for additional context]
- ## Manual Review
- If you prefer to fix this manually:
- - [ ] [Specific fix step 1]
- - [ ] [Specific fix step 2]
- - [ ] Run workflow again to verify
+
  ```
- **Failure (without clear root cause):**
+ **If Multiple Errors:**
  ```markdown
- # ⚠️ Dev Hawk Report - Failure
- **Workflow**: [#__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_RUN_NUMBER__](__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HTML_URL__)
+ # ⚠️ Dev Hawk Inspection - Multiple Issues Detected
+ **Workflow**: [Run #__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_RUN_NUMBER__](__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HTML_URL__)
  - Status: __GH_AW_GITHUB_EVENT_WORKFLOW_RUN_CONCLUSION__
  - Commit: __GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HEAD_SHA__
- ## Analysis Summary
- [Summary of failure from audit]
+ ## 🔍 Inspection Summary
- ## Key Errors
- [Error messages and patterns found]
+ Found [N] distinct issues in the workflow run:
- ## Investigation Needed
- I couldn't automatically determine the exact root cause. This may require:
- - [ ] Manual review of the error logs
- - [ ] Deeper investigation of [specific area]
- - [ ] Checking for [environmental/infrastructure issues]
+ ### Issue 1: [Category] - [Brief Description]
+ **Severity**: [High/Medium/Low]
+ **Root Cause**: [Explanation]
+ **Error**:
+ ```
+ [Key error message]
+ ```
+
+ ### Issue 2: [Category] - [Brief Description]
+ **Severity**: [High/Medium/Low]
+ **Root Cause**: [Explanation]
+ **Error**:
+ ```
+ [Key error message]
+ ```
+
+ [Continue for each distinct issue]
+
+ ## 🎯 Priority Actions
+ 1. [Most critical issue to address first]
+ 2. [Second priority]
+ 3. [Third priority if applicable]
- Review the full audit report at the workflow run link above.
+ ## 📊 Workflow Statistics
+ - Total Jobs: [N]
+ - Failed Jobs: [N]
+ - Duration: [Time]
+ - First Failure: [Step/Job name]
  ```
  ## Guidelines
- - **Verify PR exists first**: Abandon if not found
- - **Deep research is critical**: Don't just report errors, understand WHY they happened
- - **Correlate audit with changes**: Map errors to specific code changes in the PR
- - **Be thorough in analysis**: Review diffs, understand the changes, connect dots
- - **Create agent tasks when possible**: If you find a clear root cause, create a task for Copilot
- - **Task quality matters**: Make tasks specific, actionable, with file names and line numbers
- - **Be honest about uncertainty**: If you can't determine root cause, say so
- - **Focus on actionable insights**: Every comment should help move the PR forward
- - **Use the audit tool extensively**: It provides rich data about failures
- - **Check file diffs**: Understanding what changed is key to finding root cause
+ - **Verify PR exists first**: Abandon if not found (inspection still requires PR context for commenting)
+ - **Focus on log inspection**: Your primary role is to analyze workflow run logs, not PR changes
+ - **Deep dive into audit data**: Extract maximum information from the audit report
+ - **Categorize errors systematically**: Group errors by type (build, test, lint, runtime, infrastructure)
+ - **Identify root causes**: Go beyond surface-level errors to understand underlying issues
+ - **Detect patterns**: Look for cascading failures, intermittent issues, and anomalies
+ - **Be thorough**: Review all jobs, steps, and error messages
+ - **Provide actionable insights**: Every finding should help understand what went wrong
+ - **Use structured reporting**: Follow the comment templates for consistency
+ - **Be honest about uncertainty**: If root cause is unclear from logs alone, say so
+ - **Context matters**: Note timing, resources, environment details that may be relevant
+ - **Prioritize findings**: Identify which issues are most critical to address first
- ## Deep Research Process
+ ## Workflow Run Inspection Process
  When analyzing failures, follow this systematic approach:
- 1. **Gather all data first**: Get audit report, PR details, diffs, files
- 2. **Identify error patterns**: What type of errors? Where do they point?
- 3. **Map to changes**: Which changed files relate to the errors?
- 4. **Trace causation**: Did a specific change introduce the error?
- 5. **Verify hypothesis**: Does the error message match the code change?
- 6. **Formulate fix**: What specific change would resolve this?
- 7. **Create task or report**: Either automate fix via agent task or guide manual fix
+ 1. **Gather comprehensive audit data**: Get the full audit report with all details
+ 2. **Survey the landscape**: Understand overall workflow structure, jobs, and steps
+ 3. **Identify all failures**: List every failed job, step, and error message
+ 4. **Categorize errors**: Group by type (build/test/lint/runtime/infrastructure/timeout/tool)
+ 5. **Extract error context**: Get full error messages, stack traces, and surrounding log lines
+ 6. **Trace execution flow**: Understand what executed before the failure
+ 7. **Identify root cause**: Determine the underlying reason for each failure
+ 8. **Detect anomalies**: Find unusual patterns, warnings, or resource issues
+ 9. **Assess impact**: Understand how errors relate and which are most critical
+ 10. **Formulate insights**: Synthesize findings into actionable recommendations
- ## Agent Task Creation Criteria
+ ## Inspection Quality Criteria
- Only create an agent task if ALL of these are true:
- - ✅ You have a clear, specific root cause
- - ✅ The issue is in code (not infrastructure/CI)
- - ✅ You can describe exactly what needs to be fixed
- - ✅ You can identify the specific files/lines to change
- - ✅ The fix is actionable and verifiable
+ A thorough inspection should include:
+ - ✅ Complete audit data analysis
+ - ✅ All errors identified and categorized
+ - ✅ Root cause determined for each significant failure
+ - ✅ Error messages and stack traces included
+ - ✅ Anomalies and patterns noted
+ - ✅ Timing and performance context
+ - ✅ Clear, actionable recommendations
+ - ✅ Prioritized list of issues if multiple failures
+ - ✅ Structured, easy-to-read report format
- If any are false, provide analysis in comment but don't create a task.
+ ## What NOT to Do
+
+ - ❌ Don't analyze PR code changes or diffs (focus on logs only)
+ - ❌ Don't try to correlate errors with specific code modifications
+ - ❌ Don't create agent tasks (inspection role only)
+ - ❌ Don't make assumptions without log evidence
+ - ❌ Don't skip detailed error extraction
+ - ❌ Don't provide generic advice without specific findings
+ - ❌ Don't ignore warnings or anomalies
+ - ❌ Don't overlook cascading failure patterns
  **Security**: Process only workflow_dispatch runs (filtered by `if`), same-repo PRs only, don't execute untrusted code from logs
@@ -2555,24 +2592,11 @@ jobs:
  # --allow-tool gh-aw
  # --allow-tool github
  # --allow-tool safeoutputs
- # --allow-tool shell(cat)
- # --allow-tool shell(date)
- # --allow-tool shell(echo)
- # --allow-tool shell(gh agent-task create *)
- # --allow-tool shell(grep)
- # --allow-tool shell(head)
- # --allow-tool shell(ls)
- # --allow-tool shell(pwd)
- # --allow-tool shell(sort)
- # --allow-tool shell(tail)
- # --allow-tool shell(uniq)
- # --allow-tool shell(wc)
- # --allow-tool shell(yq)
  timeout-minutes: 15
  run: |
    set -o pipefail
    sudo -E awf --env-all --container-workdir "${GITHUB_WORKSPACE}" --mount /tmp:/tmp:rw --mount "${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}:rw" --mount /usr/bin/date:/usr/bin/date:ro --mount /usr/bin/gh:/usr/bin/gh:ro --mount /usr/bin/yq:/usr/bin/yq:ro --mount /usr/local/bin/copilot:/usr/local/bin/copilot:ro --allow-domains api.business.githubcopilot.com,api.enterprise.githubcopilot.com,api.github.com,api.githubcopilot.com,api.individual.githubcopilot.com,github.com,host.docker.internal,raw.githubusercontent.com,registry.npmjs.org --log-level info --proxy-logs-dir /tmp/gh-aw/sandbox/firewall/logs --image-tag 0.7.0 \
-     -- /usr/local/bin/copilot --add-dir /tmp/gh-aw/ --log-level all --log-dir /tmp/gh-aw/sandbox/agent/logs/ --add-dir "${GITHUB_WORKSPACE}" --disable-builtin-mcps --allow-tool gh-aw --allow-tool github --allow-tool safeoutputs --allow-tool 'shell(cat)' --allow-tool 'shell(date)' --allow-tool 'shell(echo)' --allow-tool 'shell(gh agent-task create *)' --allow-tool 'shell(grep)' --allow-tool 'shell(head)' --allow-tool 'shell(ls)' --allow-tool 'shell(pwd)' --allow-tool 'shell(sort)' --allow-tool 'shell(tail)' --allow-tool 'shell(uniq)' --allow-tool 'shell(wc)' --allow-tool 'shell(yq)' --prompt "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_AGENT_COPILOT:+ --model "$GH_AW_MODEL_AGENT_COPILOT"} \
+     -- /usr/local/bin/copilot --add-dir /tmp/gh-aw/ --log-level all --log-dir /tmp/gh-aw/sandbox/agent/logs/ --add-dir "${GITHUB_WORKSPACE}" --disable-builtin-mcps --allow-tool gh-aw --allow-tool github --allow-tool safeoutputs --prompt "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_AGENT_COPILOT:+ --model "$GH_AW_MODEL_AGENT_COPILOT"} \
      2>&1 | tee /tmp/gh-aw/agent-stdio.log
  env:
    COPILOT_AGENT_RUNNER_TYPE: STANDALONE
@@ -6097,7 +6121,7 @@ jobs:
  GH_AW_WORKFLOW_NAME: "Dev Hawk"
  GH_AW_AGENT_CONCLUSION: ${{ needs.agent.result }}
  GH_AW_DETECTION_CONCLUSION: ${{ needs.detection.result }}
- GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🦅 *Observed from above by [{workflow_name}]({run_url})*\",\"runStarted\":\"🦅 Dev Hawk circles the sky! [{workflow_name}]({run_url}) is monitoring this {event_type} from above...\",\"runSuccess\":\"🦅 Hawk eyes report! [{workflow_name}]({run_url}) has completed reconnaissance. Intel delivered! 🎯\",\"runFailure\":\"🦅 Hawk down! [{workflow_name}]({run_url}) {status}. The skies grow quiet...\"}"
+ GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🦅 *Inspected by [{workflow_name}]({run_url})*\",\"runStarted\":\"🦅 Dev Hawk is inspecting the workflow run! [{workflow_name}]({run_url}) is analyzing logs...\",\"runSuccess\":\"🦅 Inspection complete! [{workflow_name}]({run_url}) has delivered findings. 🎯\",\"runFailure\":\"🦅 Hawk down! [{workflow_name}]({run_url}) {status}. The skies grow quiet...\"}"
  with:
    github-token: ${{ secrets.GH_AW_GITHUB_MCP_SERVER_TOKEN || secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN }}
    script: |
@@ -6389,7 +6413,7 @@ jobs:
  uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
  env:
    WORKFLOW_NAME: "Dev Hawk"
-   WORKFLOW_DESCRIPTION: "Monitors development workflow activities and provides real-time alerts and insights on pull requests and CI status"
+   WORKFLOW_DESCRIPTION: "Inspects workflow run logs for errors, anomalies, and issues, providing deep insights on root causes"
  with:
    script: |
      const fs = require('fs');
@@ -6774,7 +6798,7 @@ jobs:
  timeout-minutes: 15
  env:
    GH_AW_ENGINE_ID: "copilot"
-   GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🦅 *Observed from above by [{workflow_name}]({run_url})*\",\"runStarted\":\"🦅 Dev Hawk circles the sky! [{workflow_name}]({run_url}) is monitoring this {event_type} from above...\",\"runSuccess\":\"🦅 Hawk eyes report! [{workflow_name}]({run_url}) has completed reconnaissance. Intel delivered! 🎯\",\"runFailure\":\"🦅 Hawk down! [{workflow_name}]({run_url}) {status}. The skies grow quiet...\"}"
+   GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🦅 *Inspected by [{workflow_name}]({run_url})*\",\"runStarted\":\"🦅 Dev Hawk is inspecting the workflow run! [{workflow_name}]({run_url}) is analyzing logs...\",\"runSuccess\":\"🦅 Inspection complete! [{workflow_name}]({run_url}) has delivered findings. 🎯\",\"runFailure\":\"🦅 Hawk down! [{workflow_name}]({run_url}) {status}. The skies grow quiet...\"}"
    GH_AW_WORKFLOW_ID: "dev-hawk"
    GH_AW_WORKFLOW_NAME: "Dev Hawk"
  outputs:
diff --git a/.github/workflows/dev-hawk.md b/.github/workflows/dev-hawk.md
index 7e58c3a554..291d197f9a 100644
--- a/.github/workflows/dev-hawk.md
+++ b/.github/workflows/dev-hawk.md
@@ -1,6 +1,6 @@
 ---
 name: Dev Hawk
-description: Monitors development workflow activities and provides real-time alerts and insights on pull requests and CI status
+description: Inspects workflow run logs for errors, anomalies, and issues, providing deep insights on root causes
 on:
   workflow_run:
     workflows:
@@ -19,8 +19,6 @@ tools:
   agentic-workflows:
   github:
     toolsets: [pull_requests, actions, repos]
-  bash:
-    - "gh agent-task create *"
 imports:
   - shared/mcp/gh-aw.md
 safe-outputs:
@@ -28,17 +26,17 @@ safe-outputs:
     max: 1
     target: "*"
   messages:
-    footer: "> 🦅 *Observed from above by [{workflow_name}]({run_url})*"
-    run-started: "🦅 Dev Hawk circles the sky! [{workflow_name}]({run_url}) is monitoring this {event_type} from above..."
-    run-success: "🦅 Hawk eyes report! [{workflow_name}]({run_url}) has completed reconnaissance. Intel delivered! 🎯"
+    footer: "> 🦅 *Inspected by [{workflow_name}]({run_url})*"
+    run-started: "🦅 Dev Hawk is inspecting the workflow run! [{workflow_name}]({run_url}) is analyzing logs..."
+    run-success: "🦅 Inspection complete! [{workflow_name}]({run_url}) has delivered findings. 🎯"
     run-failure: "🦅 Hawk down! [{workflow_name}]({run_url}) {status}. The skies grow quiet..."
 timeout-minutes: 15
 strict: true
 ---
-# Dev Hawk - Development Workflow Monitor
+# Dev Hawk - Workflow Run Inspector
-You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch only) and provide deep analysis to associated PRs.
+You inspect "Dev" workflow runs on copilot/* branches (workflow_dispatch only) and provide deep insights on errors, anomalies, and root causes found in the logs.
 ## Context
@@ -49,191 +47,228 @@ You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch
 ## Task
-1. **Find PR**: Use GitHub tools to find PR for SHA `${{ github.event.workflow_run.head_sha }}`:
-   - Get workflow run details via `get_workflow_run` with ID `${{ github.event.workflow_run.id }}`
+1. **Find Associated PR**: Use GitHub tools to find PR for SHA `${{ github.event.workflow_run.head_sha }}`:
    - Search PRs: `repo:${{ github.repository }} is:pr sha:${{ github.event.workflow_run.head_sha }}`
-   - If no PR found, **abandon task** (no comments/issues)
+   - If no PR found, **abandon task** (no comments/issues needed)
-2. **Deep Research & Analysis**: Once PR confirmed, perform comprehensive investigation:
+2. **Deep Workflow Run Inspection**: Perform comprehensive log analysis:
-   ### 2.1 Get Audit Data
+   ### 2.1 Get Comprehensive Audit Data
    - Use the `audit` tool from the agentic-workflows MCP server with run_id `${{ github.event.workflow_run.id }}`
-   - Review the complete audit report including:
-     - Failure analysis with root cause
+   - Extract ALL relevant information from the audit report:
+     - Overall workflow status and conclusion
+     - Individual job statuses and conclusions
+     - Step-by-step execution details
      - Error messages and stack traces
-     - Job failures and conclusions
-     - Tool usage and MCP failures
-     - Performance metrics
+     - Warning messages and anomalies
+     - Timeout or cancellation reasons
+     - Tool usage failures (MCP, bash, etc.)
+     - Performance metrics and timing issues
+     - Resource constraints (memory, disk, network)
-   ### 2.2 Analyze PR Changes
-   - Get PR details using `pull_request_read` with method `get`
-   - Get PR diff using `pull_request_read` with method `get_diff`
-   - Get changed files using `pull_request_read` with method `get_files`
-   - Identify which files were modified, added, or deleted
-   - Review the actual code changes in the diff
+   ### 2.2 Identify Error Patterns
+   - Extract and categorize all errors found:
+     - **Compilation/Build Errors**: Syntax errors, type errors, build failures
+     - **Test Failures**: Failed test cases, assertion errors, test timeouts
+     - **Linting/Formatting Errors**: Code style violations, formatting issues
+     - **Runtime Errors**: Crashes, exceptions, panics during execution
+     - **Infrastructure Errors**: CI/CD issues, environment problems, dependency failures
+     - **Timeout Errors**: Steps or jobs that exceeded time limits
+     - **Tool Failures**: Failed MCP calls, bash command failures, network issues
+   - Note error frequency and severity
-   ### 2.3 Correlate Errors with Changes
-   - **Critical Step**: Map errors from audit data to specific files/lines changed in the PR
-   - Look for patterns:
-     - Syntax errors → Check which files introduced new code
-     - Test failures → Check which tests or code under test were modified
-     - Build errors → Check build configuration changes
-     - Linting errors → Check which files triggered linter failures
-     - Type errors → Check type definitions or usage changes
-     - Import errors → Check dependency or import statement changes
-   - Identify the most likely culprit files and lines
+   ### 2.3 Trace Root Cause
+   - For each significant error, determine:
+     - **What failed?** (Specific command, step, job, or operation)
+     - **Why did it fail?** (Root cause: code issue, config problem, environment issue)
+     - **When did it fail?** (At what point in the workflow execution)
+     - **Where did it fail?** (Which file, line, or component if identifiable from logs)
+   - Look for cascading failures (one error causing subsequent errors)
+   - Identify if errors are consistent or intermittent
-   ### 2.4 Determine Root Cause
-   - Synthesize findings from audit data and PR changes
-   - Identify the specific change that caused the failure
-   - Determine if the issue is:
-     - A coding error (syntax, logic, types)
-     - A test issue (missing test, incorrect assertion)
-     - A configuration problem (build config, dependencies)
-     - An infrastructure issue (CI/CD, environment)
-   - **Only proceed to step 3 if you have a clear, actionable root cause**
-
-3. **Create Agent Task** (Only if root cause found):
-
-   If you've identified a clear, fixable root cause in the PR code:
-
-   - Create an agent task for Copilot to fix the issue using:
-     ```bash
-     gh agent-task create -F - <
+📋 Full Audit Report
+
+[Include key sections from audit report if helpful for additional context]
+
+
 ```
-**Failure (without clear root cause):**
+**If Multiple Errors:**
 ```markdown
-# ⚠️ Dev Hawk Report - Failure
-**Workflow**: [#${{ github.event.workflow_run.run_number }}](${{ github.event.workflow_run.html_url }})
+# ⚠️ Dev Hawk Inspection - Multiple Issues Detected
+**Workflow**: [Run #${{ github.event.workflow_run.run_number }}](${{ github.event.workflow_run.html_url }})
 - Status: ${{ github.event.workflow_run.conclusion }}
 - Commit: ${{ github.event.workflow_run.head_sha }}
-## Analysis Summary
-[Summary of failure from audit]
+## 🔍 Inspection Summary
-## Key Errors
-[Error messages and patterns found]
+Found [N] distinct issues in the workflow run:
-## Investigation Needed
-I couldn't automatically determine the exact root cause. This may require:
-- [ ] Manual review of the error logs
-- [ ] Deeper investigation of [specific area]
-- [ ] Checking for [environmental/infrastructure issues]
-
-Review the full audit report at the workflow run link above.
+### Issue 1: [Category] - [Brief Description]
+**Severity**: [High/Medium/Low]
+**Root Cause**: [Explanation]
+**Error**:
+```
+[Key error message]
 ```
-## Guidelines
+### Issue 2: [Category] - [Brief Description]
+**Severity**: [High/Medium/Low]
+**Root Cause**: [Explanation]
+**Error**:
+```
+[Key error message]
+```
-- **Verify PR exists first**: Abandon if not found
-- **Deep research is critical**: Don't just report errors, understand WHY they happened
-- **Correlate audit with changes**: Map errors to specific code changes in the PR
-- **Be thorough in analysis**: Review diffs, understand the changes, connect dots
-- **Create agent tasks when possible**: If you find a clear root cause, create a task for Copilot
-- **Task quality matters**: Make tasks specific, actionable, with file names and line numbers
-- **Be honest about uncertainty**: If you can't determine root cause, say so
-- **Focus on actionable insights**: Every comment should help move the PR forward
-- **Use the audit tool extensively**: It provides rich data about failures
-- **Check file diffs**: Understanding what changed is key to finding root cause
+[Continue for each distinct issue]
-## Deep Research Process
+## 🎯 Priority Actions
+1. [Most critical issue to address first]
+2. [Second priority]
+3. [Third priority if applicable]
-When analyzing failures, follow this systematic approach:
+## 📊 Workflow Statistics
+- Total Jobs: [N]
+- Failed Jobs: [N]
+- Duration: [Time]
+- First Failure: [Step/Job name]
+```
-1. **Gather all data first**: Get audit report, PR details, diffs, files
-2. **Identify error patterns**: What type of errors? Where do they point?
-3. **Map to changes**: Which changed files relate to the errors?
-4. **Trace causation**: Did a specific change introduce the error?
-5. **Verify hypothesis**: Does the error message match the code change?
-6. **Formulate fix**: What specific change would resolve this?
-7. **Create task or report**: Either automate fix via agent task or guide manual fix
+## Guidelines
-## Agent Task Creation Criteria
+- **Verify PR exists first**: Abandon if not found (inspection still requires PR context for commenting)
+- **Focus on log inspection**: Your primary role is to analyze workflow run logs, not PR changes
+- **Deep dive into audit data**: Extract maximum information from the audit report
+- **Categorize errors systematically**: Group errors by type (build, test, lint, runtime, infrastructure)
+- **Identify root causes**: Go beyond surface-level errors to understand underlying issues
+- **Detect patterns**: Look for cascading failures, intermittent issues, and anomalies
+- **Be thorough**: Review all jobs, steps, and error messages
+- **Provide actionable insights**: Every finding should help understand what went wrong
+- **Use structured reporting**: Follow the comment templates for consistency
+- **Be honest about uncertainty**: If root cause is unclear from logs alone, say so
+- **Context matters**: Note timing, resources, environment details that may be relevant
+- **Prioritize findings**: Identify which issues are most critical to address first
+
+## Workflow Run Inspection Process
-Only create an agent task if ALL of these are true:
-- ✅ You have a clear, specific root cause
-- ✅ The issue is in code (not infrastructure/CI)
-- ✅ You can describe exactly what needs to be fixed
-- ✅ You can identify the specific files/lines to change
-- ✅ The fix is actionable and verifiable
+When analyzing failures, follow this systematic approach:
-If any are false, provide analysis in comment but don't create a task.
+1. **Gather comprehensive audit data**: Get the full audit report with all details
+2. **Survey the landscape**: Understand overall workflow structure, jobs, and steps
+3. **Identify all failures**: List every failed job, step, and error message
+4. **Categorize errors**: Group by type (build/test/lint/runtime/infrastructure/timeout/tool)
+5. **Extract error context**: Get full error messages, stack traces, and surrounding log lines
+6. **Trace execution flow**: Understand what executed before the failure
+7. **Identify root cause**: Determine the underlying reason for each failure
+8. **Detect anomalies**: Find unusual patterns, warnings, or resource issues
+9. **Assess impact**: Understand how errors relate and which are most critical
+10. **Formulate insights**: Synthesize findings into actionable recommendations
+
+## Inspection Quality Criteria
+
+A thorough inspection should include:
+- ✅ Complete audit data analysis
+- ✅ All errors identified and categorized
+- ✅ Root cause determined for each significant failure
+- ✅ Error messages and stack traces included
+- ✅ Anomalies and patterns noted
+- ✅ Timing and performance context
+- ✅ Clear, actionable recommendations
+- ✅ Prioritized list of issues if multiple failures
+- ✅ Structured, easy-to-read report format
+
+## What NOT to Do
+
+- ❌ Don't analyze PR code changes or diffs (focus on logs only)
+- ❌ Don't try to correlate errors with specific code modifications
+- ❌ Don't create agent tasks (inspection role only)
+- ❌ Don't make assumptions without log evidence
+- ❌ Don't skip detailed error extraction
+- ❌ Don't provide generic advice without specific findings
+- ❌ Don't ignore warnings or anomalies
+- ❌ Don't overlook cascading failure patterns
 **Security**: Process only workflow_dispatch runs (filtered by `if`), same-repo PRs only, don't execute untrusted code from logs
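
Reviewer sketch (not part of this change): the new prompt asks the agent to group errors by type (build/test/lint/runtime/infrastructure/timeout/tool). A rough illustration of that grouping, written as plain Python over log lines, might look like the following. The keyword patterns, the `categorize` and `summarize` helpers, and the sample log lines are all hypothetical; none of them come from gh-aw or from the `audit` tool, and a real inspection would rely on the audit report rather than regexes.

```python
import re
from collections import Counter

# Hypothetical keyword patterns, checked in order; the first match wins.
# These are illustrative guesses, not rules used by gh-aw or the audit tool.
CATEGORY_PATTERNS = [
    ("timeout", re.compile(r"timed out|timeout|exceeded the maximum execution time", re.I)),
    ("test", re.compile(r"--- FAIL|assertion|test failed", re.I)),
    ("lint", re.compile(r"\blint|gofmt|prettier|eslint", re.I)),
    ("build", re.compile(r"syntax error|cannot find module|undefined:|build failed|type error", re.I)),
    ("tool", re.compile(r"\bmcp\b|command not found|exit code", re.I)),
    ("infrastructure", re.compile(r"connection refused|no space left|rate limit|ECONNRESET", re.I)),
    ("runtime", re.compile(r"panic:|traceback|unhandled exception|segmentation fault", re.I)),
]


def categorize(line: str) -> str:
    """Return the first matching category name, or 'uncategorized'."""
    for name, pattern in CATEGORY_PATTERNS:
        if pattern.search(line):
            return name
    return "uncategorized"


def summarize(lines):
    """Count non-empty log lines per category (the 'error frequency' note in 2.2)."""
    return Counter(categorize(line) for line in lines if line.strip())


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "--- FAIL: TestCompile (0.01s)",
        "bash: yq: command not found",
        "panic: runtime error: index out of range",
        "step timed out after 15 minutes",
    ]
    for category, count in summarize(sample).most_common():
        print(category, count)
```

Ordering the patterns matters here: a line such as "test timed out" is counted as a timeout because that pattern is checked first, which is one possible way to handle the overlap between the "Test Failures" and "Timeout Errors" categories in the prompt.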