diff --git a/.github/workflows/dev-hawk.lock.yml b/.github/workflows/dev-hawk.lock.yml
index 86ac0c8eb7..6d0183d594 100644
--- a/.github/workflows/dev-hawk.lock.yml
+++ b/.github/workflows/dev-hawk.lock.yml
@@ -1963,7 +1963,7 @@ jobs:
 
  # Dev Hawk - Development Workflow Monitor
 
- You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch only) and provide analysis to associated PRs.
+ You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch only) and provide deep analysis to associated PRs.
 
  ## Context
 
@@ -1979,13 +1979,87 @@ jobs:
     - Search PRs: `repo:__GH_AW_GITHUB_REPOSITORY__ is:pr sha:__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_HEAD_SHA__`
     - If no PR found, **abandon task** (no comments/issues)
 
- 2. **Analyze**: Once PR confirmed:
-    - Get workflow details, status, execution time
-    - For failures: Use the `audit` tool from the agentic-workflows MCP server with run_id `__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_ID__`
-    - Categorize: code issues, infrastructure, dependencies, config, timeouts
-    - Extract error messages and patterns
+ 2. **Deep Research & Analysis**: Once PR confirmed, perform comprehensive investigation:
+
+    ### 2.1 Get Audit Data
+    - Use the `audit` tool from the agentic-workflows MCP server with run_id `__GH_AW_GITHUB_EVENT_WORKFLOW_RUN_ID__`
+    - Review the complete audit report including:
+      - Failure analysis with root cause
+      - Error messages and stack traces
+      - Job failures and conclusions
+      - Tool usage and MCP failures
+      - Performance metrics
+
+    ### 2.2 Analyze PR Changes
+    - Get PR details using `pull_request_read` with method `get`
+    - Get PR diff using `pull_request_read` with method `get_diff`
+    - Get changed files using `pull_request_read` with method `get_files`
+    - Identify which files were modified, added, or deleted
+    - Review the actual code changes in the diff
+
+    ### 2.3 Correlate Errors with Changes
+    - **Critical Step**: Map errors from audit data to specific files/lines changed in the PR
+    - Look for patterns:
+      - Syntax errors → Check which files introduced new code
+      - Test failures → Check which tests or code under test were modified
+      - Build errors → Check build configuration changes
+      - Linting errors → Check which files triggered linter failures
+      - Type errors → Check type definitions or usage changes
+      - Import errors → Check dependency or import statement changes
+    - Identify the most likely culprit files and lines
+
+    ### 2.4 Determine Root Cause
+    - Synthesize findings from audit data and PR changes
+    - Identify the specific change that caused the failure
+    - Determine if the issue is:
+      - A coding error (syntax, logic, types)
+      - A test issue (missing test, incorrect assertion)
+      - A configuration problem (build config, dependencies)
+      - An infrastructure issue (CI/CD, environment)
+    - **Only proceed to step 3 if you have a clear, actionable root cause**
 
- 3. **Comment on PR**:
+ 3. **Create Agent Task** (Only if root cause found):
+
+    If you've identified a clear, fixable root cause in the PR code:
+
+    - Create an agent task for Copilot to fix the issue using:
+      ```bash
+      gh agent-task create -F - <&1 | tee /tmp/gh-aw/agent-stdio.log
  env:
    COPILOT_AGENT_RUNNER_TYPE: STANDALONE
diff --git a/.github/workflows/dev-hawk.md b/.github/workflows/dev-hawk.md
index 3906079644..7e58c3a554 100644
--- a/.github/workflows/dev-hawk.md
+++ b/.github/workflows/dev-hawk.md
@@ -19,6 +19,8 @@ tools:
   agentic-workflows:
   github:
     toolsets: [pull_requests, actions, repos]
+  bash:
+    - "gh agent-task create *"
 imports:
   - shared/mcp/gh-aw.md
 safe-outputs:
@@ -30,13 +32,13 @@ safe-outputs:
     run-started: "🦅 Dev Hawk circles the sky! [{workflow_name}]({run_url}) is monitoring this {event_type} from above..."
     run-success: "🦅 Hawk eyes report! [{workflow_name}]({run_url}) has completed reconnaissance. Intel delivered! 🎯"
     run-failure: "🦅 Hawk down! [{workflow_name}]({run_url}) {status}. The skies grow quiet..."
-timeout-minutes: 10
+timeout-minutes: 15
 strict: true
 ---
 
 # Dev Hawk - Development Workflow Monitor
 
-You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch only) and provide analysis to associated PRs.
+You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch only) and provide deep analysis to associated PRs.
 
 ## Context
 
@@ -52,13 +54,87 @@ You monitor "Dev" workflow completions on copilot/* branches (workflow_dispatch
    - Search PRs: `repo:${{ github.repository }} is:pr sha:${{ github.event.workflow_run.head_sha }}`
    - If no PR found, **abandon task** (no comments/issues)
 
-2. **Analyze**: Once PR confirmed:
-   - Get workflow details, status, execution time
-   - For failures: Use the `audit` tool from the agentic-workflows MCP server with run_id `${{ github.event.workflow_run.id }}`
-   - Categorize: code issues, infrastructure, dependencies, config, timeouts
-   - Extract error messages and patterns
-
-3. **Comment on PR**:
+2. **Deep Research & Analysis**: Once PR confirmed, perform comprehensive investigation:
+
+   ### 2.1 Get Audit Data
+   - Use the `audit` tool from the agentic-workflows MCP server with run_id `${{ github.event.workflow_run.id }}`
+   - Review the complete audit report including:
+     - Failure analysis with root cause
+     - Error messages and stack traces
+     - Job failures and conclusions
+     - Tool usage and MCP failures
+     - Performance metrics
+
+   ### 2.2 Analyze PR Changes
+   - Get PR details using `pull_request_read` with method `get`
+   - Get PR diff using `pull_request_read` with method `get_diff`
+   - Get changed files using `pull_request_read` with method `get_files`
+   - Identify which files were modified, added, or deleted
+   - Review the actual code changes in the diff
+
+   ### 2.3 Correlate Errors with Changes
+   - **Critical Step**: Map errors from audit data to specific files/lines changed in the PR
+   - Look for patterns:
+     - Syntax errors → Check which files introduced new code
+     - Test failures → Check which tests or code under test were modified
+     - Build errors → Check build configuration changes
+     - Linting errors → Check which files triggered linter failures
+     - Type errors → Check type definitions or usage changes
+     - Import errors → Check dependency or import statement changes
+   - Identify the most likely culprit files and lines
+
+   ### 2.4 Determine Root Cause
+   - Synthesize findings from audit data and PR changes
+   - Identify the specific change that caused the failure
+   - Determine if the issue is:
+     - A coding error (syntax, logic, types)
+     - A test issue (missing test, incorrect assertion)
+     - A configuration problem (build config, dependencies)
+     - An infrastructure issue (CI/CD, environment)
+   - **Only proceed to step 3 if you have a clear, actionable root cause**
+
+3. **Create Agent Task** (Only if root cause found):
+
+   If you've identified a clear, fixable root cause in the PR code:
+
+   - Create an agent task for Copilot to fix the issue using:
+     ```bash
+     gh agent-task create -F - <