ci-analysis: structured output, MCP integration, and deep investigation guides by lewing · Pull Request #124240 · dotnet/runtime

lewing · 2026-02-10T19:30:10Z

ci-analysis: structured output, MCP integration, and deep investigation guides

Changes to `Get-CIStatus.ps1` (+216/-132)

Add [CI_ANALYSIS_SUMMARY] JSON block — structured summary emitted at end of script with all key facts (builds, failed jobs, known issues, PR correlation, recommendation hint)
Replace 47-line if/elseif recommendation chain with a single recommendationHint field in JSON (one of: BUILD_SUCCESSFUL, KNOWN_ISSUES_DETECTED, LIKELY_PR_RELATED, POSSIBLY_TRANSIENT, REVIEW_REQUIRED, MERGE_CONFLICTS, NO_BUILDS)
Add failedJobDetails to JSON — per-job errorCategory (test-failure, build-error, test-timeout, crash, tests-passed-reporter-failed, unclassified), errorSnippet, and helixWorkItems
Add failedJobDetailsTruncated — boolean flag indicating when -MaxJobs cap means failedJobDetails is incomplete vs failedJobNames
Add top-level knownIssues from Build Analysis (not per-job — Build Analysis reports at the PR level, not per-job)
Add timeout pattern to Format-TestFailure: catches "Timed Out (timeout" output that was previously invisible
Show log tail in PR mode when no failure pattern matches (Helix Job mode already did this)
Add accumulation variables for cross-build aggregation (totalFailedJobs, totalLocalFailures, lastBuildJobSummary)
Fix early-continue scoping bug — job summary computation was at end of build loop, after 3 continue paths that skipped it
Fix empty array falsy check — if ($listFiles) → proper count check
Fix mergeable_state trimming — gh api --jq output trimmed to prevent whitespace comparison failures
Remove interpretive prose — "These failures are likely PR-related" moved from script to agent reasoning
Fix empty catch — merge state error now logged via Write-Verbose
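A minimal sketch of how a consumer might pull the structured block back out of the script output. It assumes the `[CI_ANALYSIS_SUMMARY]` marker sits on its own line and that everything after it is the JSON payload; the exact layout is not shown in this description. `Get-CISummary`, the sample output, and the `jobName` key are hypothetical.

```powershell
# Hypothetical captured output; the real script's layout may differ.
$scriptOutput = @'
Build 12345: 2 failed jobs
[CI_ANALYSIS_SUMMARY]
{
  "totalFailedJobs": 2,
  "failedJobNames": ["job-a", "job-b"],
  "failedJobDetailsTruncated": true,
  "failedJobDetails": [
    { "jobName": "job-a", "errorCategory": "test-timeout", "errorSnippet": "Timed Out", "helixWorkItems": ["wi-1"] }
  ],
  "knownIssues": [],
  "recommendationHint": "REVIEW_REQUIRED"
}
'@

function Get-CISummary {
    param([Parameter(Mandatory)][string]$Text)
    $marker = '[CI_ANALYSIS_SUMMARY]'
    $index = $Text.LastIndexOf($marker)
    if ($index -lt 0) { return $null }
    # Assumption: everything after the marker is the JSON payload.
    $json = $Text.Substring($index + $marker.Length)
    return $json | ConvertFrom-Json
}

$summary = Get-CISummary -Text $scriptOutput
if ($summary.failedJobDetailsTruncated) {
    # failedJobDetails covers fewer jobs than failedJobNames (the -MaxJobs cap).
    Write-Host "Details cover $(@($summary.failedJobDetails).Count) of $($summary.totalFailedJobs) failed jobs"
}
```

Reading `failedJobDetailsTruncated` before iterating `failedJobDetails` keeps a consumer from treating a capped list as the full set of failures.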

Changes to `SKILL.md` (+97/-144, net reduction)

Add Step 0: Gather Context — PR type classification table (code, flow, backport, merge, dependency update)
Add Step 3: Verify before claiming — systematic checklist
Add build progression analysis (Step 2, item 4) — comparing pass/fail across PR builds to narrow down which commit introduced a failure
Add prior-build mismatch detection (Step 2, item 6) — ask user when they reference jobs not in current results
Document failedJobDetails — per-failure error categories in Interpreting Results
Add Build Analysis check status enforcement — red check means unaccounted failures exist, never claim "all known" when it's red
Add timeout recovery workflow — explicit guidance for verifying timed-out builds have passing Helix results via hlx_status
Add crash/canceled job recovery procedure — step-by-step using hlx_batch_status, hlx_files, hlx_download_url to recover results from crashed Helix work items
Fix MCP tool references — use canonical short-form tool names consistently
Condense anti-patterns — tighter, more targeted, near relevant steps
Net token reduction — despite adding new content, SKILL.md shrank from ~4.6K to ~3.5K tokens

New reference files

references/azure-cli.md — Azure CLI deep investigation guide
references/binlog-comparison.md — binlog comparison workflow
references/delegation-patterns.md — subagent delegation patterns (5 patterns including parallel artifact extraction and canceled job recovery)
references/build-progression-analysis.md — commit-to-build correlation using triggerInfo.pr.sourceSha, SQL-based progression tracking, MCP-first with AzDO MCP tools as primary
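The progression idea in build-progression-analysis.md can be illustrated without SQL. The sketch below is an in-memory stand-in, not the reference file's approach: it assumes each build record carries the commit from `triggerInfo.pr.sourceSha` and a result string, and the object shape, sample SHAs, and `Get-PassFailTransition` are hypothetical.

```powershell
# Hypothetical build history for one PR, oldest first.
$builds = @(
    [pscustomobject]@{ BuildId = 101; SourceSha = 'aaa111'; Result = 'succeeded' }
    [pscustomobject]@{ BuildId = 102; SourceSha = 'bbb222'; Result = 'succeeded' }
    [pscustomobject]@{ BuildId = 103; SourceSha = 'ccc333'; Result = 'failed' }
    [pscustomobject]@{ BuildId = 104; SourceSha = 'ddd444'; Result = 'failed' }
)

function Get-PassFailTransition {
    param([object[]]$Builds)
    # Compare each build with its predecessor and report pass -> fail edges.
    for ($i = 1; $i -lt $Builds.Count; $i++) {
        $previous = $Builds[$i - 1]
        $current  = $Builds[$i]
        if ($previous.Result -eq 'succeeded' -and $current.Result -eq 'failed') {
            [pscustomobject]@{
                LastGoodSha   = $previous.SourceSha
                FirstBadSha   = $current.SourceSha
                FirstBadBuild = $current.BuildId
            }
        }
    }
}

$transitions = @(Get-PassFailTransition -Builds $builds)
$transitions | Format-Table
```

The commits between `LastGoodSha` and `FirstBadSha` are the candidates for the change that introduced the failure.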

Updated reference files

references/manual-investigation.md — fix nonexistent msbuild-mcp analyze tool refs, use real mcp-binlog-tool-* tools

Design principles

Data/reasoning boundary: Script emits structured JSON facts → agent synthesizes recommendations. No more canned prose from the script.
MCP-first: AzDO MCP tools (get_builds, get_build_log_by_id) and Helix MCP tools (hlx_status, hlx_logs) positioned as primary, CLI/script as fallback.
Token budget: Orchestrating SKILL.md kept within 2K-4K token budget by extracting depth to references/.
SQL for structured investigations: Build progression tracking uses SQL tables to persist SHAs across context, enabling queries for pass→fail transitions and target branch movement.
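To make the data/reasoning boundary concrete, here is a rough sketch of the emitting side: the script collects facts and a single hint, and leaves interpretation to the agent. The hint selection below is a deliberately simplified assumption, not the script's actual rules, and `New-CISummary` and its parameters are hypothetical.

```powershell
function New-CISummary {
    param(
        [string[]]$FailedJobNames = @(),
        [object[]]$KnownIssues = @(),
        [bool]$HasBuilds = $true
    )
    # Simplified hint selection for illustration only; the real script's
    # rules are not shown in the PR description.
    $hint = if (-not $HasBuilds) { 'NO_BUILDS' }
            elseif ($FailedJobNames.Count -eq 0) { 'BUILD_SUCCESSFUL' }
            elseif ($KnownIssues.Count -gt 0) { 'KNOWN_ISSUES_DETECTED' }
            else { 'REVIEW_REQUIRED' }

    # Facts only: counts, names, and a hint. No interpretive prose.
    [ordered]@{
        totalFailedJobs    = $FailedJobNames.Count
        failedJobNames     = $FailedJobNames
        knownIssues        = $KnownIssues
        recommendationHint = $hint
    }
}

$summary = New-CISummary -FailedJobNames @('job-a')
Write-Output '[CI_ANALYSIS_SUMMARY]'
$summary | ConvertTo-Json -Depth 5
```

Because the output is plain JSON, the agent can weigh `recommendationHint` against the other fields instead of repeating a canned sentence.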

Testing

Multi-model subagent testing (Sonnet 4 + GPT-5 + Opus 4.5) — two review rounds with findings addressed
Live MCP integration test confirmed hlx_status, hlx_logs, get_builds, get_build_log_by_id all work
Real-world validation against PRs: [release/8.0] Update dependencies from dotnet/arcade (#123245); [release/10.0] Rework and enable Wasm.Build.Tests.Blazor.AssetCachingTests (#123883); Fix WASM boot config ContentRoot to use IntermediateOutputPath (#124125); [main] Source code updates from dotnet/dotnet (#124232)

Copilot

Pull request overview

This PR updates the ci-analysis Copilot skill to follow the “data vs. reasoning boundary” pattern by having the script emit a structured JSON summary block ([CI_ANALYSIS_SUMMARY]) and moving recommendation synthesis into the agent guidance in SKILL.md.

Changes:

Added Helix ListFiles-based work item file retrieval to avoid broken artifact URIs from the Details endpoint.
Replaced the script’s canned recommendation if/elseif chain with a structured [CI_ANALYSIS_SUMMARY] JSON block (including correlation and hint fields).
Updated SKILL.md to instruct agents to generate recommendations from the JSON summary (decision table + nuance guidance).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
.github/skills/ci-analysis/scripts/Get-CIStatus.ps1	Adds Helix ListFiles workaround and emits `[CI_ANALYSIS_SUMMARY]` JSON instead of canned recommendation prose.
.github/skills/ci-analysis/SKILL.md	Documents the new workflow: parse JSON summary + agent-generated recommendations, with decision logic guidance.

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1

.github/skills/ci-analysis/SKILL.md

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

.github/skills/ci-analysis/SKILL.md

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

.github/skills/ci-analysis/references/delegation-patterns.md

.github/skills/ci-analysis/references/binlog-comparison.md

.github/skills/ci-analysis/SKILL.md

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1

.github/skills/ci-analysis/references/delegation-patterns.md

… reasoning

Apply the Data vs. Reasoning Boundary pattern:
- Script emits [CI_ANALYSIS_SUMMARY] JSON block with structured facts (totalFailedJobs, failedJobNames, knownIssues, prCorrelation, recommendationHint)
- Removed 47-line if/elseif recommendation chain producing canned prose
- Added 'Generating Recommendations' section to SKILL.md with decision table
- Updated 'Presenting Results' to reference JSON summary flow
- Agent now reasons over structured data instead of parroting script output

Tested with Claude Sonnet 4 and GPT-5 against PR dotnet#124232; both rated JSON completeness 4/5 and generated better recommendations than the old heuristic.

…before-claiming

- Add Step 0: Gather Context section with PR type classification table (code, flow, backport, merge, dependency update) that determines interpretation framework
- Add Step 3: Verify before claiming, a systematic checklist before labeling failures as infrastructure/transient/PR-related
- Add structured output format (summary verdict, failure details, recommended actions)
- Replace 'main branch' with 'target branch' throughout: backports and release-branch PRs need comparison against their actual base, not main
- Remove redundant tip (covered by Step 0)

…, fix base/target terminology

When ListFiles returns an empty array (0 files), the empty array is falsy in PowerShell, causing fallback to the Details endpoint's broken URIs. Use an explicit -ne check instead.
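The pitfall described in this commit message is easy to reproduce in isolation. The snippet below is a standalone illustration, not the code from `Get-HelixWorkItemDetails`; the variable names are hypothetical, and the `$null -ne` form is only one plausible shape for the explicit check.

```powershell
$listFiles = @()            # ListFiles succeeded but returned zero files

# Buggy pattern: an empty array converts to $false, so a successful
# call with zero files looks the same as a failed call.
$truthy = [bool]$listFiles                  # False

# Safer: distinguish "no response" ($null) from "response with zero files".
$hasResponse = $null -ne $listFiles         # True
$fileCount   = @($listFiles).Count          # 0

if ($hasResponse) {
    Write-Host "ListFiles answered with $fileCount file(s); no fallback needed"
}
```

Putting `$null` on the left side matters: with an array on the left, `-ne` filters the array instead of returning a single Boolean.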

…coded path, update Three Modes table

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1

… ~3.3K tokens

Move 'Deep Investigation with Azure CLI' section (97 lines) and detailed 'Recovering Results from Canceled Jobs' steps to references/. Content already exists in references/azure-cli.md. Remove duplicate 'Canceled != Failed' callout. SKILL.md is now ~3.3K tokens, within the 2K-4K target for script-driven skills.

When a user asks about a job/error/cancellation that doesn't appear in the current build results, the agent should ask if they're referring to a prior build rather than silently missing context. Added Step 2 item 5 with concrete triggers: empty canceledJobNames when user mentions cancellations, green build when user says CI is failing, missing job names. Offers to re-run with -BuildId for the earlier build.

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1:2140

In the non-Helix (build failure) path, jobDetail.errorCategory is only set to build-error when Extract-BuildErrors finds matches. If extraction returns 0 errors (pattern miss, truncated logs, etc.), the JSON will report unclassified with an empty snippet even though the job is clearly a build failure. Consider setting errorCategory to build-error as soon as you enter the non-Helix branch, and populating errorSnippet with something deterministic (e.g., failed task name and/or a short tail of the task log) when no errors are extracted.

                    # No Helix tasks - this is a build failure, extract actual errors
                    $buildTasks = $timeline.records | Where-Object {
                        $_.parentId -eq $job.id -and $_.result -eq "failed"
                    }

                    foreach ($task in $buildTasks | Select-Object -First 3) {
                        Write-Host "  Failed task: $($task.name)" -ForegroundColor Red

                        # Fetch and parse the build log for actual errors
                        if ($task.log) {
                            $logUrl = "https://dev.azure.com/$Organization/$Project/_build/results?buildId=$currentBuildId&view=logs&j=$($job.id)&t=$($task.id)"
                            Write-Host "  Log: $logUrl" -ForegroundColor Gray
                            $logContent = Get-BuildLog -Build $currentBuildId -LogId $task.log.id

                            if ($logContent) {
                                $buildErrors = Extract-BuildErrors -LogContent $logContent

                                if ($buildErrors.Count -gt 0) {
                                    # Collect for PR correlation
                                    $allFailuresForCorrelation += @{
                                        TaskName = $task.name
                                        JobName = $job.name
                                        Errors = $buildErrors
                                        HelixLogs = @()
                                        FailedTests = @()
                                    }
                                    $jobDetail.errorCategory = "build-error"
                                    if (-not $jobDetail.errorSnippet) {
                                        $snippet = ($buildErrors | Select-Object -First 2) -join "; "
                                        $jobDetail.errorSnippet = $snippet.Substring(0, [Math]::Min(200, $snippet.Length))
                                    }

                                    # Extract Helix log URLs from the full log content
                                    $helixLogUrls = Extract-HelixLogUrls -LogContent $logContent

                                    if ($helixLogUrls.Count -gt 0) {
                                        Write-Host "  Helix failures ($($helixLogUrls.Count)):" -ForegroundColor Red
                                        foreach ($helixLog in $helixLogUrls | Select-Object -First 5) {
                                            Write-Host "    - $($helixLog.WorkItem)" -ForegroundColor White
                                            Write-Host "      Log: $($helixLog.Url)" -ForegroundColor Gray
                                        }
                                        if ($helixLogUrls.Count -gt 5) {
                                            Write-Host "    ... and $($helixLogUrls.Count - 5) more" -ForegroundColor Gray
                                        }
                                    }
                                    else {
                                        Write-Host "  Build errors:" -ForegroundColor Red
                                        foreach ($err in $buildErrors | Select-Object -First 5) {
                                            Write-Host "    $err" -ForegroundColor White
                                        }
                                        if ($buildErrors.Count -gt 5) {
                                            Write-Host "    ... and $($buildErrors.Count - 5) more errors" -ForegroundColor Gray
                                        }
                                    }

                                    # Search for known issues
                                    Show-KnownIssues -ErrorMessage ($buildErrors -join "`n") -IncludeMihuBot:$SearchMihuBot
                                }
                                else {
                                    Write-Host "  (No specific errors extracted from log)" -ForegroundColor Gray
                                }
                            }
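One way the suggestion in the comment above might look, as a standalone sketch rather than a patch to the script: classify the job as `build-error` on entering the non-Helix path, and fall back to a deterministic snippet when no errors are extracted. `New-BuildFailureDetail` and its parameters are hypothetical; the 200-character cap mirrors the excerpt.

```powershell
function New-BuildFailureDetail {
    param(
        [string]$TaskName,
        [string[]]$BuildErrors = @(),
        [string]$LogContent = ''
    )
    # Classify up front: reaching this path already means a build task failed.
    $detail = @{ errorCategory = 'build-error'; errorSnippet = '' }

    if ($BuildErrors.Count -gt 0) {
        $snippet = ($BuildErrors | Select-Object -First 2) -join '; '
    }
    else {
        # Deterministic fallback: failed task name plus the last non-empty log line.
        $tail = @($LogContent -split "`r?`n" | Where-Object { $_.Trim() }) | Select-Object -Last 1
        $snippet = "Failed task: $TaskName"
        if ($tail) { $snippet += " | $tail" }
    }

    # Same 200-character cap as the excerpt above.
    $detail.errorSnippet = $snippet.Substring(0, [Math]::Min(200, $snippet.Length))
    return $detail
}

$detail = New-BuildFailureDetail -TaskName 'Build product' -LogContent "step 1 ok`nlinker exited with code 1`n"
$detail.errorSnippet
```

With this shape, a pattern miss in error extraction still yields a `build-error` entry with a non-empty snippet instead of `unclassified`.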

.github/skills/ci-analysis/SKILL.md

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1

.github/skills/ci-analysis/references/sql-tracking.md

.github/skills/ci-analysis/references/manual-investigation.md

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1

… fix $error example

…nagement

…ool name examples

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1:2150

$allFailedJobDetails += $jobDetail is executed only on the success path. If an exception is thrown while processing a failed job and -ContinueOnError is set, that job will be missing from failedJobDetails even though it’s still in failedJobNames, and the missing entry will also affect failedJobDetailsTruncated. Consider appending a best-effort jobDetail in a finally block (and populating errorSnippet from the exception) so the JSON remains structurally complete.

            $allFailedJobDetails += $jobDetail
            $processedJobs++
        }
        catch {
            $errorCount++

.github/skills/ci-analysis/references/helix-artifacts.md

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1:2011

Extract-HelixUrls is called even when $logContent is $null (it’s outside the if ($logContent) block). Extract-HelixUrls does regex matching on the content and can throw on null input, which will abort processing this job/build. Move the Helix URL extraction inside the $logContent guard (or make Extract-HelixUrls return an empty array when $LogContent is null/empty).

                            # Extract and optionally fetch Helix URLs
                            $helixUrls = Extract-HelixUrls -LogContent $logContent
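The second option in the comment above (returning an empty array for null or empty input) could be sketched as follows. `Get-HelixUrlsSafe` is a hypothetical stand-in for `Extract-HelixUrls`, and the URL pattern is illustrative only, since the real regex is not shown in the review.

```powershell
function Get-HelixUrlsSafe {
    param([AllowNull()][AllowEmptyString()][string]$LogContent)

    # Guard first: a missing log yields no URLs instead of an exception.
    if ([string]::IsNullOrWhiteSpace($LogContent)) { return @() }

    # Illustrative pattern only; the real script's regex is not shown here.
    $pattern = 'https://helix\.dot\.net/api/[^\s"]+'
    return @([regex]::Matches($LogContent, $pattern) |
        ForEach-Object { $_.Value } |
        Select-Object -Unique)
}

$none = @(Get-HelixUrlsSafe -LogContent $null)
$some = @(Get-HelixUrlsSafe -LogContent 'see https://helix.dot.net/api/jobs/abc/workitems/wi-1/console for details')
"{0} URL(s) from null input, {1} from sample log" -f $none.Count, $some.Count
```

Wrapping the call in `@(...)` at the call site keeps `.Count` reliable whether the function emits zero, one, or many URLs.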

davidfowl · 2026-02-12T06:01:47Z

This looks like a banger.

…natives (#124359)

Follow-up to #124240 with three improvements to the ci-analysis skill:

### Changes

**MSBuild cross-platform guidance** (build-progression-analysis.md)
- Added anti-pattern warning about MSBuild property path separator differences (`;` vs `:`) when comparing binlogs across Windows/Linux Helix queues
- This is a common false positive in build progression analysis

**Merge commit shortcut for target SHA extraction** (build-progression-analysis.md)
- Added Step 2b shortcut: `gh api repos/{owner}/{repo}/git/commits/{sourceVersion} --jq '.parents[0].sha'`
- Extracts target branch HEAD from the merge commit's first parent, which is much simpler than parsing checkout logs
- Noted caveat: only works for the latest build (GitHub recomputes merge ref on each push)
- Added `get_commit` MCP tool as alternative when available

**Inline MCP tool alternatives** (all 4 files)
- Added `pull_request_read` as alternative to `gh pr checks` in SKILL.md
- Added `search_issues` MCP note in azdo-helix-reference.md
- Added `get_commit` MCP note in build-progression-analysis.md
- Reframed azure-cli.md with one-sentence MCP-first preamble

All changes are minimal inline additions, with no structural changes to the skill.

Copilot AI review requested due to automatic review settings February 10, 2026 19:30

github-actions bot added the needs-area-label label Feb 10, 2026

dotnet-policy-service bot assigned lewing Feb 10, 2026

lewing added the area-skills label and removed the needs-area-label label Feb 10, 2026

Copilot started reviewing on behalf of lewing February 10, 2026 19:31

Copilot AI reviewed Feb 10, 2026


lewing force-pushed the skill/ci-analysis-json-summary branch from 243fffa to c0fc5fe February 10, 2026 19:38

Copilot AI review requested due to automatic review settings February 10, 2026 19:49

Copilot started reviewing on behalf of lewing February 10, 2026 19:50

Copilot AI reviewed Feb 10, 2026


.github/skills/ci-analysis/SKILL.md Outdated

lewing force-pushed the skill/ci-analysis-json-summary branch from 6cd308e to d10ea41 February 10, 2026 20:07

Copilot AI review requested due to automatic review settings February 10, 2026 20:10

Copilot started reviewing on behalf of lewing February 10, 2026 20:11

Copilot AI reviewed Feb 10, 2026


lewing added 4 commits February 10, 2026 14:21

Address review: add missing reference files, fix BuildId mode wording…

c3cbad9

…, fix base/target terminology

Fix empty array falsy check in Get-HelixWorkItemDetails

9a3194c

When ListFiles returns an empty array (0 files), the empty array is falsy in PowerShell, causing fallback to the Details endpoint's broken URIs. Use an explicit -ne check instead.

lewing force-pushed the skill/ci-analysis-json-summary branch from f264d86 to 9a3194c February 10, 2026 20:22

Copilot AI review requested due to automatic review settings February 10, 2026 20:31

Copilot started reviewing on behalf of lewing February 10, 2026 20:32

Address review: fix target-branch refs in reference docs, remove hard…

5cf7c95

…coded path, update Three Modes table

lewing force-pushed the skill/ci-analysis-json-summary branch from cf462f6 to 5cf7c95 February 10, 2026 20:32

Copilot AI reviewed Feb 10, 2026


.github/skills/ci-analysis/scripts/Get-CIStatus.ps1 Outdated

lewing requested a review from steveisok February 10, 2026 20:37

lewing added 2 commits February 10, 2026 14:40

Copilot AI review requested due to automatic review settings February 10, 2026 20:58

Copilot started reviewing on behalf of lewing February 10, 2026 20:59

Merge branch 'main' into skill/ci-analysis-json-summary

7ba79d2

Copilot AI review requested due to automatic review settings February 12, 2026 03:14

Copilot started reviewing on behalf of lewing February 12, 2026 03:15

Copilot AI reviewed Feb 12, 2026


.github/skills/ci-analysis/SKILL.md Outdated

Fix section reference: 'Recovering Results from Crashed/Canceled Jobs'

20cf6c7

lewing enabled auto-merge (squash) February 12, 2026 03:23

lewing disabled auto-merge February 12, 2026 03:23

lewing enabled auto-merge (squash) February 12, 2026 03:49

Add PR comment tracking pattern for deep analysis and PR chains

2566a31

Copilot AI review requested due to automatic review settings February 12, 2026 04:09

Copilot started reviewing on behalf of lewing February 12, 2026 04:10

Fix plain-text cross-references to use markdown links

f9d9263

Copilot AI reviewed Feb 12, 2026


lewing added 2 commits February 11, 2026 22:50

Add buildId to failedJobDetails, include exit code -4 in crash regex,…

a3e2bdb

… fix $error example

Add downloaded artifact layout guide and SQL tracking for artifact ma…

9533720

…nagement

Copilot AI review requested due to automatic review settings February 12, 2026 05:02

Copilot started reviewing on behalf of lewing February 12, 2026 05:03

Clarify build discovery scope, SQL table purposes, and use concrete t…

ce7083c

…ool name examples

Copilot AI reviewed Feb 12, 2026


.github/skills/ci-analysis/references/helix-artifacts.md Outdated

lewing added 2 commits February 11, 2026 23:07

Soften binlog source guidance: AzDO and Helix boundaries aren't absolute

788a4f8

Clarify hlx_download vs hlx_download_url usage

c6a96ab

Copilot AI review requested due to automatic review settings February 12, 2026 05:09

Copilot started reviewing on behalf of lewing February 12, 2026 05:10

Copilot AI reviewed Feb 12, 2026


lewing merged commit e3ab1f7 into dotnet:main Feb 12, 2026
23 checks passed

lewing deleted the skill/ci-analysis-json-summary branch February 12, 2026 16:02

lewing mentioned this pull request Feb 12, 2026

ci-analysis skill: MSBuild guidance, merge commit shortcut, MCP alternatives #124359

Merged

dotnet-maestro bot mentioned this pull request Feb 13, 2026

[main] Source code updates from dotnet/runtime dotnet/dotnet#4839

Merged

Conversation

lewing commented Feb 10, 2026 • edited