ci-analysis: structured output, MCP integration, and deep investigation guides #124240

Merged
lewing merged 45 commits into dotnet:main from lewing:skill/ci-analysis-json-summary
Feb 12, 2026

@lewing lewing commented Feb 10, 2026

ci-analysis: structured output, MCP integration, and deep investigation guides

Changes to Get-CIStatus.ps1 (+216/-132)

  • Add [CI_ANALYSIS_SUMMARY] JSON block — structured summary emitted at end of script with all key facts (builds, failed jobs, known issues, PR correlation, recommendation hint)
  • Replace 47-line if/elseif recommendation chain with a single recommendationHint field in JSON (one of: BUILD_SUCCESSFUL, KNOWN_ISSUES_DETECTED, LIKELY_PR_RELATED, POSSIBLY_TRANSIENT, REVIEW_REQUIRED, MERGE_CONFLICTS, NO_BUILDS)
  • Add failedJobDetails to JSON — per-job errorCategory (test-failure, build-error, test-timeout, crash, tests-passed-reporter-failed, unclassified), errorSnippet, and helixWorkItems
  • Add failedJobDetailsTruncated — boolean flag indicating when -MaxJobs cap means failedJobDetails is incomplete vs failedJobNames
  • Add top-level knownIssues from Build Analysis (not per-job — Build Analysis reports at the PR level, not per-job)
  • Add timeout pattern to Format-TestFailure: catches Timed Out (a timeout that was previously invisible)
  • Show log tail in PR mode when no failure pattern matches (Helix Job mode already did this)
  • Add accumulation variables for cross-build aggregation (totalFailedJobs, totalLocalFailures, lastBuildJobSummary)
  • Fix early-continue scoping bug — job summary computation was at end of build loop, after 3 continue paths that skipped it
  • Fix empty array falsy check: if ($listFiles) → proper count check
  • Fix mergeable_state trimming: gh api --jq output trimmed to prevent whitespace comparison failures
  • Remove interpretive prose — "These failures are likely PR-related" moved from script to agent reasoning
  • Fix empty catch — merge state error now logged via Write-Verbose
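A consumer of the script output might pull the summary out roughly as follows. This is a minimal Python sketch, not the skill's actual parsing logic: it assumes the marker line is followed directly by one JSON object, and the sample values and the `jobName` key are invented for illustration. Only the other field names come from the list above.

```python
import json

MARKER = "[CI_ANALYSIS_SUMMARY]"


def parse_ci_summary(script_output: str) -> dict:
    """Return the JSON object that follows the last summary marker."""
    marker_pos = script_output.rfind(MARKER)
    if marker_pos == -1:
        raise ValueError("no [CI_ANALYSIS_SUMMARY] block found")
    json_start = script_output.find("{", marker_pos + len(MARKER))
    if json_start == -1:
        raise ValueError("marker found but no JSON object follows it")
    # raw_decode parses one JSON value and ignores any text after it.
    summary, _ = json.JSONDecoder().raw_decode(script_output, json_start)
    return summary


def details_are_complete(summary: dict) -> bool:
    """failedJobDetails covers every failed job only when not truncated."""
    return not summary.get("failedJobDetailsTruncated", False)


# Hypothetical script output; every value here is made up.
sample_output = """\
Failed task: Build product
[CI_ANALYSIS_SUMMARY]
{
  "totalFailedJobs": 2,
  "failedJobNames": ["job-a", "job-b"],
  "failedJobDetails": [
    {"jobName": "job-a", "errorCategory": "build-error",
     "errorSnippet": "error CS0103", "helixWorkItems": []}
  ],
  "failedJobDetailsTruncated": true,
  "knownIssues": [],
  "recommendationHint": "REVIEW_REQUIRED"
}
"""

summary = parse_ci_summary(sample_output)
```

When `details_are_complete(summary)` is false, `failedJobNames` remains the full list, and `failedJobDetails` should be treated as a sample of it.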

Changes to SKILL.md (+97/-144, net reduction)

  • Add Step 0: Gather Context — PR type classification table (code, flow, backport, merge, dependency update)
  • Add Step 3: Verify before claiming — systematic checklist
  • Add build progression analysis (Step 2, item 4) — comparing pass/fail across PR builds to narrow down which commit introduced a failure
  • Add prior-build mismatch detection (Step 2, item 6) — ask user when they reference jobs not in current results
  • Document failedJobDetails — per-failure error categories in Interpreting Results
  • Add Build Analysis check status enforcement — red check means unaccounted failures exist, never claim "all known" when it's red
  • Add timeout recovery workflow — explicit guidance for verifying timed-out builds have passing Helix results via hlx_status
  • Add crash/canceled job recovery procedure — step-by-step using hlx_batch_status, hlx_files, hlx_download_url to recover results from crashed Helix work items
  • Fix MCP tool references — use canonical short-form tool names consistently
  • Condense anti-patterns — tighter, more targeted, near relevant steps
  • Net token reduction — despite adding new content, SKILL.md shrank from ~4.6K to ~3.5K tokens

New reference files

  • references/azure-cli.md — Azure CLI deep investigation guide
  • references/binlog-comparison.md — binlog comparison workflow
  • references/delegation-patterns.md — subagent delegation patterns (5 patterns including parallel artifact extraction and canceled job recovery)
  • references/build-progression-analysis.md — commit-to-build correlation using triggerInfo.pr.sourceSha, SQL-based progression tracking, MCP-first with AzDO MCP tools as primary

Updated reference files

  • references/manual-investigation.md — fix nonexistent msbuild-mcp analyze tool refs, use real mcp-binlog-tool-* tools

Design principles

  • Data/reasoning boundary: Script emits structured JSON facts → agent synthesizes recommendations. No more canned prose from the script.
  • MCP-first: AzDO MCP tools (get_builds, get_build_log_by_id) and Helix MCP tools (hlx_status, hlx_logs) positioned as primary, CLI/script as fallback.
  • Token budget: Orchestrating SKILL.md kept within 2K-4K token budget by extracting depth to references/.
  • SQL for structured investigations: Build progression tracking uses SQL tables to persist SHAs across context, enabling queries for pass→fail transitions and target branch movement.
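The SQL-based progression tracking could look something like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database. The table name, column names, and rows are all hypothetical; the reference file may structure this differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE build_progression (
        seq        INTEGER PRIMARY KEY,  -- order of the builds on the PR
        build_id   INTEGER NOT NULL,
        source_sha TEXT NOT NULL,        -- PR head, e.g. triggerInfo.pr.sourceSha
        target_sha TEXT NOT NULL,        -- target branch HEAD for that build
        result     TEXT NOT NULL         -- 'passed' or 'failed'
    )
    """
)

# Invented rows: the target branch moves between builds 1002 and 1003.
rows = [
    (1, 1001, "aaa111", "t000", "passed"),
    (2, 1002, "bbb222", "t000", "passed"),
    (3, 1003, "ccc333", "t001", "failed"),
    (4, 1004, "ddd444", "t001", "failed"),
]
conn.executemany("INSERT INTO build_progression VALUES (?, ?, ?, ?, ?)", rows)

# Pair each build with its predecessor and keep pass -> fail transitions,
# flagging whether the target branch also moved between the two builds.
transitions = conn.execute(
    """
    SELECT prev.build_id, cur.build_id, cur.source_sha,
           prev.target_sha <> cur.target_sha AS target_moved
    FROM build_progression AS cur
    JOIN build_progression AS prev ON prev.seq = cur.seq - 1
    WHERE prev.result = 'passed' AND cur.result = 'failed'
    """
).fetchall()
```

A row with `target_moved` set suggests the regression may have come from the target branch rather than from the PR's own commits, so the transition alone does not settle the question.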

Testing

Copilot AI review requested due to automatic review settings February 10, 2026 19:30
@github-actions github-actions bot added the needs-area-label label Feb 10, 2026
@lewing lewing added area-skills and removed needs-area-label labels Feb 10, 2026
Copilot AI left a comment

Pull request overview

This PR updates the ci-analysis Copilot skill to follow the “data vs. reasoning boundary” pattern by having the script emit a structured JSON summary block ([CI_ANALYSIS_SUMMARY]) and moving recommendation synthesis into the agent guidance in SKILL.md.

Changes:

  • Added Helix ListFiles-based work item file retrieval to avoid broken artifact URIs from the Details endpoint.
  • Replaced the script’s canned recommendation if/elseif chain with a structured [CI_ANALYSIS_SUMMARY] JSON block (including correlation and hint fields).
  • Updated SKILL.md to instruct agents to generate recommendations from the JSON summary (decision table + nuance guidance).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| .github/skills/ci-analysis/scripts/Get-CIStatus.ps1 | Adds Helix ListFiles workaround and emits [CI_ANALYSIS_SUMMARY] JSON instead of canned recommendation prose. |
| .github/skills/ci-analysis/SKILL.md | Documents the new workflow: parse JSON summary + agent-generated recommendations, with decision logic guidance. |

@lewing lewing force-pushed the skill/ci-analysis-json-summary branch from 243fffa to c0fc5fe Compare February 10, 2026 19:38
Copilot AI review requested due to automatic review settings February 10, 2026 19:49
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

@lewing lewing force-pushed the skill/ci-analysis-json-summary branch from 6cd308e to d10ea41 Compare February 10, 2026 20:07
Copilot AI review requested due to automatic review settings February 10, 2026 20:10
Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

… reasoning

Apply the Data vs. Reasoning Boundary pattern:
- Script emits [CI_ANALYSIS_SUMMARY] JSON block with structured facts
  (totalFailedJobs, failedJobNames, knownIssues, prCorrelation, recommendationHint)
- Removed 47-line if/elseif recommendation chain producing canned prose
- Added 'Generating Recommendations' section to SKILL.md with decision table
- Updated 'Presenting Results' to reference JSON summary flow
- Agent now reasons over structured data instead of parroting script output

Tested with Claude Sonnet 4 and GPT-5 against PR dotnet#124232 — both rated
JSON completeness 4/5 and generated better recommendations than the old
heuristic.

…before-claiming

- Add Step 0: Gather Context section with PR type classification table
  (code, flow, backport, merge, dependency update) that determines
  interpretation framework
- Add Step 3: Verify before claiming - systematic checklist before
  labeling failures as infrastructure/transient/PR-related
- Add structured output format (summary verdict, failure details,
  recommended actions)
- Replace 'main branch' with 'target branch' throughout - backports
  and release-branch PRs need comparison against their actual base,
  not main
- Remove redundant tip (covered by Step 0)

When ListFiles returns an empty array (0 files), the empty array is
falsy in PowerShell, causing fallback to the Details endpoint's broken
URIs. Use a $null -ne check instead.
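Python has the same pitfall, so the fix can be illustrated with an analogue. This is not the script's PowerShell, and the function name is hypothetical. The sketch assumes a failed ListFiles call yields `None`, while a successful call yields a list that may be empty.

```python
def pick_file_source(list_files_result):
    """Choose where work item file URIs should come from.

    list_files_result is None when the ListFiles call failed and a list
    (possibly empty) when it succeeded.
    """
    # Buggy form: `if list_files_result:` treats [] like a failure and
    # falls back to the Details endpoint even though ListFiles worked.
    if list_files_result is not None:
        return "ListFiles"
    return "Details"


empty_result = pick_file_source([])       # ListFiles succeeded, 0 files
failed_result = pick_file_source(None)    # ListFiles unavailable
```

The explicit comparison separates "no files" from "no answer", which a bare truthiness test collapses into one case.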
@lewing lewing force-pushed the skill/ci-analysis-json-summary branch from f264d86 to 9a3194c Compare February 10, 2026 20:22
Copilot AI review requested due to automatic review settings February 10, 2026 20:31
@lewing lewing force-pushed the skill/ci-analysis-json-summary branch from cf462f6 to 5cf7c95 Compare February 10, 2026 20:32
Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

@lewing lewing requested a review from steveisok February 10, 2026 20:37
… ~3.3K tokens

Move 'Deep Investigation with Azure CLI' section (97 lines) and
detailed 'Recovering Results from Canceled Jobs' steps to references/.
Content already exists in references/azure-cli.md. Remove duplicate
'Canceled != Failed' callout. SKILL.md is now ~3.3K tokens, within
the 2K-4K target for script-driven skills.

When a user asks about a job/error/cancellation that doesn't appear in
the current build results, the agent should ask if they're referring to
a prior build rather than silently missing context. Added Step 2 item 5
with concrete triggers: empty canceledJobNames when user mentions
cancellations, green build when user says CI is failing, missing job
names. Offers to re-run with -BuildId for the earlier build.
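The three triggers described above could be checked along these lines. This is a rough Python sketch under stated assumptions: "green build" is approximated as empty `failedJobNames` and `canceledJobNames`, and the function name and parameters are hypothetical. The real guidance lives in SKILL.md as agent instructions, not code.

```python
def prior_build_mismatch_reasons(summary, mentions_cancellation=False,
                                 says_ci_failing=False, mentioned_jobs=()):
    """Return reasons to ask whether the user means an earlier build."""
    reasons = []
    canceled = summary.get("canceledJobNames") or []
    failed = summary.get("failedJobNames") or []

    if mentions_cancellation and not canceled:
        reasons.append("user mentioned cancellations but canceledJobNames is empty")
    if says_ci_failing and not failed and not canceled:
        reasons.append("user says CI is failing but the current build looks green")

    known_jobs = set(failed) | set(canceled)
    missing = [job for job in mentioned_jobs if job not in known_jobs]
    if missing:
        reasons.append("jobs not in current results: " + ", ".join(missing))
    return reasons


# Hypothetical summary for a build with nothing failed or canceled.
current = {"failedJobNames": [], "canceledJobNames": []}
reasons = prior_build_mismatch_reasons(
    current,
    mentions_cancellation=True,
    says_ci_failing=True,
    mentioned_jobs=["job-x"],
)
```

Any non-empty result is a cue to ask the user and offer a re-run with -BuildId, rather than answering from the current build alone.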
Copilot AI review requested due to automatic review settings February 10, 2026 20:58
Copilot AI review requested due to automatic review settings February 12, 2026 03:14
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1:2140

  • In the non-Helix (build failure) path, jobDetail.errorCategory is only set to build-error when Extract-BuildErrors finds matches. If extraction returns 0 errors (pattern miss, truncated logs, etc.), the JSON will report unclassified with an empty snippet even though the job is clearly a build failure. Consider setting errorCategory to build-error as soon as you enter the non-Helix branch, and populating errorSnippet with something deterministic (e.g., failed task name and/or a short tail of the task log) when no errors are extracted.
                    # No Helix tasks - this is a build failure, extract actual errors
                    $buildTasks = $timeline.records | Where-Object {
                        $_.parentId -eq $job.id -and $_.result -eq "failed"
                    }

                    foreach ($task in $buildTasks | Select-Object -First 3) {
                        Write-Host "  Failed task: $($task.name)" -ForegroundColor Red

                        # Fetch and parse the build log for actual errors
                        if ($task.log) {
                            $logUrl = "https://dev.azure.com/$Organization/$Project/_build/results?buildId=$currentBuildId&view=logs&j=$($job.id)&t=$($task.id)"
                            Write-Host "  Log: $logUrl" -ForegroundColor Gray
                            $logContent = Get-BuildLog -Build $currentBuildId -LogId $task.log.id

                            if ($logContent) {
                                $buildErrors = Extract-BuildErrors -LogContent $logContent

                                if ($buildErrors.Count -gt 0) {
                                    # Collect for PR correlation
                                    $allFailuresForCorrelation += @{
                                        TaskName = $task.name
                                        JobName = $job.name
                                        Errors = $buildErrors
                                        HelixLogs = @()
                                        FailedTests = @()
                                    }
                                    $jobDetail.errorCategory = "build-error"
                                    if (-not $jobDetail.errorSnippet) {
                                        $snippet = ($buildErrors | Select-Object -First 2) -join "; "
                                        $jobDetail.errorSnippet = $snippet.Substring(0, [Math]::Min(200, $snippet.Length))
                                    }

                                    # Extract Helix log URLs from the full log content
                                    $helixLogUrls = Extract-HelixLogUrls -LogContent $logContent

                                    if ($helixLogUrls.Count -gt 0) {
                                        Write-Host "  Helix failures ($($helixLogUrls.Count)):" -ForegroundColor Red
                                        foreach ($helixLog in $helixLogUrls | Select-Object -First 5) {
                                            Write-Host "    - $($helixLog.WorkItem)" -ForegroundColor White
                                            Write-Host "      Log: $($helixLog.Url)" -ForegroundColor Gray
                                        }
                                        if ($helixLogUrls.Count -gt 5) {
                                            Write-Host "    ... and $($helixLogUrls.Count - 5) more" -ForegroundColor Gray
                                        }
                                    }
                                    else {
                                        Write-Host "  Build errors:" -ForegroundColor Red
                                        foreach ($err in $buildErrors | Select-Object -First 5) {
                                            Write-Host "    $err" -ForegroundColor White
                                        }
                                        if ($buildErrors.Count -gt 5) {
                                            Write-Host "    ... and $($buildErrors.Count - 5) more errors" -ForegroundColor Gray
                                        }
                                    }

                                    # Search for known issues
                                    Show-KnownIssues -ErrorMessage ($buildErrors -join "`n") -IncludeMihuBot:$SearchMihuBot
                                }
                                else {
                                    Write-Host "  (No specific errors extracted from log)" -ForegroundColor Gray
                                }
                            }
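
The reviewer's suggestion for this branch might be sketched as below, in Python rather than the script's PowerShell. The function and parameter names are hypothetical. The three-line log tail is an arbitrary choice, while the two-error join and 200-character cap mirror the excerpt above.

```python
def classify_build_failure(failed_task_name, build_errors, log_tail, max_len=200):
    """Build a job detail for the non-Helix (build failure) path."""
    # Set the category on entry, so a pattern miss still reports build-error.
    job_detail = {"errorCategory": "build-error", "errorSnippet": ""}

    if build_errors:
        snippet = "; ".join(build_errors[:2])
    else:
        # Deterministic fallback: failed task name plus a short log tail.
        tail_lines = [line.strip() for line in log_tail.splitlines() if line.strip()]
        if tail_lines:
            snippet = failed_task_name + ": " + " | ".join(tail_lines[-3:])
        else:
            snippet = failed_task_name

    job_detail["errorSnippet"] = snippet[:max_len]
    return job_detail


# Invented inputs: extraction found nothing, so the fallback is used.
fallback_detail = classify_build_failure(
    "Build product", [], "restoring packages\n\nMSB4181: task failed\n"
)
```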

@lewing lewing enabled auto-merge (squash) February 12, 2026 03:23
@lewing lewing disabled auto-merge February 12, 2026 03:23
@lewing lewing enabled auto-merge (squash) February 12, 2026 03:49
Copilot AI review requested due to automatic review settings February 12, 2026 04:09
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Copilot AI review requested due to automatic review settings February 12, 2026 05:02
Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1:2150

  • $allFailedJobDetails += $jobDetail is executed only on the success path. If an exception is thrown while processing a failed job and -ContinueOnError is set, that job will be missing from failedJobDetails even though it’s still in failedJobNames, and the missing entry will also affect failedJobDetailsTruncated. Consider appending a best-effort jobDetail in a finally block (and populating errorSnippet from the exception) so the JSON remains structurally complete.
            $allFailedJobDetails += $jobDetail
            $processedJobs++
        }
        catch {
            $errorCount++
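
The best-effort append the reviewer describes could take the following shape. It is a Python sketch of the pattern only: `process_failed_jobs`, `flaky_analyze`, and the `jobName` key are hypothetical stand-ins, and the script itself would express this with PowerShell's try/catch/finally.

```python
def process_failed_jobs(jobs, analyze, continue_on_error=True):
    """Analyze each failed job, always recording a detail entry for it."""
    details = []
    error_count = 0
    for job in jobs:
        job_detail = {"jobName": job, "errorCategory": "unclassified", "errorSnippet": ""}
        try:
            analyze(job, job_detail)
        except Exception as exc:
            error_count += 1
            # Populate the snippet from the exception, as suggested.
            job_detail["errorSnippet"] = f"analysis failed: {exc}"[:200]
            if not continue_on_error:
                raise
        finally:
            # Runs on success and on failure, so the details list stays
            # aligned with the list of failed job names.
            details.append(job_detail)
    return details, error_count


def flaky_analyze(job, job_detail):
    """Stand-in analyzer that fails for one job."""
    if job == "job-b":
        raise RuntimeError("log download timed out")
    job_detail["errorCategory"] = "test-failure"


details, error_count = process_failed_jobs(["job-a", "job-b"], flaky_analyze)
```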

Copilot AI review requested due to automatic review settings February 12, 2026 05:09
Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

.github/skills/ci-analysis/scripts/Get-CIStatus.ps1:2011

  • Extract-HelixUrls is called even when $logContent is $null (it’s outside the if ($logContent) block). Extract-HelixUrls does regex matching on the content and can throw on null input, which will abort processing this job/build. Move the Helix URL extraction inside the $logContent guard (or make Extract-HelixUrls return an empty array when $LogContent is null/empty).
                            # Extract and optionally fetch Helix URLs
                            $helixUrls = Extract-HelixUrls -LogContent $logContent
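
The second option in the comment, having the extractor tolerate missing content, might look like this in Python. The regex is a hypothetical pattern chosen for illustration, since the script's actual expression is not shown in the review. The sample URL is likewise invented.

```python
import re

# Hypothetical pattern; the real script's regex may differ.
HELIX_URL_PATTERN = re.compile(r"https://helix\.dot\.net/api/[^\s\"']+")


def extract_helix_urls(log_content):
    """Return unique Helix URLs found in the log, or [] when there is no log."""
    if not log_content:
        # Covers both None and the empty string without touching the regex.
        return []
    return sorted(set(HELIX_URL_PATTERN.findall(log_content)))


sample_log = "see https://helix.dot.net/api/jobs/abc/workitems/x/console for details"
urls = extract_helix_urls(sample_log)
```

Guarding inside the function means every caller is protected, at the cost of hiding a missing log from callers that might want to report it.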

@davidfowl

This looks like a banger.

@lewing lewing merged commit e3ab1f7 into dotnet:main Feb 12, 2026
23 checks passed
@lewing lewing deleted the skill/ci-analysis-json-summary branch February 12, 2026 16:02
lewing added a commit that referenced this pull request Feb 13, 2026
…natives (#124359)

Follow-up to #124240 with three improvements to the ci-analysis skill:

### Changes

**MSBuild cross-platform guidance** (build-progression-analysis.md)
- Added anti-pattern warning about MSBuild property path separator
differences (`;` vs `:`) when comparing binlogs across Windows/Linux
Helix queues
- This is a common false positive in build progression analysis

**Merge commit shortcut for target SHA extraction**
(build-progression-analysis.md)
- Added Step 2b shortcut: `gh api
repos/{owner}/{repo}/git/commits/{sourceVersion} --jq '.parents[0].sha'`
- Extracts target branch HEAD from the merge commit's first parent —
much simpler than parsing checkout logs
- Noted caveat: only works for the latest build (GitHub recomputes merge
ref on each push)
- Added `get_commit` MCP tool as alternative when available
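
As a rough illustration of what the `--jq '.parents[0].sha'` filter does, the Python sketch below reads the same field from a commit response body. The SHAs are invented, and returning `None` for a commit with fewer than two parents is an assumption of this sketch, not something the reference file specifies.

```python
import json


def target_sha_from_merge_commit(commit_json: str):
    """Return the first parent's SHA of a merge commit, else None."""
    commit = json.loads(commit_json)
    parents = commit.get("parents", [])
    if len(parents) < 2:
        # Not a merge commit, so the first-parent shortcut does not apply.
        return None
    return parents[0]["sha"]


# Trimmed, made-up response body for a merge commit.
merge_commit_body = json.dumps({
    "sha": "feedfacefeedface",
    "parents": [
        {"sha": "1111aaaa"},  # first parent: target branch HEAD
        {"sha": "2222bbbb"},  # second parent: PR head
    ],
})

target_sha = target_sha_from_merge_commit(merge_commit_body)
```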

**Inline MCP tool alternatives** (all 4 files)
- Added `pull_request_read` as alternative to `gh pr checks` in SKILL.md
- Added `search_issues` MCP note in azdo-helix-reference.md
- Added `get_commit` MCP note in build-progression-analysis.md
- Reframed azure-cli.md with one-sentence MCP-first preamble

All changes are minimal inline additions — no structural changes to the
skill.