Daily NLP-based clustering analysis of copilot agent task prompts from the last 30 days (2026-01-21 → 2026-02-22).
Summary
Cluster Overview
| # | Cluster | Top Keywords |
|---|---------|--------------|
| 1 | Workflow & MCP Updates | workflow, update, mcp, add, cli |
| 2 | Issue-driven Agent Tasks | issue, section, copilot, resolve |
| 3 | Safe Outputs Implementation | safe, outputs, safe outputs, handler |
| 4 | Agentic Workflow Debugging | agentic workflows, debug, prompt |
| 5 | Code Quality / Task Mining | code quality, task miner, improvement |
| 6 | CI Failure / Run Fixes | run, failure, failed, ci, patch |
| 7 | Root Cause Bug Fixes | job, fix, identify, failing, root cause |
| 8 | Custom Agent Workflows | custom agent, agent used, github actions |
| 9 | Campaign / Feature Work | campaign, security, project, dispatch |

Cluster Details with Representative Examples
1. Workflow & MCP Updates (741 PRs — 41% of total)
The dominant category. Covers updates to workflow files, MCP server dependency bumps, CLI feature additions, and compile/init command enhancements.
Top Keywords: workflow, update, mcp, add, pr, make, review, agentic, file, cli
Representative prompts:
- Run the update command and ensure that the Sentry MCP is updated. It should be upgraded to version 0.27.0...
- Update the init command behavior: when invoked without arguments, enter an interactive mode and prompt the user to select which agent engine to use: Copilot, Claude, or Codex...
Example PRs: #11050, #11058, #11064
2. Issue-driven Agent Tasks (304 PRs — 17% of total)
Tasks sourced directly from GitHub issues, typically formatted with (issue_title) / (issue_description) XML tags. Covers a broad range of features and bug fixes originating from the issue tracker.
Merge Rate: 63% (190 merged, 114 closed/open)
Avg Files Changed: 16.4
Top Keywords: issue, section, details, copilot, resolve, comments, original issue
Representative prompts:
- [deep-report] Install Go toolchain in Daily CLI Performance Agent workflow — The Daily CLI Performance Agent report workflow fails because Go is not installed...
- Agentic Maintenance improvements — merge close issues and close discussions in same job, add extensive logging in close issues step...
Example PRs: #11059, #11060, #11067
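The tagged prompt shape these tasks use can be sketched as follows. The helper name is hypothetical, and the angle-bracket form of the tags is an assumption (the extraction above renders them with parentheses):

```python
def format_issue_prompt(title: str, body: str) -> str:
    """Hypothetical helper: wrap a GitHub issue in the XML-style tags
    these prompts reportedly use (<issue_title> / <issue_description>)."""
    return (f"<issue_title>{title}</issue_title>\n"
            f"<issue_description>{body}</issue_description>")
```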
3. Safe Outputs Implementation (157 PRs — 9% of total)
Tasks specifically targeting the safe-outputs system: validation, error handling, ANSI stripping, compile-time checks, and JSON schema additions.
Top Keywords: safe, outputs, safe outputs, safe output, output, handler, project, create
Representative prompts:
- When a GitHub Actions expression in target fails to evaluate, safe output handlers fail silently with unclear errors...
- ANSI terminal escape sequences (\x1b[31m, \x1b[0m) were breaking YAML parsing when accidentally introduced through copy-paste from colored terminal output...
Example PRs: #11066, #11068, #11112
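The ANSI problem called out in this cluster is mechanical to guard against. A minimal sketch, assuming the goal is simply to drop SGR color sequences before pasted terminal output is embedded in YAML (the actual gh-aw handler may do more):

```python
import re

# Matches SGR (color/style) escape sequences such as \x1b[31m and \x1b[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

def strip_ansi(text: str) -> str:
    """Remove ANSI color codes so pasted terminal output stays YAML-safe."""
    return ANSI_SGR.sub("", text)
```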
4. Agentic Workflow Debugging (133 PRs — 7% of total)
Tasks focused on debugging and improving the agentic workflow system itself: failure tracking, issue templates, prompt clustering, and agent orchestration fixes.
Merge Rate: 68% (91 merged, 42 closed/open)
Avg Files Changed: 16.4
Top Keywords: agentic workflows, agentic, workflows, debug, upgrade, prompt, create
Representative prompts:
- Update the template used to create the parent issue for all agentic-workflow issues so that it creates a conclusion job...
- If workflows are not in sync and the gh-aw-agent-token secret is available, then the workflow should automatically assign @copilot to the issue...
Example PRs: #11053, #11054, #11090
5. Code Quality / Task Mining (130 PRs — 7% of total)
Tasks generated by the task miner from code quality discussions: refactoring large files, adding test coverage, extracting helper functions, improving documentation, and fixing shell check warnings.
Merge Rate: 52% (67 merged, 63 closed/open) — the lowest of all clusters
Top Keywords: quality, code quality, code, discussion, improvement, task miner, discussion task, miner
Representative prompts:
- [Code Quality] Create test file for compiler_safe_outputs.go — The file has 499 lines with no existing test file...
- [Code Quality] Refactor ParseWorkflowFile to reduce complexity — The function currently has a complexity score of 28...
Example PRs: #11587, #11592, #11593
6. CI Failure / Run Fixes (119 PRs — 7% of total)
Automated tasks triggered by CI failures, often generated by the "CI Failure Doctor" workflow. Prompts include job IDs, run URLs, and ask the agent to identify root causes and implement fixes.
Merge Rate: 69% (82 merged, 37 closed/open)
Avg Files Changed: 14.6
Top Keywords: run, failure, workflow, failed, ci, ci failure, patch, workflow run
Representative prompts:
- 🤖 AI generated by CI Failure Doctor — Fix the failing GitHub Actions workflow lint-go. Analyze the workflow logs, identify the root cause of the failure, and implement a fix. Job ID: 61758345655...
Example PRs: #11069, #11915, #12304
7. Root Cause Bug Fixes (76 PRs — 4% of total)
Targeted bug-fix tasks where the agent is asked to identify the root cause of a specific job or test failure and implement a fix. Narrower than the CI Failure cluster: prompts typically cite a specific failing job ID and require log analysis.
Top Keywords: job, fix, identify, failing, id, root cause, workflow, implement, logs
Representative prompts:
- Fix the failing GitHub Actions workflow js. Analyze the workflow logs, identify the root cause of the failure, and implement a fix. Job ID: 61070763482...
Example PRs: #11096, #11915, #12304
8. Custom Agent Workflows (78 PRs — 4% of total)
Tasks submitted via custom agentic workflows (e.g., ci-cleaner, agentic-workflows). Prompts typically include a **Custom agent used:** suffix identifying the triggering workflow. These tend to be light-touch tasks (docs, small features).
Top Keywords: custom agent, agent used, used, custom, github, docs, agent, documentation
Representative prompts:
- Update the pdf-summarizer agentic workflow: update title to "pdf summarizer", instruct the agent to create a discussion with the result. Custom agent used: ci-cleaner
- Add a codemod to repair MCP network configuration into the top level network configuration...
Example PRs: #11083, #11105, #11110
9. Campaign / Feature Work (63 PRs — 4% of total)
Tasks related to the campaign system: label-based discovery, orchestration, dispatch workers, security features, and structured project management.
Top Keywords: campaign, security, project, issue, fix, docs, run, workflows, code
Representative prompts:
- Don't rely on cache memory for campaign discovery but use labels. Each campaign issue (epic or worker) should get a label "agentic-campaign"...
- For campaigns, make workers first-class "campaign workers" and keep orchestration concerns explicit, rather than relying on fusion as a permanent crutch...
Example PRs: #11070, #11080, #11087
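The per-cluster keyword lists above look like TF-IDF output. As a rough illustration only (the report does not document its actual pipeline), top keywords can be scored per cluster with the standard library, treating each cluster's pooled prompts as one document:

```python
import math
from collections import Counter

def top_keywords(clusters: dict, k: int = 5) -> dict:
    """For {cluster_name: [prompt, ...]}, rank terms by TF-IDF per cluster."""
    # Term frequency: pool every prompt in a cluster into one bag of words.
    tf = {name: Counter(w for p in prompts for w in p.lower().split())
          for name, prompts in clusters.items()}
    n = len(tf)
    # Document frequency at cluster level: how many clusters use each term.
    df = Counter(w for counts in tf.values() for w in counts)
    # Terms shared by all clusters get IDF 0 and sink to the bottom.
    return {name: sorted(counts,
                         key=lambda w: counts[w] * math.log(n / df[w]),
                         reverse=True)[:k]
            for name, counts in tf.items()}
```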
Merge Rate Comparison Table

| Cluster | PRs | Merge Rate | Avg Files Changed |
|---------|-----|------------|-------------------|
| 1. Workflow & MCP Updates | 741 | — | ~27 |
| 2. Issue-driven Agent Tasks | 304 | 63% (190 / 114) | 16.4 |
| 3. Safe Outputs Implementation | 157 | — | ~27 |
| 4. Agentic Workflow Debugging | 133 | 68% (91 / 42) | 16.4 |
| 5. Code Quality / Task Mining | 130 | 52% (67 / 63) | — |
| 6. CI Failure / Run Fixes | 119 | 69% (82 / 37) | 14.6 |
| 7. Root Cause Bug Fixes | 76 | — | — |
| 8. Custom Agent Workflows | 78 | 79% | 3.9 |
| 9. Campaign / Feature Work | 63 | — | — |
Key Findings
Workflow & MCP Updates dominates at 41% of all tasks (741 PRs). This reflects the active development and maintenance cadence of the gh-aw system itself — dependency bumps, MCP server upgrades, and CLI enhancements make up the single largest task category.
Merge rates vary substantially by category (52%–79%). Custom Agent Workflows and Safe Outputs tasks merge most reliably. Code Quality / Task Mining has the lowest merge rate at 52%, suggesting many mined tasks are either too vague, duplicate existing work, or require more investigation than the agent can complete in one pass.
Task volume is growing week-over-week (279 → 346 tasks/week), indicating increasing reliance on the copilot agent for day-to-day engineering work.
Smallest-scope tasks merge most reliably. Custom Agent Workflows average just 3.9 files changed with a 79% merge rate, while the largest-scope tasks (Safe Outputs, Workflow & MCP Updates at ~27 files) still achieve 70–78% — suggesting the agent handles complex tasks well when prompts are precise.
Campaign / Feature Work requires the most iterations (avg 4.9 commits/PR vs ~3 for simpler clusters), consistent with the architectural nature of campaign system changes.
Recommendations
Review Code Quality / Task Mining prompt templates. With a 52% merge rate and 130 PRs, this is the highest-volume low-success cluster. Mined tasks should include clearer acceptance criteria, links to specific failing tests, and explicit scope boundaries. Consider adding a "definition of done" section to task miner output.
Break down Safe Outputs and Workflow & MCP Update tasks. Both categories change ~26–28 files on average per PR. While merge rates are still good (70–78%), splitting large multi-file tasks into atomic sub-tasks would reduce reviewer burden and decrease the risk of partial rework.
Standardize CI Failure prompts with structured context. The CI Failure / Run Fixes cluster (119 PRs, 69% merge rate) benefits from including Job IDs and run URLs. Ensure all automated failure-fix prompts include: workflow name, job ID, run URL, relevant log lines, and expected behavior.
Leverage Custom Agent Workflow patterns. The smallest and most reliably-merged cluster uses focused, single-concern prompts triggered by specialized workflows. Applying this "narrow scope + known agent context" pattern to other categories could improve merge rates across the board.
Monitor Code Quality campaign effectiveness. The task miner generates 130 PRs/month (7%) with 48% failing to merge — this represents engineering time that could be better spent. Consider a quality gate on mined tasks before dispatching to the agent.
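A minimal sketch of such a gate, combining the structured-context requirement from the CI Failure recommendation with the pre-dispatch check suggested for mined tasks. Field names here are illustrative, not an actual gh-aw schema:

```python
# Context every automated failure-fix prompt should carry, per the
# recommendation above; the field names are illustrative only.
REQUIRED_CONTEXT = ("workflow_name", "job_id", "run_url",
                    "log_excerpt", "expected_behavior")

def missing_context(task: dict) -> list:
    """Return the required fields that are absent or empty; dispatch the
    task to the agent only if this list comes back empty."""
    return [f for f in REQUIRED_CONTEXT if not task.get(f)]
```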