[copilot-cli-research] Copilot CLI Deep Research - 2026-02-21 #17575
🤖 ARM64 smoke test agent was here! 🦾 Greetings from the land of aarch64! Your friendly neighborhood ARM64 Copilot agent swooped in, flexed its silicon muscles, built the entire gh-aw project from scratch, and left without touching a single x86 instruction. If this workflow were a gym, we'd be lifting with 64-bit wide registers. 💪 [ARM64 smoke test run 22264827050 — multi-arch and loving it]
Analysis Date: 2026-02-21 | Repository: github/gh-aw | Triggered by: @pelikhan
Scope: 157 total workflows, 75 using Copilot engine (48%) | Run: §22264457167
Executive Summary
This is a comprehensive analysis of GitHub Copilot CLI feature utilization across 75 Copilot-engine workflows in this repository. The analysis reveals 8 significant missed opportunities, the most impactful being the complete non-adoption of safe-inputs (0% usage despite full support), zero model pinning, and only 19% of workflows using the AWF network firewall sandbox.
The repository demonstrates mature patterns in areas like safe-outputs (97%), timeout configuration (99%), and bash tooling (73%). However, there are clear gaps in security hardening (strict mode, rate-limiting, AWF), advanced engine configuration, and newer features like safe-inputs and plugins.
Primary Recommendation: Adopt safe-inputs for workflows that currently use bash: ["*"] or shell commands for API calls; this would improve security, testability, and prompt clarity for ~30 workflows.
Critical Findings
🔴 High Priority Issues
🟡 Medium Priority Opportunities
1️⃣ Current State Analysis
View Copilot CLI Capabilities Inventory
Available Engine Configuration Options
Available CLI Flags (auto-configured by compiler)
- --disable-builtin-mcps: always added
- --allow-all-paths: added when edit: tool enabled
- --allow-all-tools: added when bash: ["*"] wildcard used
- --allow-tool (tool): per-tool permission grants
- --add-dir (path): directory access grants
- --log-level all --log-dir: always added for logging
- --agent (id): when engine.agent is set
Available Sandbox Options
- AWF (sandbox.agent: awf): network firewalling with allowlist
Available Tools
- bash (["*"] grants all)
- edit
- github
- web-fetch
- playwright
- serena
- cache-memory
- repo-memory
- agentic-workflows
- safe-inputs
- plugins
Available Network Domains (in network.allowed)
defaults, github, node, python, go, playwright, node-cdns, fonts, containers, *
View Feature Adoption Statistics
Timeout distribution (among 74 with timeout):
2️⃣ Feature Usage Matrix
3️⃣ Missed Opportunities
🔴 High Priority: safe-inputs (0% adoption)
Opportunity: Replace bash API calls with safe-inputs tools
What: safe-inputs lets you define typed MCP tools inline in the workflow frontmatter using JavaScript, shell, or Python. These tools are mounted as an MCP server at runtime.
Why It Matters:
- No need for bash: ["*"]: no arbitrary command execution
- Works with strict mode (bash wildcard breaks strict behavior)
- safeinputs-* shared imports
Where: ~30 workflows use bash: ["*"] (wildcard), granting unrestricted shell access. Many of these could be replaced with targeted safe-inputs tool definitions.
How to Implement:
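A minimal sketch of what one targeted tool could look like in workflow frontmatter, replacing a single shell-based API call. The tool name `list-open-issues`, its `repo` input, and the field names (`description`, `inputs`, `run`, `env`) are assumptions for illustration, as is the `INPUT_*` variable convention; check them against the safe-inputs reference before use.

```yaml
# Hypothetical example: tool name and field names are assumptions,
# not confirmed by this analysis.
safe-inputs:
  list-open-issues:              # hypothetical tool name
    description: "List open issues for a repository"
    inputs:
      repo:
        type: string
        required: true
        description: "Repository in owner/name form"
    # Shell implementation; assumes inputs are exposed as INPUT_* env vars.
    run: |
      gh issue list --repo "$INPUT_REPO" --state open --json number,title
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

With a tool shaped like this, the prompt can refer to the tool by name and the workflow no longer needs bash: ["*"] for that one call.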
Expected Benefits: Reduced attack surface, typed tool definitions, cleaner prompts, compatible with strict mode.
🔴 High Priority: Model Pinning (0% adoption)
Opportunity: Pin models for cost/quality predictability
What: No Copilot workflow specifies a model. All rely on the GH_AW_MODEL_AGENT_COPILOT repo variable (or the Copilot CLI default).
Why It Matters:
- Simple tasks can run on gpt-5.1-codex-mini to reduce costs
Contrast: 8 non-Copilot workflows already pin models:
- changeset.md, chroma-issue-indexer.md, ci-doctor.md, daily-fact.md, issue-monster.md → gpt-5.1-codex-mini
- poem-bot.md → gpt-5
How to Implement:
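A sketch of a per-workflow pin, assuming the engine block accepts a `model` key alongside the `agent`, `args`, and `env` options mentioned in this report. The key name is an assumption here and should be verified against the engines reference.

```yaml
# Sketch: pin the model in frontmatter instead of relying on the
# GH_AW_MODEL_AGENT_COPILOT repo variable.
# `engine.model` is an assumed key; verify before adopting.
engine:
  id: copilot
  model: gpt-5.1-codex-mini    # simple triage or daily-report style workflow
```

A workflow doing complex analysis would use the same shape with `model: gpt-5`.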
Suggested Tiering:
- gpt-5.1-codex-mini: triage, fact, changeset, daily reports, simple checks
- gpt-5: complex analysis, code generation, multi-step reasoning
🔴 High Priority: AWF Network Firewall (19% adoption)
Opportunity: Add network firewalling to sensitive workflows
What: Only 14/75 Copilot workflows use the AWF sandbox (sandbox: agent: awf), leaving 61 workflows with unrestricted outbound network access.
Why It Matters:
- Workflows with bash: ["*"] and no network sandbox are high risk
Candidates for AWF (workflows accessing sensitive data without firewall):
- daily-secrets-analysis.md: analyzes secrets; no AWF
- bot-detection.md: security-sensitive; no AWF
- security-compliance.md: security workflow; no AWF
- daily-malicious-code-scan.md: scans for malicious code; no AWF
How to Implement:
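A sketch using the values this report recommends for daily-secrets-analysis.md. The allowlist entries are a starting point, not a verified minimum for any particular workflow.

```yaml
# Enable the AWF sandbox and declare the outbound domains explicitly.
sandbox:
  agent: awf
network:
  allowed:
    - defaults
    - github
```

Other domain groups from the inventory (for example node or python) would be added only when a workflow actually needs them.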
🟡 Medium Priority: rate-limit (3% adoption)
Opportunity: Protect user-triggered workflows from abuse
What: Only 2/75 Copilot workflows have rate-limit configured. Many user-triggered workflows (reaction, slash_command, issue_comment) have no rate limiting.
Why It Matters:
- rate-limit prevents abuse by limiting invocations per time window
Affected workflows (triggered by user interaction without rate-limit):
- plan.md, craft.md, grumpy-reviewer.md, pr-nitpick-reviewer.md
How to Implement:
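A sketch using the values suggested elsewhere in this report for the slash_command workflows. The report does not state the unit of `window`, so confirm it in the frontmatter reference before choosing a number.

```yaml
# Limit invocations of a user-triggered workflow.
rate-limit:
  max: 5        # invocations allowed per window
  window: 60    # window length; unit not stated in this analysis
strict: true    # recommended alongside rate-limit for these workflows
```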
🟡 Medium Priority: engine.args & engine.env (0-1% adoption)
Opportunity: Leverage advanced engine configuration
What: engine.args (0% usage) and engine.env (1% usage) allow custom CLI arguments and environment variable injection.
Use cases not being leveraged:
- engine.args: custom CLI flags
- engine.env: custom environment variables
Why It Matters: Many workflows that need custom configuration currently embed it in the prompt body (less clean, harder to maintain) or rely on GitHub Actions variables.
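The two options above can be sketched as follows. The list shape for `args`, the map shape for `env`, the directory path, and the variable name are all assumptions for illustration; only the `--add-dir` flag itself appears in the capabilities inventory.

```yaml
# Hypothetical values: the path and variable name are illustrative only.
engine:
  id: copilot
  args: ["--add-dir", "/tmp/gh-aw/extra"]   # hypothetical directory
  env:
    DEBUG_MODE: "true"                      # hypothetical variable
```

Keeping settings like these in frontmatter keeps them out of the prompt body, which is the maintenance concern raised above.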
🟢 Low Priority: Plugins (0% adoption)
Opportunity: Explore Copilot plugin ecosystem
What: The Copilot engine has supportsPlugins: true and includes plugin installation steps, but no workflow in the repository uses plugins:.
Why It Matters:
How to Implement:
Note: Investigate available plugins in the Copilot ecosystem before adoption.
🟢 Low Priority: web-search in Copilot workflows
Issue: firewall-escape.md uses unsupported web-search tool
firewall-escape.md has web-search: in its tools config, but the Copilot engine has supportsWebSearch: false. This generates a compiler warning but doesn't fail. The workflow should either switch to engine: claude or engine: codex (both support web-search), or use web-fetch: instead.
4️⃣ Specific Workflow Recommendations
View Workflow-Specific Recommendations
- daily-secrets-analysis.md: sandbox: agent: awf + network.allowed: [defaults, github]
- plan.md, craft.md, grumpy-reviewer.md, pr-nitpick-reviewer.md: rate-limit: {max: 5, window: 60} and strict: true
- dev.md: no tools section, no network config, relies on default model → tools.github toolset, strict: true, consider model pin
- firewall-escape.md: web-search tool (unsupported by Copilot) → replace web-search: with web-fetch: or switch engine to claude
High-complexity workflows (archie.md, brave.md, smoke-copilot.md)
5️⃣ Best Practice Guidelines
Based on this research, here are recommended best practices for Copilot CLI workflows:
- Always set strict: true for workflows that accept user-generated content (reactions, slash commands, issue comments) to prevent prompt injection attacks.
- Add rate-limit to all user-triggered workflows to prevent abuse and cost overruns.
- Consider sandbox: agent: awf for workflows that handle sensitive data or don't need broad internet access. Always pair with network.allowed to explicitly declare needed domains.
- Use safe-inputs instead of bash: ["*"] when you need to call external APIs or run scripts. This reduces attack surface and creates typed, documentable tools.
- Pin models explicitly for workflows where cost/quality predictability matters. Use gpt-5.1-codex-mini for simple tasks and standard/premium models for complex reasoning.
- Use tracker-id for workflows that create recurring issues/discussions to enable deduplication and update-in-place behavior.
- Use imports for shared functionality; 44% of workflows already do this effectively with shared components in .github/workflows/shared/.
6️⃣ Action Items
Immediate Actions (quick wins):
- Add strict: true to the 4 slash_command workflows missing it (plan.md, craft.md, grumpy-reviewer.md, pr-nitpick-reviewer.md)
- Update firewall-escape.md to use web-fetch: instead of web-search:
- Add rate-limit to high-traffic user-triggered workflows
Short-term (this month):
- Adopt safe-inputs in 2-3 workflows that currently use bash: ["*"] for API calls
- Add AWF to the security-sensitive candidates (daily-secrets-analysis.md, bot-detection.md, security-compliance.md)
Long-term (this quarter):
- Migrate bash: ["*"] workflows to safe-inputs patterns
View Research Methodology
Research Methodology
Data Collection:
- Scanned .github/workflows/*.md
- Identified engine: copilot workflows via grep -l "engine: copilot"
Code Analysis:
- pkg/workflow/copilot_engine.go for engine capabilities
- pkg/workflow/copilot_engine_execution.go for CLI arg construction
- pkg/workflow/copilot_engine_tools.go for tool permission logic
- pkg/workflow/copilot_mcp.go for MCP server configuration
- pkg/constants/constants.go for constants and defaults
- docs/src/content/docs/reference/engines.md for documented features
Analysis Date: 2026-02-21
Copilot CLI Version: v0.0.414
Research stored in: /tmp/gh-aw/repo-memory/default/copilot-research-2026-02-21.md