Add daily exploratory testing workflow for CLI tools by Copilot · Pull Request #14168 · github/gh-aw

Copilot · 2026-02-06T15:57:09Z

Implements systematic testing of audit, logs, and compile commands via the agentic-workflows MCP server.

Workflow Configuration

Schedule: Daily (fuzzy, scattered to 17:44 UTC)
Timeout: 60 minutes
Engine: Copilot
Safe Outputs: create-issue (max 5, 7d TTL), noop

Testing Coverage

Phase 1-2: Logs Command

Basic download, filtering (workflow/engine/date/count)
Edge cases: non-existent workflows, invalid dates, future/past date handling
Performance benchmarking (target: <10s)

Phase 3: Audit Command

Successful, failed, and safe-output runs
Deep analysis: tool detection, MCP server calls, network activity, performance metrics
Edge cases: invalid run IDs, stale runs, in-progress runs
Performance target: <30s per audit

Phase 4: Compile Command

Bulk and targeted compilation (simple, complex, with imports)
Lock file validation and incremental detection
Error handling: malformed YAML, invalid tools, missing imports
Performance target: <5s per workflow
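
The per-command performance targets listed above (logs <10s, audit <30s, compile <5s) can be checked with a small helper. This is a minimal sketch, not code from the PR: the `Timing` type, the `over_budget` function, and the sample durations are all hypothetical, and only the three target values come from the description.

```python
from dataclasses import dataclass

# Targets quoted in the PR description, in seconds ("<10s", "<30s", "<5s").
TARGETS_S = {"logs": 10.0, "audit": 30.0, "compile": 5.0}


@dataclass
class Timing:
    command: str    # "logs", "audit", or "compile"
    seconds: float  # measured wall-clock duration (hypothetical input)


def over_budget(timings):
    """Return the timings that fail the strict '<target' check for their command."""
    return [t for t in timings if t.seconds >= TARGETS_S[t.command]]


# Illustrative measurements only; these are not results from a real run.
sample = [Timing("logs", 4.2), Timing("audit", 31.5), Timing("compile", 5.0)]
slow = over_budget(sample)
print([t.command for t in slow])
```

Because the targets are written as strict upper bounds, a duration exactly equal to the target is treated as a miss in this sketch.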

Phase 5-7: Integration & Usability

Cross-command workflows (compile→run→audit, logs→audit)
Resource usage monitoring
Error message quality and documentation gaps

Phase 8: Reporting

Categorized findings (critical/major/minor/enhancements)
Detailed issue templates with reproduction steps
Uses noop when all tests pass for transparency
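
The reporting rule above (categorized findings, at most five issues, noop when everything passes) might be sketched as follows. The dictionary shapes and the `plan_outputs` name are assumptions made for illustration; they are not the actual safe-output schema used by gh-aw.

```python
# Severity categories from the PR description, most severe first.
SEVERITY_ORDER = ["critical", "major", "minor", "enhancement"]
MAX_ISSUES = 5  # create-issue cap stated in the workflow configuration


def plan_outputs(findings):
    """Turn a list of findings into output requests (hypothetical dict shape)."""
    if not findings:
        # Nothing to report: emit a single noop so the run still leaves a trace.
        return [{"type": "noop", "message": "All tests passed"}]
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
    return [
        {"type": "create-issue", "title": f["title"], "severity": f["severity"]}
        for f in ranked[:MAX_ISSUES]
    ]


# Illustrative findings only.
findings = [
    {"title": "minor %d" % i, "severity": "minor"} for i in range(5)
] + [{"title": "audit crashes on invalid run ID", "severity": "critical"}]
planned = plan_outputs(findings)
print(len(planned), planned[0]["severity"])
```

Sorting by severity before truncating means that, when more than five problems are found, the least severe ones are the ones dropped.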

Implementation Notes

The workflow uses the MCP server exclusively rather than invoking the CLI directly, because authentication is not configured for direct gh aw invocation in the workflow context. The testing approach is exploratory but organized into systematic phases that cover happy paths, edge cases, and cross-command integration.

Original prompt

Create a daily agentic workflow that does a deep exploratory testing of the audit, logs and compile tools in the agentic-workflows tool. Use copilot.
Create issue if a problem is detected. Timeout 1h.

Custom agent used: agentic-workflows
GitHub Agentic Workflows (gh-aw) - Create, debug, and upgrade AI-powered workflows with intelligent prompt routing

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Removed permissions for the 'github/agentic-workflow-workbench' repository in the devcontainer configuration.

Copilot

Pull request overview

This pull request adds a new daily workflow that performs comprehensive exploratory testing of three critical CLI tools (gh aw audit, gh aw logs, and gh aw compile) via the agentic-workflows MCP server. The workflow runs daily with fuzzy scheduling (scattered to 17:44 UTC) and includes systematic testing phases covering happy paths, edge cases, performance benchmarks, and cross-command integration.

Changes:

Adds daily-cli-tools-tester.md workflow with 8 phases of exploratory testing covering environment discovery, logs testing, audit testing, compile testing, integration tests, performance benchmarks, usability assessment, and issue reporting
Generates corresponding lock file (daily-cli-tools-tester.lock.yml) configuring Copilot engine, MCP servers, safe outputs (create-issue max 5, noop), and 60-minute timeout
Reorders repository entries in devcontainer.json alphabetically

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
.github/workflows/daily-cli-tools-tester.md	New workflow markdown defining comprehensive CLI testing strategy with 8 phases, MCP-only tool usage, and systematic edge case coverage
.github/workflows/daily-cli-tools-tester.lock.yml	Generated lock file with Copilot engine configuration, agentic-workflows MCP server setup, safe outputs configuration (create-issue with 7d expiry, noop), and proper job orchestration
.devcontainer/devcontainer.json	Cosmetic reordering of github/agentic-workflow-workbench and github/gh-aw repository entries to alphabetical order

Comments suppressed due to low confidence (1)

.devcontainer/devcontainer.json:38

This change reorders the repository entries in the devcontainer configuration, swapping the positions of "github/agentic-workflow-workbench" and "github/gh-aw". This appears to be an unrelated cosmetic change that makes them alphabetically ordered. Consider whether this should be included in a PR focused on adding a CLI testing workflow, or if it should be split into a separate commit/PR for configuration cleanup.

        "github/gh-aw": {
          "permissions": {
            "actions": "write",
            "contents": "write",
            "discussions": "read",
            "issues": "read",
            "pull-requests": "write",
            "workflows": "write"
          }
        }
      }
    }
  },
  "features": {
    "ghcr.io/anthropics/devcontainer-features/claude-code:1.0": {},
    "ghcr.io/devcontainers/features/copilot-cli:latest": {},
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/github-cli:1": {},
    "ghcr.io/devcontainers/features/node:1": {
      "version": "24"

Copilot · 2026-02-06T16:20:02Z

.github/workflows/daily-cli-tools-tester.md

+  actions: read
+tools:
+  agentic-workflows:
+  bash:


The workflow markdown includes bash code blocks using commands that are not in the allowed bash tools list. The frontmatter specifies bash: which defaults to a limited set of commands (echo, ls, pwd, cat, head, tail, grep, wc, sort, uniq, date, yq), but the workflow instructions include:

find command (lines 161, 164, 389, 392, 525)

for loops (lines 164, 392)

diff command (line 419)

time command (lines 497, 500, 503)

df command (line 522)

du command (line 528)

These commands will fail when the agent attempts to use them. Either:

Update the frontmatter to allow these specific bash commands: bash: ["find", "for", "diff", "time", "df", "du"] (which will merge with defaults), or

Remove or modify these bash code block examples to only use the default allowed commands, or

Use bash: ["*"] to allow all bash commands if that's the intended behavior for this testing workflow

Suggested change

-  bash:
+  bash: ["find", "for", "diff", "time", "df", "du"]
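
The mismatch described in this comment can be caught before a run by scanning the bash examples for commands outside the allowlist. The sketch below is a rough heuristic under stated assumptions: the default list is copied from the review comment above, while `disallowed_commands` and the sample script are hypothetical, and this is not how gh-aw itself enforces the restriction.

```python
import re

# Default allowed commands as listed in the review comment.
DEFAULT_ALLOWED = {
    "echo", "ls", "pwd", "cat", "head", "tail",
    "grep", "wc", "sort", "uniq", "date", "yq",
}


def disallowed_commands(script, allowed=DEFAULT_ALLOWED):
    """Return the first word of each simple command that is not allowed.

    Heuristic only: splits on pipes, ';', '&&' and '||' and does not parse
    quoting, subshells, or loop bodies.
    """
    found = []
    for line in script.splitlines():
        for segment in re.split(r"\|\||&&|[|;]", line):
            words = segment.split()
            if not words or words[0].startswith("#"):
                continue
            if words[0] not in allowed and words[0] not in found:
                found.append(words[0])
    return found


# Hypothetical excerpt in the style of the workflow's bash examples.
script = """
find . -name '*.md' | wc -l
time ls
df -h
du -sh . | sort
"""
print(disallowed_commands(script))
```

A check like this only looks at the first word of each command segment, so it would miss commands nested inside loops or command substitutions.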

Copilot · 2026-02-06T16:20:02Z

.github/workflows/daily-cli-tools-tester.md

@@ -0,0 +1,669 @@
+---
+description: Daily exploratory testing of audit, logs, and compile tools in gh-aw CLI


Consider adding engine: copilot to the frontmatter for clarity and consistency with other similar workflows (e.g., daily-cli-performance.md). While Copilot is the default engine and will be used automatically, explicitly specifying it makes the configuration more maintainable and self-documenting, especially since the PR description explicitly mentions "Engine: Copilot".

Suggested change

-description: Daily exploratory testing of audit, logs, and compile tools in gh-aw CLI
+description: Daily exploratory testing of audit, logs, and compile tools in gh-aw CLI
+engine: copilot

Initial plan

5e6269d

Copilot AI assigned Copilot and pelikhan Feb 6, 2026

Copilot started work on behalf of pelikhan February 6, 2026 15:58

Copilot AI and others added 2 commits February 6, 2026 16:03

Initial plan for daily CLI tools exploratory testing workflow

50e99e0

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Add daily CLI tools exploratory testing workflow

95a17da

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Add exploratory testing for audit logs and compile tools" to "Add daily exploratory testing workflow for CLI tools" Feb 6, 2026

Copilot AI requested a review from pelikhan February 6, 2026 16:07

Copilot finished work on behalf of pelikhan February 6, 2026 16:07

pelikhan approved these changes Feb 6, 2026

pelikhan marked this pull request as ready for review February 6, 2026 16:15

Copilot AI review requested due to automatic review settings February 6, 2026 16:15

Copilot started reviewing on behalf of pelikhan February 6, 2026 16:15

github-actions bot mentioned this pull request Feb 6, 2026

📰 Repository Chronicle - Team Leverages AI Tools in 24-Hour Development Blitz #14172

Open

Remove agentic-workflow-workbench permissions

692bf1f

Removed permissions for the 'github/agentic-workflow-workbench' repository in the devcontainer configuration.

pelikhan merged commit 88dda85 into main Feb 6, 2026
48 checks passed

pelikhan deleted the copilot/exploratory-testing-audit-logs-tools branch February 6, 2026 16:16

Copilot AI reviewed Feb 6, 2026

This was referenced Feb 6, 2026

Smoke Test: Claude - 21758499044 #14188

Closed

🌱 Daily Team Evolution Insights - February 7, 2026 #14261

Open

pelikhan merged 4 commits into main from copilot/exploratory-testing-audit-logs-tools