Fix Copilot CLI --allow-tool argument format for v0.0.347+, add verification steps, and update dev workflows by Copilot · Pull Request #2092 · github/gh-aw

Copilot · 2025-10-22T01:48:55Z

Problem

The detection job in the smoke-copilot workflow started failing after the Copilot CLI was updated from v0.0.344 to v0.0.347. The agent job continued to work, but the threat detection job failed immediately with exit code 1 and no error output.

Evidence:

Run #11 (Oct 18) with v0.0.344: ✅ SUCCEEDED
Run #12 (Oct 22) with v0.0.347: ❌ FAILED at detection job

Root Cause

Copilot CLI v0.0.347 changed how it handles the --allow-tool parameter for shell commands. The old format wrapped bash commands in shell(), but the new version expects just the command name.

Old format (v0.0.344):

copilot --allow-tool 'shell(cat)' --allow-tool 'shell(grep)' --prompt "..."

New format (v0.0.347):

copilot --allow-tool cat --allow-tool grep --prompt "..."

This aligns with the official Copilot CLI documentation, which shows examples like copilot -p "List my open PRs" --allow-tool grep.
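The format change can be illustrated with a small shell sketch. This is not the PR's implementation (that lives in computeCopilotToolArguments() in Go); normalize_allow_tool is a hypothetical helper name, and the sketch only covers the plain shell(<command>) case described above, not commands with arguments or wildcards.

```shell
# Hypothetical helper (not from the PR): convert one --allow-tool value
# from the old shell(<command>) form to the bare command name.
normalize_allow_tool() {
  case "$1" in
    'shell('*')')
      inner=${1#shell\(}     # drop the leading "shell("
      inner=${inner%\)}      # drop the trailing ")"
      printf '%s\n' "$inner"
      ;;
    *)
      printf '%s\n' "$1"     # already in the new form, pass through
      ;;
  esac
}

# Print one "--allow-tool <name>" pair per input value.
for tool in 'shell(cat)' 'shell(grep)' jq; do
  printf '%s %s\n' --allow-tool "$(normalize_allow_tool "$tool")"
done
```

Values that are already bare command names pass through unchanged, so the same helper could be applied to a mixed list.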

Changes

Code Changes

pkg/workflow/copilot_engine.go:
- Updated computeCopilotToolArguments() to generate command names directly without the shell() wrapper
- Added CLI verification step to GetInstallationSteps() for all Copilot workflows
pkg/workflow/bash_defaults_consistency_test.go: Added isCommonBashCommand() helper to properly detect bash commands in the new format
pkg/workflow/threat_detection.go:
- Added verification steps to validate CLI installation and prompt file before execution in threat detection jobs
- Added directory pre-creation for all --add-dir paths to prevent execution failures

Verification Steps Added

For all Copilot workflows:

1. Verify Copilot CLI installation - After npm install, checks that the copilot command is available and runs copilot --version to ensure it's working properly

For threat detection jobs (additional):
2. Verify prompt file - Before execution, checks that the prompt file exists at /tmp/gh-aw/aw-prompts/prompt.txt and has content
3. Pre-create required directories - Creates all directories used by Copilot CLI's --add-dir and --log-dir arguments:

/tmp/gh-aw/agent/
/tmp/gh-aw/.copilot/logs/
/tmp/gh-aw/threat-detection/

These steps help catch configuration issues early and provide better error messages for debugging.

Workflow Configuration Updates

dev.md: Added a safe-outputs create-issue configuration with the "[dev] " title prefix and the automation and dev-test labels, so that issues can be created automatically from workflow results
dev-hawk.md: Changed engine from copilot to claude for better workflow monitoring and analysis capabilities

Test Updates

Updated all test expectations to match the new format:

copilot_engine_test.go - Updated expected tool arguments, command line assertions, and installation step count
copilot_git_commands_integration_test.go - Updated git command expectations
bash_defaults_consistency_test.go - Updated shell tool detection logic

Workflow Changes

Recompiled all 67 workflows to generate .lock.yml files with the correct command format and verification steps.

Verification

All tests passing:

✅ Unit tests (make test-unit)
✅ Integration tests (make test-integration)
✅ Linter (make lint)
✅ All workflows recompiled successfully (make recompile)

Example

Before:

- name: Install GitHub Copilot CLI
  run: npm install -g @github/copilot@0.0.347
- name: Execute GitHub Copilot CLI
  run: |
    copilot --allow-tool 'shell(cat)' --allow-tool 'shell(grep)' --prompt "$INSTRUCTION"

After:

- name: Install GitHub Copilot CLI
  run: npm install -g @github/copilot@0.0.347
- name: Verify Copilot CLI installation
  run: |
    echo "Verifying GitHub Copilot CLI installation..."
    if ! command -v copilot &> /dev/null; then
      echo "Error: copilot command not found"
      exit 1
    fi
    copilot --version
    echo "GitHub Copilot CLI is installed and working"
- name: Execute GitHub Copilot CLI
  run: |
    copilot --allow-tool cat --allow-tool grep --prompt "$INSTRUCTION"

Threat Detection (additional verification and directory setup):

- name: Ensure threat-detection directory and log
  run: |
    mkdir -p /tmp/gh-aw/threat-detection
    mkdir -p /tmp/gh-aw/agent
    mkdir -p /tmp/gh-aw/.copilot/logs
    touch /tmp/gh-aw/threat-detection/detection.log
- name: Verify prompt file
  run: |
    echo "Verifying prompt file..."
    if [ ! -f /tmp/gh-aw/aw-prompts/prompt.txt ]; then
      echo "Error: Prompt file not found at /tmp/gh-aw/aw-prompts/prompt.txt"
      exit 1
    fi
    PROMPT_SIZE=$(wc -c < /tmp/gh-aw/aw-prompts/prompt.txt)
    if [ "$PROMPT_SIZE" -eq 0 ]; then
      echo "Error: Prompt file is empty"
      exit 1
    fi
    echo "Prompt file exists and has content ($PROMPT_SIZE bytes)"

Fixes https://github.com/githubnext/gh-aw/actions/runs/18702772209/job/53334797200

Original prompt

Investigate why the detection job fails (the agent job using copilot cli succeeds)

Run: https://github.com/githubnext/gh-aw/actions/runs/18702772209/job/53334797200#step:11:1


pelikhan · 2025-10-22T02:13:21Z

@copilot

make sure the copilot cli is installed
make sure the prompt text is loaded properly

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2025-10-22T02:25:44Z

@copilot

...

Added verification steps in commit d591dc5:

Verify Copilot CLI installation - After npm install, checks that the copilot command is available and runs copilot --version
Verify prompt file - Before execution, checks that /tmp/gh-aw/aw-prompts/prompt.txt exists and has content

These steps will catch configuration issues early and provide clear error messages if the CLI isn't installed or the prompt file is missing/empty.

pelikhan · 2025-10-22T02:27:42Z

@copilot make dev.md use copilot

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2025-10-22T02:39:03Z

@copilot make dev.md use copilot

dev.md already uses Copilot (engine: copilot on line 8). In commit 5f5830a, I added a verification step to the Copilot engine that now applies to ALL Copilot workflows including dev.md:

- name: Verify Copilot CLI installation
  run: |
    echo "Verifying GitHub Copilot CLI installation..."
    if ! command -v copilot &> /dev/null; then
      echo "Error: copilot command not found"
      exit 1
    fi
    copilot --version
    echo "GitHub Copilot CLI is installed and working"

This verification step runs after npm install to ensure the CLI is properly installed before execution.

github-actions · 2025-10-22T02:54:45Z

✅ Dev Hawk Report - Success

Workflow Run: #2954

Status: ✅ success
Commit: 5f5830a
Duration: ~2 minutes 26 seconds
Triggered by: @pelikhan (workflow_dispatch)

The Dev workflow completed successfully! 🎉

Summary

All checks passed for the Copilot CLI verification changes:

✅ Build completed
✅ Tests passed
✅ Workflow compilation validated
✅ Copilot CLI verification steps functional

The changes to update the --allow-tool argument format from shell(command) to command and add verification steps are working as expected.

AI generated by Dev Hawk

pelikhan · 2025-10-22T02:56:15Z

@copilot

add safe output create issue to dev.md
make sure dev-hawk.md is using Claude

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2025-10-22T03:04:13Z

@copilot

add safe output create issue to dev.md...

Done in commit 4406988:

dev.md: Added safe-outputs create-issue configuration:

safe-outputs:
  create-issue:
    title-prefix: "[dev] "
    labels: [automation, dev-test]

dev-hawk.md: Changed from engine: copilot to engine: claude

The compiled workflows now show:

dev.lock.yml has a create_issue: job with issues: write permissions
dev-hawk.lock.yml installs "Claude Code CLI" instead of "Copilot CLI"

github-actions · 2025-10-22T03:12:21Z

⚠️ Dev Hawk Report - Failure Analysis

Workflow Run: #2955

Status: failure
Commit: 4406988
Duration: ~2.5 minutes

Root Cause Analysis

The detection job failed immediately with exit code 1 when running the Copilot CLI threat detection command. The failure occurred despite all verification steps passing:

✅ Copilot CLI v0.0.347 installed successfully
✅ Copilot CLI version command executed successfully
✅ Prompt file verified (3067 bytes)
❌ Detection command failed with exit code 1

Error Details

The detection job executed this command:

copilot --add-dir /tmp/ --add-dir /tmp/gh-aw/ --add-dir /tmp/gh-aw/agent/ \
  --log-level all --log-dir /tmp/gh-aw/.copilot/logs/ \
  --disable-builtin-mcps --allow-tool cat --allow-tool grep \
  --allow-tool head --allow-tool jq --allow-tool ls --allow-tool tail \
  --allow-tool wc --prompt "$COPILOT_CLI_INSTRUCTION" 2>&1 | \
  tee /tmp/gh-aw/threat-detection/detection.log

The command exited with code 1 immediately with no visible error output in the logs.

Key Observations

Agent job succeeded: The main agent job completed successfully with Copilot CLI v0.0.347 using the new --allow-tool format (without shell() wrapper)
Detection job has identical setup: Same Copilot CLI version, same verification steps, same installation process
Silent failure: The threat detection command produces no visible error output before exiting with code 1
False positive errors in agent logs: The agent validation detected 4 "errors" that are actually benign debug messages:
- SyntaxError: Unexpected token 'i', "failed to ge"... is not valid JSON
- These are from DEBUG-level log lines, not actual errors: Unable to parse tool invocation as JSON. Treating it as a string for filtering

Recommended Actions

Download and inspect the threat detection log artifact (ID 4335617221) to see if it contains error output that wasn't captured in the main logs
Test threat detection locally with the same command parameters to reproduce the failure
Investigate Copilot CLI exit codes: Check if there's a specific meaning for exit code 1 in v0.0.347
Add error output capture: Modify the threat detection step to explicitly capture stderr before the pipe to tee
Consider alternative logging: Use --log-level debug instead of all to reduce log noise
Fix false positive error detection: Update error patterns to exclude DEBUG-level log messages from being flagged as errors
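As a hedged illustration of the "Add error output capture" recommendation: in a pipeline the shell reports the status of the last command (here tee) unless pipefail is active, so the CLI's own exit status and stderr are easy to lose. The sketch below assumes bash and uses a hypothetical stand-in function, failing_cli, in place of the real copilot invocation; the log directory is a temporary one, not the workflow's /tmp/gh-aw paths.

```shell
#!/usr/bin/env bash
# Stand-in for the real CLI call (hypothetical): prints one line to stdout,
# one line to stderr, and fails with exit code 1.
failing_cli() {
  echo "normal output"
  echo "simulated error detail" >&2
  return 1
}

log_dir="$(mktemp -d)"

# Send stderr to its own file and stdout through tee, then read the exit
# status of the first pipeline stage rather than the status of tee.
failing_cli 2> "$log_dir/stderr.log" | tee "$log_dir/detection.log"
status="${PIPESTATUS[0]}"

# Replay the captured stderr so it is also visible in the job log.
cat "$log_dir/stderr.log" >&2
echo "exit status: $status"
```

Keeping stderr in a separate file makes it possible to upload it as its own artifact, at the cost of losing the interleaving with stdout.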

Investigation Notes

The verification steps added in this PR are working correctly - they successfully validated:

Copilot CLI installation and availability
CLI version (0.0.347, Commit: 563c3ea)
Prompt file existence and content

However, the threat detection command itself is failing silently. The threat-detection.log artifact is very small (142 bytes) and may contain the actual error message.

Next Step: The most immediate action is to download and examine the threat-detection.log artifact to see what Copilot CLI actually wrote before exiting.

AI generated by Dev Hawk

pelikhan · 2025-10-22T03:41:31Z

.

pelikhan · 2025-10-22T03:54:52Z

@copilot In the detection job, make sure all --add-dir folders are pre-created

https://github.com/githubnext/gh-aw/actions/runs/18704084037/job/53338646639#step:7:1

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2025-10-22T04:04:09Z

@copilot In the detection job, make sure all --add-dir folders are pre-created

https://github.com/githubnext/gh-aw/actions/runs/18704084037/job/53338646639#step:7:1

Fixed in commit 1613411. The detection job now pre-creates all directories used by Copilot CLI:

- name: Ensure threat-detection directory and log
  run: |
    mkdir -p /tmp/gh-aw/threat-detection
    mkdir -p /tmp/gh-aw/agent
    mkdir -p /tmp/gh-aw/.copilot/logs
    touch /tmp/gh-aw/threat-detection/detection.log

This ensures the directories for --add-dir /tmp/gh-aw/agent/ and --log-dir /tmp/gh-aw/.copilot/logs/ exist before Copilot CLI execution.
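The step above hard-codes the three directories. As an alternative sketch (not what the PR does), the directories could be derived from the CLI arguments themselves, so that the list cannot drift from the command line. precreate_dirs is a hypothetical helper, and a temporary base directory stands in for /tmp so the example is safe to run anywhere.

```shell
# Hypothetical helper: create every directory passed as the value of
# --add-dir or --log-dir in an argument list.
precreate_dirs() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --add-dir|--log-dir)
        if [ "$#" -ge 2 ]; then
          mkdir -p "$2"   # create the directory named by the flag's value
          shift           # skip the flag; the value is skipped below
        fi
        ;;
    esac
    shift
  done
}

base="$(mktemp -d)"   # stand-in for /tmp in this sketch

precreate_dirs \
  --add-dir "$base/gh-aw/agent/" \
  --log-dir "$base/gh-aw/.copilot/logs/" \
  --allow-tool cat
```

Flags other than --add-dir and --log-dir are ignored, so the full argument list of the real command could be passed through unchanged.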

github-actions · 2025-10-22T04:37:05Z

⚠️ Dev Hawk Report - Cancelled Run Analysis

Workflow Run: #2957

Status: cancelled
Conclusion: cancelled
Commit: 1613411
Duration: ~12 minutes (agent job)
Triggered by: @pelikhan via workflow_dispatch

Summary

The Dev workflow run was manually cancelled during execution. The agent job ran for approximately 12 minutes before being cancelled. The workflow completed testing GitHub MCP tools before cancellation.

What Happened

Successful Steps

✅ Pre-activation - Completed successfully
✅ Activation - Completed successfully
✅ Agent Setup - All setup steps completed:

Repository checkout
Git credentials configuration
Node.js setup
Copilot CLI installation (v0.0.347)
CLI verification step passed ✅
MCP server setup
Prompt generation

Agent Execution

🔄 Execute GitHub Copilot CLI - Cancelled during execution

The agent successfully tested 24+ GitHub MCP tools including:
- github::get_me
- github::get_file_contents
- github::list_branches, list_commits, list_issues, list_pull_requests
- github::search_repositories, search_code, search_pull_requests
- github::get_issue, pull_request_read, get_commit
Created test results file at /tmp/gh-aw/agent/test-results.txt
Token usage: 583,950 input + 2,848 output tokens
Premium requests: 1
29 turns completed before cancellation

Jobs Cancelled

❌ create_issue - Cancelled (not started)
❌ detection - Cancelled (not started)
❌ missing_tool - Cancelled (not started)

Minor Issues Detected

During execution, the error validation step detected 4 instances of DEBUG messages related to JSON parsing in the Copilot CLI logs:

[DEBUG] Unable to parse tool invocation as JSON. Treating it as a string for filtering: SyntaxError

These are informational DEBUG messages from the Copilot CLI and do not indicate actual errors. They appear to be internal logging about how the CLI handles non-JSON tool invocations.
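One way to avoid flagging such lines, sketched under assumptions: drop DEBUG-level lines before applying the error patterns. The log excerpt, the [INFO] and [ERROR] lines, and the pattern list are hypothetical; the real error patterns used by the validation step are not shown in this thread, and real log lines may carry extra prefixes such as timestamps.

```shell
# Hypothetical log excerpt; only the [DEBUG] line mirrors the message above.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
[DEBUG] Unable to parse tool invocation as JSON. Treating it as a string for filtering: SyntaxError
[INFO] Tool call finished
[ERROR] SyntaxError: simulated real failure
EOF

# Count lines that look like errors, ignoring DEBUG-level lines first.
count_errors() {
  grep -v '^\[DEBUG\]' "$1" | grep -c -E 'SyntaxError|\[ERROR\]'
}

error_count="$(count_errors "$log_file")"
echo "errors found: $error_count"
```

Without the first grep, the DEBUG line would also match the SyntaxError pattern and be counted as an error.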

Verification ✅

The PR's changes to add the Copilot CLI verification step are working correctly:

✅ "Verify Copilot CLI installation" step completed successfully
✅ CLI version check passed
✅ CLI was functional and executed GitHub MCP tools successfully

Recommended Actions

Since this was a manual cancellation:

Review if the agent was performing the intended test correctly
If testing is complete, consider merging the PR
If more testing is needed, re-run the workflow

This report was generated automatically by Dev Hawk when the Dev workflow completed on a copilot/ branch.

AI generated by Dev Hawk

github-actions · 2025-10-22T04:39:50Z

✅ Dev Hawk Report - Success

Workflow Run: #2958

Status: success ✅
Conclusion: success
Commit: 1613411
Duration: ~3.5 minutes (04:34:35 → 04:38:02 UTC)
Branch: copilot/investigate-detection-job-failure

The Dev workflow completed successfully! 🎉

Summary

All verification steps and tests passed for the commit "Ensure all --add-dir folders are pre-created in detection job". The workflow validated:

Copilot CLI installation and format changes
Directory pre-creation for threat detection
Verification steps for prompt files
All code changes and test updates

Great work on the Copilot CLI --allow-tool argument format fix and the enhanced verification steps!

AI generated by Dev Hawk

Initial plan

5365b5f

Copilot AI assigned Copilot and pelikhan Oct 22, 2025

Copilot started work on behalf of pelikhan October 22, 2025 01:49

Fix Copilot CLI --allow-tool argument format for v0.0.347+

f82fcb9

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Investigate detection job failures in CI pipeline~~ Fix Copilot CLI --allow-tool argument format for v0.0.347+ Oct 22, 2025

Copilot AI requested a review from pelikhan October 22, 2025 02:16

Copilot finished work on behalf of pelikhan October 22, 2025 02:16

Copilot started work on behalf of pelikhan October 22, 2025 02:16

Add verification steps for Copilot CLI installation and prompt file

d591dc5

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~Fix Copilot CLI --allow-tool argument format for v0.0.347+~~ Fix Copilot CLI --allow-tool argument format for v0.0.347+ and add verification steps Oct 22, 2025

Copilot finished work on behalf of pelikhan October 22, 2025 02:27

Copilot started work on behalf of pelikhan October 22, 2025 02:28

Add Copilot CLI verification step to all Copilot workflows

5f5830a

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan October 22, 2025 02:40

Copilot started work on behalf of pelikhan October 22, 2025 02:56

Add safe-outputs create-issue to dev.md and switch dev-hawk.md to Claude

4406988

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~Fix Copilot CLI --allow-tool argument format for v0.0.347+ and add verification steps~~ Fix Copilot CLI --allow-tool argument format for v0.0.347+, add verification steps, and update dev workflows Oct 22, 2025

Copilot finished work on behalf of pelikhan October 22, 2025 03:05

Copilot started work on behalf of pelikhan October 22, 2025 03:55

Ensure all --add-dir folders are pre-created in detection job

1613411

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan October 22, 2025 04:05

pelikhan closed this Oct 22, 2025

pelikhan deleted the copilot/investigate-detection-job-failure branch October 23, 2025 21:23

2 participants