[audit] 🚨 CRITICAL: Audit Workflow Completely Broken - Two Blocking Issues (Recurring) #1354

@github-actions

🔍 Agentic Workflow Audit Report - 2025-10-08 (Run #2)

⚠️ CRITICAL STATUS: AUDIT COMPLETELY BLOCKED

This is the SECOND audit attempt today. Both have failed due to critical configuration issues.

Audit Summary

  • Period: Last 24 hours (attempted)
  • Runs Analyzed: 0
  • Workflows Active: Unknown (cannot collect data)
  • Success Rate: 0% (audit itself is failing)
  • Critical Issues Found: 2 (1 recurring, 1 newly discovered)
  • Audit Status: BLOCKED - Cannot perform any audit functions

🚨 Critical Findings

Issue #1: Workflow Permissions Stripped During Compilation (RECURRING)

Status: ⛔ STILL BROKEN - This was reported in audit run #3 (18329959236) earlier today and remains unfixed.

Problem: The audit workflow is compiled with empty permissions despite specifying required permissions in the source .md file.

Evidence:

  • Source file (.github/workflows/audit-workflows.md lines 6-8):

    permissions:
      contents: read
      actions: read
  • Compiled file (.github/workflows/audit-workflows.lock.yml line 16):

    permissions: {}

Impact:

  • Workflow has no access to GitHub Actions API
  • Cannot download workflow run logs
  • Cannot analyze workflow runs
  • GITHUB_TOKEN is issued with no permissions, so it is effectively unusable by the workflow
  • The audit agent is completely non-functional

Root Cause (suspected, not yet confirmed in code): The gh-aw compile command is stripping permissions from the source .md file during compilation.

Affected Files:

  • .github/workflows/audit-workflows.md
  • .github/workflows/audit-workflows.lock.yml

Recurrence History:


Issue #2: MCP Server Misconfiguration - gh-aw Not Installed as Extension (NEW)

Status: 🆕 NEWLY DISCOVERED

Problem: The gh-aw MCP server is running but cannot execute any commands because gh-aw is not installed as a GitHub CLI extension.

Evidence:

$ ps aux | grep mcp
runner  9993  ./gh-aw mcp-server    # ✅ Server is running

$ mcp__gh-aw__logs --start-date -1d
Error: exit status 1
Output: unknown command "aw" for "gh"

Did you mean this?
    api
    co
    pr

Impact:

  • All MCP tools from gh-aw server are non-functional:
    • mcp__gh-aw__logs
    • mcp__gh-aw__compile
    • mcp__gh-aw__audit
    • mcp__gh-aw__status
  • Audit workflow cannot collect any data
  • Cannot perform any log analysis

Root Cause:

The setup configuration in .github/workflows/shared/gh-aw-mcp.md includes:

steps:
  - name: Build gh-aw CLI
    run: make build
  - name: Install binary as 'gh-aw'
    run: make install

The make install target (Makefile lines 166-168) runs:

install: build
	gh extension remove gh-aw || true
	gh extension install .

This requires GitHub CLI authentication, but:

  1. The setup steps run BEFORE the agentic execution job
  2. The setup steps don't have access to GITHUB_TOKEN
  3. Even though the MCP server configuration specifies GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}", this doesn't help the setup phase

Additional Context:

The MCP server code (pkg/cli/mcp_server.go) calls commands like:

cmd := exec.CommandContext(ctx, "gh", "aw", "logs", ...)

This requires gh aw to work as a CLI extension, which requires successful gh extension install, which requires authentication.
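The dependency chain above (token, then extension install, then gh aw) could be checked before make install runs. The following is a minimal sketch, not part of gh-aw: the function names has_gh_token and check_install_prereqs are hypothetical. It relies only on the fact that the GitHub CLI reads its token from the GH_TOKEN or GITHUB_TOKEN environment variables.

```shell
# Hypothetical preflight helpers for the setup steps; not part of gh-aw.

# Succeeds if a token is visible to the GitHub CLI through the environment.
# gh checks GH_TOKEN first, then GITHUB_TOKEN.
has_gh_token() {
  [ -n "${GH_TOKEN:-}" ] || [ -n "${GITHUB_TOKEN:-}" ]
}

# Fails early with a readable message instead of letting
# `gh extension install .` fail later in `make install`.
check_install_prereqs() {
  if ! has_gh_token; then
    echo "error: neither GH_TOKEN nor GITHUB_TOKEN is set; 'gh extension install' will likely fail" >&2
    return 1
  fi
  echo "ok: token present"
}
```

A setup step could call check_install_prereqs before make install, and run gh aw --help afterwards to confirm that the extension actually resolves.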

Affected Files:

  • .github/workflows/shared/gh-aw-mcp.md
  • pkg/cli/mcp_server.go
  • Makefile

Error Analysis

Compilation Errors

| Error Pattern | Occurrences | Severity | Status |
| --- | --- | --- | --- |
| permissions_stripped_during_compilation | 2 | CRITICAL | RECURRING |

Authentication Errors

| Error Pattern | Occurrences | Severity | Root Cause |
| --- | --- | --- | --- |
| gh_cli_no_token | 2 | HIGH | Empty workflow permissions |

MCP Server Errors

| Error Pattern | Occurrences | Severity | Root Cause |
| --- | --- | --- | --- |
| gh_extension_not_installed | 1 | CRITICAL | Setup requires auth before token available |

MCP Server Status

| Server | Running | PID | Functional | Reason |
| --- | --- | --- | --- | --- |
| gh-aw | ✅ Yes | 9993 | ❌ No | Extension not installed |
| github | ✅ Yes | 10612 | ❓ Unknown | - |
| safe-outputs | ✅ Yes | 10015 | ✅ Yes | Working correctly |

🔧 Recommendations

1. Fix Workflow Permission Compilation (CRITICAL - IMMEDIATE)

Priority: 🔴 CRITICAL
Urgency: IMMEDIATE - Blocking all audit functionality for 13+ hours

Action: Fix the gh-aw compile command to preserve permissions from source .md files.

Investigation Needed:

  • Why is permissions: being set to {} during compilation?
  • Is this affecting other workflows beyond audit-workflows?
  • Is there a bug in the YAML frontmatter parsing/compilation logic?

Testing Required:

  1. Verify that .md file has permissions defined
  2. Run gh aw compile
  3. Verify .lock.yml preserves those permissions
  4. Add automated test to prevent regression

Files to Investigate:

  • Compilation logic in the gh-aw codebase
  • YAML frontmatter parser
  • Template generation code

2. Fix MCP Server Setup for gh-aw Extension (CRITICAL)

Priority: 🔴 CRITICAL
Urgency: HIGH - Required for audit functionality

Action: Fix the MCP server setup so gh-aw can be used as a GitHub CLI extension without authentication issues.

Proposed Solutions (choose one or combine):

Option A: Add GITHUB_TOKEN to Setup Steps (Recommended)

Modify .github/workflows/shared/gh-aw-mcp.md:

steps:
  - name: Set up Go
    uses: actions/setup-go@v5
    with:
      go-version-file: go.mod
      cache: true
  - name: Install dependencies
    run: make deps-dev
  - name: Build gh-aw CLI
    run: make build
  - name: Install binary as 'gh-aw'
    env:
      GH_TOKEN: ${{ github.token }}
    run: make install

Option B: Modify MCP Server to Call Binary Directly

Modify pkg/cli/mcp_server.go to call ./gh-aw instead of gh aw:

// Change from:
cmd := exec.CommandContext(ctx, "gh", cmdArgs...)

// To:
ghAwPath := os.Getenv("GH_AW_BINARY_PATH")
if ghAwPath == "" {
	ghAwPath = "./gh-aw" // fall back to the locally built binary
}
cmdArgs = cmdArgs[1:] // drop the leading "aw"; the binary is invoked directly
cmd := exec.CommandContext(ctx, ghAwPath, cmdArgs...)

Option C: Skip gh Extension Install

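Create a symlink or modify PATH without requiring gh extension install. Because gh does not discover extensions through PATH, this only helps if the MCP server invokes the gh-aw binary directly (as in Option B):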

  - name: Install binary to PATH
    run: |
      mkdir -p ~/.local/bin
      cp ./gh-aw ~/.local/bin/
      echo "$HOME/.local/bin" >> "$GITHUB_PATH"

Recommendation: Use Option A as it maintains compatibility with existing gh extension patterns while fixing the authentication issue.


3. Test All Other Workflows for Permission Issues (HIGH)

Priority: 🟠 HIGH
Action: Audit all compiled .lock.yml files for empty permissions

Command:

for file in .github/workflows/*.lock.yml; do
  echo "=== $file ==="
  grep -A 2 "^permissions:" "$file"
done

Expected: Identify if this is a systemic issue affecting multiple workflows or isolated to audit-workflows.
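The loop above prints every permissions block for manual inspection. A narrower variant that reports only the affected files might look like the sketch below. The function name find_empty_permissions is hypothetical, and the check assumes the compiler emits the empty block exactly as a top-level permissions: {} line, as seen in audit-workflows.lock.yml.

```shell
# Hypothetical helper; prints each compiled workflow whose workflow-level
# permissions block is empty. $1 is the directory to scan.
find_empty_permissions() {
  scan_dir="${1:-.github/workflows}"
  for file in "$scan_dir"/*.lock.yml; do
    [ -e "$file" ] || continue   # the glob matched nothing
    # Anchoring at column 0 matches only the workflow-level key,
    # not indented job-level permissions.
    if grep -Eq '^permissions:[[:space:]]*\{[[:space:]]*\}[[:space:]]*(#.*)?$' "$file"; then
      echo "$file"
    fi
  done
}
```

An empty result from find_empty_permissions .github/workflows would suggest the problem is isolated; more than one path would point to a systemic compiler issue.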


4. Add Compilation Validation (MEDIUM)

Priority: 🟡 MEDIUM
Action: Add automated validation to ensure compiled workflows preserve permissions

Implementation Options:

  1. Pre-commit hook: Validate before committing .lock.yml files
  2. CI check: Add workflow that validates compiled files
  3. Built-in validation: Add --validate-permissions flag to gh aw compile

Example CI Check:

- name: Validate compiled workflows preserve permissions
  run: |
    ./scripts/validate-compiled-permissions.sh
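
The script referenced above does not exist yet. One possible shape for it is sketched below, under stated assumptions: the source .md declares permissions as a top-level block in its frontmatter, the compiled file sits next to it with a .lock.yml suffix, and a line-based comparison is good enough (a real implementation would more likely use a YAML parser). The function names top_level_permissions and permissions_preserved are hypothetical.

```shell
# Hypothetical sketch of scripts/validate-compiled-permissions.sh.

# Prints the entries of the first top-level "permissions:" block in file $1,
# one "key: value" per line, sorted. Prints nothing for "permissions: {}".
top_level_permissions() {
  awk '
    /^permissions:/ && !seen { inblock = 1; seen = 1; next }
    inblock && /^[ \t]+[^ \t#]/ { gsub(/^[ \t]+|[ \t]+$/, ""); print; next }
    inblock { inblock = 0 }   # first non-indented, blank, or comment line ends the block
  ' "$1" | sort
}

# Succeeds if every permission declared in source $1 also appears in
# compiled file $2; reports each missing entry on stderr otherwise.
permissions_preserved() {
  src_perms=$(top_level_permissions "$1")
  lock_perms=$(top_level_permissions "$2")
  status=0
  while IFS= read -r entry; do
    [ -n "$entry" ] || continue
    if ! printf '%s\n' "$lock_perms" | grep -Fxq -- "$entry"; then
      echo "missing in $2: $entry" >&2
      status=1
    fi
  done <<EOF
$src_perms
EOF
  return "$status"
}

# Example driver, left commented out so the helpers can be sourced safely:
# for md in .github/workflows/*.md; do
#   permissions_preserved "$md" "${md%.md}.lock.yml" || exit 1
# done
```

Run against the current audit-workflows pair, this check would be expected to fail, since the source declares contents: read and actions: read while the compiled file has permissions: {}.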

Historical Context

Comparison to Previous Audit (Run #3 - 18329959236)

| Metric | Run #3 | Run #4 | Change |
| --- | --- | --- | --- |
| Critical Issues | 1 | 2 | +1 new issue |
| Permissions Issue | Discovered | Still broken | ⛔ UNRESOLVED |
| MCP Server Issue | Not checked | Discovered | 🆕 NEW |
| Audit Successful | No | No | ❌ Still blocked |

Audit History

This repository has had 2 audit attempts, both of which failed:

  1. Run #3 (18329959236) - 2025-10-08 00:31 UTC

    • Status: BLOCKED
    • Issue: Permissions stripped during compilation
    • Result: Created issue report
  2. Run #4 (18347215009) - 2025-10-08 14:02 UTC (this run)

    • Status: BLOCKED
    • Issues: Same permission issue + MCP server misconfiguration
    • Result: This report

Time Between Attempts: ~13.5 hours
Issue Resolution Status: ⛔ UNRESOLVED


Next Steps

  • URGENT: Fix permissions compilation bug in gh-aw
  • URGENT: Fix MCP server setup to install gh-aw extension properly
  • Test fix by manually compiling audit-workflows.md and checking permissions
  • Re-run audit workflow to verify both issues are resolved
  • Add regression tests for permission preservation
  • Document the root cause once identified
  • Check if other workflows are affected

Impact Assessment

Severity: 🔴 CRITICAL

Current Impact:

  • Audit workflow is 100% non-functional - Cannot perform its primary function
  • No visibility into agentic workflow health - Cannot detect issues in other workflows
  • Manual investigation required - Team has no automated monitoring
  • Recurring issue - Problem persists across multiple runs

Business Impact:

  • No automated oversight of agentic workflows
  • Potential issues in other workflows may go undetected
  • Manual effort required for workflow monitoring
  • Loss of confidence in agentic workflow infrastructure

Timeline:

  • Issue first detected: 2025-10-08 00:31 UTC
  • Time in broken state: 13+ hours
  • Expected resolution: Requires immediate developer attention

Metadata

  • Audit Date: 2025-10-08
  • Run ID: 18347215009
  • Run Number: 4 (Audit #2)
  • Triggered By: pelikhan
  • Repository: githubnext/gh-aw
  • Audit Agent: Claude (Sonnet 4.5)
  • Cache Memory Updated: ✅ Yes
    • /tmp/cache-memory/audits/2025-10-08-run2.json
    • /tmp/cache-memory/audits/index.json
    • /tmp/cache-memory/patterns/errors.json

Note: This issue report was automatically generated by the Agentic Workflow Audit Agent. The audit agent itself is currently non-functional due to the issues described above, but was able to analyze the configuration and identify these critical problems.

AI generated by Agentic Workflow Audit Agent
