[smoke-detector] 🔍 Smoke Test Investigation - Smoke Codex Run #49: Agent Output Artifact Missing in Staged Mode

# 🔍 Smoke Test Investigation - Run #49

## Summary
The Smoke Codex workflow failed because the `create_issue` job could not find the expected `agent_output.json` artifact. The agent job completed successfully in staged mode, but its artifact upload step found no files at the expected path `/tmp/gh-aw/safe-outputs/outputs.jsonl`, logged a warning, and uploaded nothing.

## Failure Details
- **Run**: [18840299097]((redacted))
- **Run Number**: 49
- **Commit**: b88a6471f071998d793a208693f690cea9a9ebb3
- **Branch**: main
- **Trigger**: schedule
- **Duration**: 2.9 minutes
- **Failed Jobs**: create_issue (4s)
- **Workflow**: Smoke Codex

## Root Cause Analysis

### Primary Issue
The workflow ran in **staged mode** (`GH_AW_SAFE_OUTPUTS_STAGED=true`), where the agent is expected to:
1. ✅ Complete its task successfully
2. ❓ Use safe-outputs tools or generate output (not confirmed from the logs reviewed so far)
3. ❌ Create `/tmp/gh-aw/safe-outputs/outputs.jsonl` (the upload step found no file at this path)
4. ❌ Have the artifact uploaded for downstream jobs

Because the artifact upload step found no files, no artifact was uploaded, and the `create_issue` job failed when it tried to download the non-existent artifact.

### Error Chain

**1. Artifact Upload Warning** (agent job)
```
##[warning]No files were found with the provided path: /tmp/gh-aw/safe-outputs/outputs.jsonl
No artifacts will be uploaded.
```

**2. Artifact Download Failure** (create_issue job)
```
##[error]Unable to download artifact(s): Artifact not found for name: agent_output.json
Please ensure that your artifact is not expired and the artifact was uploaded using a compatible version of toolkit/upload-artifact.
```

**3. File Not Found** (create_issue job)
```
##[error]Error reading agent output file: ENOENT: no such file or directory, open '/tmp/gh-aw/safe-outputs/agent_output.json'
```

### Why Did This Happen?

This failure indicates one of the following scenarios:

1. **Agent didn't use safe-outputs tools**: The Codex agent may have completed successfully but didn't call the `safe_outputs_create_issue` tool, similar to issue #2307 (GenAIScript)

2. **Safe-outputs file not created**: Even if the agent intended to use safe-outputs, the file `/tmp/gh-aw/safe-outputs/outputs.jsonl` was never written

3. **Staged mode behavior**: In staged mode, the agent may behave differently and not create the expected output files

4. **Workflow configuration issue**: The artifact upload step expects `outputs.jsonl`, but the create_issue job looks for `agent_output.json` - there may be a transformation step that failed
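
One way to narrow these scenarios down is to look at the state of the safe-outputs file at the end of the agent job. The helper below is a minimal sketch, not part of the workflow: `classify_safe_outputs` is a hypothetical name, and it assumes the file holds one JSON record per line. A result of `missing` or `empty` is consistent with scenarios 1 to 3, while a non-zero entry count combined with a failed download would point toward scenario 4.

```bash
# Classify the state of a safe-outputs file.
# Prints one of: "missing", "empty", or "entries:<count>".
# Hypothetical helper for triage; assumes one JSON record per line (JSONL).
classify_safe_outputs() {
  local file="$1" count
  if [ ! -e "$file" ]; then
    echo "missing"            # the file was never written
  elif [ ! -s "$file" ]; then
    echo "empty"              # the file exists but nothing was recorded
  else
    # Count lines that contain at least one non-whitespace character.
    count=$(grep -c '[^[:space:]]' "$file" || true)
    echo "entries:$count"
  fi
}
```

In the agent job this could be called as `classify_safe_outputs /tmp/gh-aw/safe-outputs/outputs.jsonl` just before the upload step.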

## Failed Jobs and Errors

### Job Sequence
1. ✅ **activation** - succeeded (4s)
2. ✅ **agent** - succeeded (1.3m)
3. ✅ **detection** - succeeded (24s)
4. ❌ **create_issue** - failed (4s)
5. ⏭️ **missing_tool** - skipped

### Key Observations
- Agent job **succeeded** (1.3m runtime)
- Detection job **succeeded** (24s) - this job also depends on artifacts but succeeded
- Only create_issue job failed
- Workflow ran in **staged mode** where issues are not actually created but previewed

## Investigation Findings

### Agent Execution
- **Status**: Success
- **Duration**: 1.3 minutes
- **Turns**: 1
- **Error Count**: 1
- **Warning Count**: 0

### Staged Mode Configuration
```bash
GH_AW_SAFE_OUTPUTS_STAGED=true
GH_AW_WORKFLOW_NAME=Smoke Codex
GH_AW_AGENT_OUTPUT=/tmp/gh-aw/safe-outputs/agent_output.json
```

In staged mode, the workflow should:
- Run the agent normally
- Generate output to preview
- Not actually create GitHub issues
- Still upload artifacts for the create_issue job to preview
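
The expected staged-mode behavior can be sketched as a small decision function. This is an illustration under stated assumptions, not the workflow's actual logic: `plan_create_issue` is a hypothetical name, and the `skip` branch reflects the graceful-handling recommendation in this report rather than current behavior. Only `GH_AW_SAFE_OUTPUTS_STAGED` comes from the run's environment.

```bash
# Decide what the create_issue job should do with the agent output file.
# Prints "skip" when there is no usable output, "preview" in staged mode,
# and "create" otherwise. Hypothetical helper, for illustration only.
plan_create_issue() {
  local output_file="$1"
  if [ ! -s "$output_file" ]; then
    echo "skip"               # nothing to preview or create
  elif [ "${GH_AW_SAFE_OUTPUTS_STAGED:-false}" = "true" ]; then
    echo "preview"            # staged mode: show the issue, do not create it
  else
    echo "create"
  fi
}
```

With the environment from this run, `plan_create_issue "$GH_AW_AGENT_OUTPUT"` would print `skip`, because the file was never downloaded.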

### Artifact Details
The workflow attempts to upload artifacts from the agent job:
- **Expected source path**: `/tmp/gh-aw/safe-outputs/outputs.jsonl`
- **Artifact name**: `agent_output.json`
- **Result**: No files found, no artifact uploaded

## Recommended Actions

### High Priority

- [ ] **Investigate agent logs for this specific run**
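  ```bash
  # Fetch the logs for the failed run. Assumes the run belongs to
  # githubnext/gh-aw; agent-stdio.log may instead be attached as an artifact.
  gh run view 18840299097 --repo githubnext/gh-aw --log > run.log
  # Look for:
  # - Was the safe-outputs MCP server loaded?
  # - Did the agent attempt to use safe_outputs_create_issue?
  # - Were there any errors creating outputs.jsonl?
  grep -n -i -E 'safe[-_]outputs|outputs\.jsonl' run.log
  ```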

- [ ] **Check safe-outputs MCP configuration for Codex**
  - Verify the MCP server is properly configured in the Codex workflow
  - Confirm the safe-outputs tools are available to the Codex agent
  - Check if there are Codex-specific issues with MCP integration

- [ ] **Add validation step before artifact upload**
  ```yaml
  - name: Validate Safe Outputs Created
    run: |
      if [ -f "/tmp/gh-aw/safe-outputs/outputs.jsonl" ]; then
        echo "✓ outputs.jsonl created"
      else
        echo "⚠️ outputs.jsonl not found - agent may not have used safe-outputs tools"
        echo "This is expected if the agent task didn't require creating issues"
        # Don't fail, just warn
      fi
  ```

- [ ] **Make artifact upload conditional**
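  ```yaml
  - name: Check Safe Outputs
    id: safe_outputs
    # hashFiles() only matches files inside GITHUB_WORKSPACE, so test the /tmp path in shell
    run: |
      if [ -f /tmp/gh-aw/safe-outputs/outputs.jsonl ]; then echo "exists=true" >> "$GITHUB_OUTPUT"; fi
  - name: Upload Safe Outputs
    if: steps.safe_outputs.outputs.exists == 'true'
    uses: actions/upload-artifact@v4
    with:
      name: agent_output.json
      path: /tmp/gh-aw/safe-outputs/outputs.jsonl
  ```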

### Medium Priority

- [ ] **Enhance agent prompt for staged mode**
  ```
  You are running in STAGED MODE. You MUST still use the safe_outputs_create_issue 
  tool to create your output, even though the issue won't actually be created. 
  This is required for the workflow to complete successfully.
  ```

- [ ] **Make downstream jobs handle missing artifacts gracefully**
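  ```yaml
  create_issue:
    needs: [agent, detection]
    runs-on: ubuntu-latest
    continue-on-error: true  # Don't fail workflow if artifact missing
    steps:
      - name: Download artifact
        uses: actions/download-artifact@v4
        with:
          name: agent_output.json
          path: /tmp/gh-aw/safe-outputs/  # Same directory that GH_AW_AGENT_OUTPUT points into
        continue-on-error: true
      
      - name: Check artifact exists
        id: check
        run: |
          # "exit 0" would only end this step, so record the result as a step output instead
          if [ -f "$GH_AW_AGENT_OUTPUT" ]; then
            echo "found=true" >> "$GITHUB_OUTPUT"
          else
            echo "No agent output artifact found - agent may not have required creating issues"
          fi
      # Gate later steps with: if: steps.check.outputs.found == 'true'
  ```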

- [ ] **Add detection job check**
  - The detection job succeeded even though no safe-outputs file existed; determine why it did not flag the missing output
  - The detection logic may need to be extended to identify missing output files

### Low Priority

- [ ] **Compare with successful Codex runs**
  - Look at recent successful Smoke Codex workflow runs
  - Identify what's different about this failure
  - Check if this is a new pattern or recurring issue

- [ ] **Document expected behavior in staged mode**
  - Clarify whether agents MUST create safe-outputs in staged mode
  - Define clear success criteria for staged mode workflows
  - Add troubleshooting guide for artifact-related failures

## Prevention Strategies

1. **Explicit Output Requirements**
   - Make it clear in prompts when safe-outputs tools are required
   - Add validation that verifies expected outputs were created
   - Fail fast if required outputs are missing

2. **Graceful Degradation**
   - Don't fail the entire workflow if artifacts are missing
   - Distinguish between "agent failed" vs "agent succeeded but no output needed"
   - Add conditional logic for optional outputs

3. **Better Error Messages**
   - When artifact download fails, explain possible causes
   - Provide troubleshooting steps
   - Link to documentation about safe-outputs and staged mode

4. **Monitoring and Alerts**
   - Track artifact upload success rates
   - Alert on patterns of missing artifacts
   - Monitor staged mode workflow health
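
Strategies 1 to 3 can be combined in a single check at the start of the downstream job. The following is a minimal sketch under stated assumptions: `check_agent_output` and its `required` argument are hypothetical and do not exist in the workflow today. It fails fast when output is required, degrades gracefully when it is optional, and lists the possible causes identified in this investigation.

```bash
# check_agent_output FILE [REQUIRED]
# REQUIRED is "true" when the workflow must produce safe outputs.
# Returns 0 when the job can continue, 1 when it should fail fast.
# Hypothetical helper; the messages are illustrative.
check_agent_output() {
  local file="$1" required="${2:-false}"
  if [ -s "$file" ]; then
    echo "ok: found agent output at $file"
    return 0
  fi
  if [ "$required" = "true" ]; then
    echo "error: required agent output missing at $file" >&2
    echo "  possible causes: the agent did not call a safe-outputs tool," >&2
    echo "  outputs.jsonl was never written, or the upload path is wrong" >&2
    return 1
  fi
  echo "notice: no agent output at $file; treating as 'no output needed'"
  return 0
}
```

A job step could run `check_agent_output "$GH_AW_AGENT_OUTPUT" true` for workflows that must create an issue, and omit the second argument where output is optional.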

## Historical Context

### Similar Past Failures

| Issue | Engine | Pattern | Status |
|-------|--------|---------|--------|
| #2307 | GenAIScript | Agent doesn't use safe-outputs | Closed |
| #2280 | Copilot | MCP config malformed | Closed |
| #2143 | OpenCode | Agent doesn't use safe-outputs | Closed |

This appears to be a **recurring pattern** across different engines where agents complete successfully but don't use safe-outputs tools, particularly in staged mode.

### Pattern Classification
- **Pattern ID**: `CODEX_AGENT_NO_ARTIFACT_STAGED_MODE`
- **Category**: Workflow Configuration - Staged Mode Issue
- **Severity**: Medium
- **Is Flaky**: Unknown (need more data)
- **First Occurrence**: This investigation (2025-10-27)

## Technical Details

### Workflow Configuration
```yaml
tools:
  safe-outputs:
    staged: true
    expected-outputs:
      create_issue:
        min: 1
        max: 1
```
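
The `min`/`max` bounds above could be enforced with a count over the safe-outputs file. This is only a sketch: `count_outputs_in_range` is a hypothetical helper, and it assumes each line of `outputs.jsonl` is a JSON object carrying a `"type"` field such as `"create_issue"`, which this report does not confirm.

```bash
# count_outputs_in_range FILE KIND MIN MAX
# Counts lines whose "type" field equals KIND and succeeds only when the
# count lies within [MIN, MAX]. A missing file counts as zero entries.
# Assumes one JSON object per line with a "type" field (unverified format).
count_outputs_in_range() {
  local file="$1" kind="$2" min="$3" max="$4" count=0
  if [ -f "$file" ]; then
    count=$(grep -c -E "\"type\"[[:space:]]*:[[:space:]]*\"$kind\"" "$file" || true)
  fi
  echo "$kind: $count (expected $min..$max)"
  [ "$count" -ge "$min" ] && [ "$count" -le "$max" ]
}
```

For the configuration shown here the call would be `count_outputs_in_range /tmp/gh-aw/safe-outputs/outputs.jsonl create_issue 1 1`, which fails for this run because the file does not exist.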

### Artifact Upload Configuration
```yaml
- name: Upload Safe Outputs
  uses: actions/upload-artifact@v4
  with:
    name: agent_output.json
    path: /tmp/gh-aw/safe-outputs/outputs.jsonl
```

### Artifact Download Configuration
```yaml
- name: Download artifact
  uses: actions/download-artifact@v4
  with:
    name: agent_output.json
    path: /tmp/gh-aw/safe-outputs/
```

### Environment Variables
```bash
GH_AW_SAFE_OUTPUTS_STAGED=true
GH_AW_WORKFLOW_NAME=Smoke Codex
GH_AW_AGENT_OUTPUT=/tmp/gh-aw/safe-outputs/agent_output.json
```
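
Read together, the upload path, download directory, and `GH_AW_AGENT_OUTPUT` suggest a possible filename mismatch. The sketch below assumes that a single-file artifact keeps the file's original basename when downloaded, and that no intermediate step renames or converts the file; both are assumptions to verify against the compiled workflow. `downloaded_path` and `paths_agree` are hypothetical helpers.

```bash
# downloaded_path SOURCE_PATH DOWNLOAD_DIR
# Prints where a single uploaded file would land after download, assuming
# the file keeps its basename inside the artifact.
downloaded_path() {
  local source_path="$1" download_dir="$2"
  echo "${download_dir%/}/$(basename "$source_path")"
}

# paths_agree SOURCE_PATH DOWNLOAD_DIR EXPECTED_PATH
# Succeeds when the downloaded file would sit exactly at EXPECTED_PATH.
paths_agree() {
  [ "$(downloaded_path "$1" "$2")" = "$3" ]
}
```

With the values in this report, `paths_agree /tmp/gh-aw/safe-outputs/outputs.jsonl /tmp/gh-aw/safe-outputs/ /tmp/gh-aw/safe-outputs/agent_output.json` returns a non-zero status: the download would produce `outputs.jsonl`, while the job reads `agent_output.json`. If that holds in the real workflow, `create_issue` could hit the same `ENOENT` error even when the upload succeeds.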

## Related Issues
- #2307 - GenAIScript agent doesn't use safe-outputs (closed)
- #2280 - Copilot safe-outputs MCP crashes (closed)
- #2143 - OpenCode agent doesn't use safe-outputs (closed)

## Next Steps

1. **Immediate**: Download and analyze agent-stdio.log from run 18840299097
2. **Short-term**: Add validation and conditional logic to prevent this failure mode
3. **Long-term**: Enhance agent prompts and MCP tool reliability across all engines

---

**Investigation Metadata:**
- **Investigator**: Smoke Detector (automated investigator)
- **Investigation Run**: [18840378169]((redacted))
- **Pattern Database**: `/tmp/gh-aw/cache-memory/patterns/codex_no_artifact_staged.json`
- **Investigation Record**: `/tmp/gh-aw/cache-memory/investigations/2025-10-27-18840299097.json`




> AI generated by [Smoke Detector - Smoke Test Failure Investigator](https://github.com/githubnext/gh-aw/actions/runs/18840378169)

[smoke-detector] 🔍 Smoke Test Investigation - Smoke Codex Run #49: Agent Output Artifact Missing in Staged Mode #2604