[smoke-detector] 🚨 CRITICAL RECURRING: GenAIScript Invalid Model (gpt-4.1) - 3rd Occurrence #2204

@github-actions

🚨 CRITICAL RECURRING FAILURE - 3rd Occurrence

Summary

The Smoke GenAIScript workflow has failed again with the same root cause that was previously reported in #2157. This is the 3rd occurrence of this critical issue in less than 24 hours (roughly 10 hours separate the first and third failing runs). Issue #2157 was closed as "not_planned", but the underlying configuration problem was never fixed, so the failures have continued.

Failure Details

Recurrence Timeline

| Occurrence | Run ID | Date | Status |
|---|---|---|---|
| 1st | 18727962258 | 2025-10-22 19:45 UTC | Issue #2157 created |
| 2nd | 18733557489 | 2025-10-23 00:19 UTC | #2157 closed as "not_planned" |
| 3rd | 18739169072 (current) | 2025-10-23 06:07 UTC | This investigation |

Root Cause (UNCHANGED)

The problem remains exactly the same as documented in #2157:

The GenAIScript configuration uses an invalid OpenAI model name:

# .github/workflows/shared/genaiscript.md:6
GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"

gpt-4.1 does not exist in OpenAI's model catalog. Valid models include:

  • gpt-4o (recommended)
  • gpt-4-turbo
  • gpt-4
  • gpt-3.5-turbo

Error Chain

  1. GenAIScript attempts to use model openai:gpt-4.1
  2. OpenAI API rejects the invalid model name
  3. GenAIScript receives undefined/null response
  4. GenAIScript crashes: TypeError: Cannot read properties of undefined (reading 'text')
  5. Detection job fails with exit code 255
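
Steps 3 and 4 of the chain can be illustrated with a minimal sketch. The `RunResult` type and both functions below are hypothetical and are not GenAIScript's actual code; they only show how reading `text` from a missing result produces the TypeError in the stack trace, and how a guard could turn it into a descriptive error instead.

```typescript
// Hypothetical shape of a model run result; NOT GenAIScript's real type.
interface RunResult {
  text: string;
}

// Unguarded access: throws a TypeError when `result` is undefined,
// which mirrors step 4 of the error chain.
function readTextUnguarded(result: RunResult | undefined): string {
  return (result as RunResult).text;
}

// Guarded access: reports the missing response explicitly and names the
// model that was requested, so the log points at the configuration.
function readTextGuarded(result: RunResult | undefined, model: string): string {
  if (result === undefined) {
    throw new Error(`No response received for model "${model}"`);
  }
  return result.text;
}
```

With the guard in place, the job would still fail, but the log would name the requested model rather than surfacing an unrelated-looking TypeError.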

Stack Trace

2025-10-23T06:09:52.672Z genaiscript:error {
  name: 'TypeError',
  message: "Cannot read properties of undefined (reading 'text')",
  stack: "TypeError: Cannot read properties of undefined (reading 'text')\n" +
    '    at githubActionSetOutputs ((redacted))\n' +
    '    at async Command.runScriptWithExitCode ((redacted))'
}

Failed Jobs and Errors

Job Execution Summary

  1. activation - succeeded (2s)
  2. agent - succeeded (1.4m) - Agent completed successfully
  3. detection - FAILED (55s) - Threat detection crashed
  4. create_issue - succeeded (7s)
  5. ⏭️ missing_tool - skipped

Impact Assessment

Severity: 🔴 CRITICAL

  • All scheduled smoke tests for GenAIScript are failing
  • Threat detection is NOT running (security implications)
  • False confidence in system health due to skipped validations
  • 3 consecutive failures in ~10 hours

Urgency: 🔴 IMMEDIATE

  • Simple one-line configuration fix
  • Blocking critical security and quality checks
  • Issue will continue to recur on every scheduled run (multiple times per day)

Scope:

  • Affects: All workflows using shared/genaiscript.md
  • Frequency: Every scheduled smoke test run (multiple times per day)
  • Duration: Ongoing since 2025-10-22 19:45 UTC (>10 hours)

Why This Needs Immediate Attention

  1. Security Risk: Threat detection is not running because the detection job crashes on every run
  2. False Negatives: Team may assume smoke tests are passing when they're actually failing
  3. Resource Waste: Every scheduled run consumes CI minutes while producing no value
  4. Pattern Established: Without intervention, this will fail indefinitely
  5. Simple Fix: A one-line configuration change resolves the failure

Recommended Actions

🔴 IMMEDIATE - Fix Configuration (5 minutes)

Update .github/workflows/shared/genaiscript.md line 6:

- GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"
+ GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4o"

🟡 SHORT-TERM - Prevent Recurrence

  1. Add Model Validation - Create a pre-flight check that validates model names before execution
  2. Update Documentation - Document valid model names in configuration files
  3. Monitor Pattern - Track if this issue pattern appears in other workflows
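
The pre-flight check from item 1 could look roughly like the sketch below. It assumes the model value uses the `provider:model` form shown in the configuration, and it takes its allowlist from the "Valid models" list in this issue rather than from any authoritative catalog. The names `ALLOWED_OPENAI_MODELS` and `checkModelSpec` are hypothetical.

```typescript
// Allowlist copied from the "Valid models" list in this issue.
// This is an assumption for illustration, not an authoritative catalog.
const ALLOWED_OPENAI_MODELS: readonly string[] = [
  "gpt-4o",
  "gpt-4-turbo",
  "gpt-4",
  "gpt-3.5-turbo",
];

interface ModelCheck {
  ok: boolean;
  reason: string;
}

// Validates a "provider:model" string before any workflow step uses it.
function checkModelSpec(spec: string): ModelCheck {
  const sep = spec.indexOf(":");
  if (sep <= 0 || sep === spec.length - 1) {
    return { ok: false, reason: `expected "provider:model", got "${spec}"` };
  }
  const provider = spec.slice(0, sep);
  const model = spec.slice(sep + 1);
  if (provider !== "openai") {
    return { ok: false, reason: `unsupported provider "${provider}"` };
  }
  if (ALLOWED_OPENAI_MODELS.indexOf(model) === -1) {
    return { ok: false, reason: `model "${model}" is not in the allowlist` };
  }
  return { ok: true, reason: "model accepted" };
}
```

A workflow step could call `checkModelSpec` on the value of `GH_AW_AGENT_MODEL_VERSION` and exit non-zero with `reason` when `ok` is false, so a bad value fails fast with a clear message. A static allowlist goes stale as providers add or retire models, so it would need regular maintenance.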

🟢 LONG-TERM - Systemic Improvements

  1. Schema Validation - Add JSON schema validation for workflow configurations
  2. Better Error Messages - Work with GenAIScript team to improve error handling
  3. Automated Alerts - Configure alerts for recurring failure patterns

Investigation Findings

Configuration Location

  • File: .github/workflows/shared/genaiscript.md
  • Line: 6
  • Variable: GH_AW_AGENT_MODEL_VERSION
  • Current Value: "openai:gpt-4.1"
  • Correct Value: "openai:gpt-4o"

Historical Pattern

{
  "pattern_signature": "GENAISCRIPT_INVALID_MODEL",
  "first_occurrence": "2025-10-22T19:45:52Z",
  "recurrence_count": 3,
  "days_recurring": 1,
  "previous_run_ids": [18727962258, 18733557489, 18739169072],
  "is_flaky": false,
  "external_dependency": "OpenAI API"
}
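
As a rough sketch of how a record like this might be maintained between runs, the type below mirrors the field names of the JSON above. The `recordOccurrence` helper and its rule for `days_recurring` (elapsed time since the first occurrence, rounded up to whole days, minimum 1) are assumptions for illustration; the issue does not say how Smoke Detector computes these fields.

```typescript
// Field names mirror the JSON record above.
interface FailurePattern {
  pattern_signature: string;
  first_occurrence: string; // ISO 8601 timestamp
  recurrence_count: number;
  days_recurring: number;
  previous_run_ids: number[];
  is_flaky: boolean;
  external_dependency: string;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns an updated copy of the record for a new
// failing run. A run ID that is already recorded leaves the record unchanged.
function recordOccurrence(
  pattern: FailurePattern,
  runId: number,
  seenAt: string,
): FailurePattern {
  if (pattern.previous_run_ids.indexOf(runId) !== -1) {
    return pattern;
  }
  const elapsedMs = Date.parse(seenAt) - Date.parse(pattern.first_occurrence);
  return {
    ...pattern,
    recurrence_count: pattern.recurrence_count + 1,
    // Assumed rule: whole days since the first occurrence, at least 1.
    days_recurring: Math.max(1, Math.ceil(elapsedMs / MS_PER_DAY)),
    previous_run_ids: pattern.previous_run_ids.concat(runId),
  };
}
```

Skipping run IDs that are already recorded means a re-run of the investigation does not inflate `recurrence_count`.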

Previous Issue Context

Issue #2157 documented this exact problem but was closed as "not_planned" without implementing a fix. The investigation was thorough and accurate, but the root cause was not addressed, leading to these continued failures.

This issue reopens the discussion with emphasis on:

  • The recurring nature (3 occurrences)
  • The security implications (disabled threat detection)
  • The simplicity of the fix (one line change)
  • The cost of inaction (ongoing CI failures)

Reproduction Steps

  1. Configure GenAIScript with model: openai:gpt-4.1
  2. Run any GenAIScript-based workflow
  3. Observe failure when invalid model is used
  4. See TypeError when accessing undefined result

Related Issues

  • #2157 - original report of this failure (closed as "not_planned")

Investigation Metadata

  • Investigator: Smoke Detector (Failure Investigation Agent)
  • Investigation Run: #18739232833
  • Pattern: GENAISCRIPT_INVALID_MODEL (3rd occurrence)
  • Investigation Record: /tmp/gh-aw/cache-memory/investigations/2025-10-23-18739169072.json
  • Created: 2025-10-23T06:12:00Z

AI generated by Smoke Detector - Smoke Test Failure Investigator
