refactor: standardize Codex agent execution pattern #70

Merged
arittr merged 4 commits into 750f25-task-1-extract-prompt-module from 750f25-task-2-codex-standardization
Nov 3, 2025

Conversation

Owner

arittr commented Nov 3, 2025

Refactor Codex agent to use direct CLI execution with stdin (matching Claude/Gemini pattern) instead of temp file I/O, reducing complexity and improving consistency.

Changes:

  • Updated executeCommand() to use exec('codex', ['exec', '--skip-git-repo-check'], { input: prompt }) pattern
  • Removed temp file logic: _readOutput(), _cleanupTempFile() methods (~50 LOC reduction)
  • Codex agent now ~110 LOC (down from ~160 LOC)
  • Updated tests to reflect new execution pattern (no temp file mocking)
  • All three agents (Claude, Codex, Gemini) now have identical test structure
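
The stdin pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `exec` helper here is a hypothetical stand-in built on Node's `spawnSync`, and the demonstration pipes input through `cat` (assuming a POSIX environment) so it runs without the Codex CLI installed.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: runs a CLI and feeds `input` to its stdin.
// The project's real exec() helper may differ; this only mirrors the call shape.
function exec(command: string, args: string[], options: { input: string }): string {
  const result = spawnSync(command, args, {
    input: options.input, // written to the child's stdin, so no temp file is needed
    encoding: "utf8",
  });
  if (result.error) {
    throw result.error; // e.g. ENOENT when the binary is not installed
  }
  if (result.status !== 0) {
    throw new Error(`${command} exited with code ${result.status}: ${result.stderr}`);
  }
  return result.stdout;
}

// Stand-in demonstration: `cat` echoes stdin back to stdout.
const echoed = exec("cat", [], { input: "feat: add login" });
console.log(echoed);

// The Codex call from this PR has the same shape (requires the codex CLI):
// const message = exec("codex", ["exec", "--skip-git-repo-check"], { input: prompt });
```

Because the prompt travels over stdin and the result comes back on stdout, there is nothing left to read from disk or clean up afterwards, which is what makes `_readOutput()` and `_cleanupTempFile()` unnecessary.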

Acceptance Criteria Met:

  • ✅ Codex uses stdin pattern (no temp files)
  • ✅ File handling methods removed (~50 LOC reduction)
  • ✅ Codex agent simplified to ~110 LOC
  • ✅ Tests updated to stdin pattern
  • ✅ All agents have identical test structure

arittr and others added 3 commits November 3, 2025 12:17
Remove dual-mode complexity by eliminating manual fallback mode. Add quiet flag for suppressing progress messages.

Changes:
- Removed --no-ai flag from CLI and schemas
- Added --quiet flag for suppressing stderr output
- Removed enableAI config option from generator
- Removed manual fallback methods from generator (~150 LOC)
- Generator always uses AI; agent failures throw errors
- Default behavior shows generating message in all modes
- Error messages provide clear installation instructions
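
As a rough sketch of the behavior listed above (progress on stderr unless `--quiet`, and agent failures surfacing as errors instead of falling back to a manual mode): the `Agent` and `GenerateOptions` shapes, the `generateMessage` name, and the message wording are all hypothetical, and the agent call is synchronous here only to keep the example short.

```typescript
// Hypothetical shapes; the generator's real interfaces are not shown in this PR.
interface Agent {
  name: string;
  generate(prompt: string): string;
}

interface GenerateOptions {
  quiet?: boolean;               // mirrors the --quiet flag
  log?: (line: string) => void;  // defaults to stderr
}

function generateMessage(agent: Agent, prompt: string, options: GenerateOptions = {}): string {
  const log = options.log ?? ((line: string) => console.error(line));
  if (!options.quiet) {
    // Progress goes to stderr so stdout stays clean for the generated message.
    log(`Generating commit message with ${agent.name}...`);
  }
  try {
    return agent.generate(prompt);
  } catch (cause) {
    // No manual fallback: wrap the failure and point the user at installation.
    const reason = cause instanceof Error ? cause.message : String(cause);
    throw new Error(
      `${agent.name} failed: ${reason}. Check that the ${agent.name} CLI is installed and on your PATH.`,
    );
  }
}

// Example with a fake agent that always succeeds.
const fakeAgent: Agent = { name: "codex", generate: () => "feat: add login" };
console.log(generateMessage(fakeAgent, "diff goes here", { quiet: true }));
```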

Acceptance Criteria Met:
- ✅ --no-ai flag removed
- ✅ --quiet flag added and working
- ✅ enableAI config removed
- ✅ Manual fallback methods removed (~150 LOC)
- ✅ Generator always uses AI
- ✅ Default shows progress message
- ✅ Clear error messages
- ✅ Net ~200 LOC reduction

🤖 Generated with Claude via commitment

Co-Authored-By: Claude <noreply@anthropic.com>

Document breaking changes with clear migration instructions. Update all docs to reflect new AI-only architecture.

- Added CHANGELOG section for breaking changes and migration guide
- Provided installation links for Claude, Codex, Gemini CLIs
- Removed all --no-ai references from README
- Documented --quiet flag with examples
- Updated CLAUDE.md to reflect AI-only architecture and prompts module
- All documentation accurate and up-to-date

Acceptance Criteria Met:
- ✅ CHANGELOG documents breaking change with migration
- ✅ CHANGELOG includes CLI installation links
- ✅ README removes --no-ai references
- ✅ README documents --quiet flag
- ✅ CLAUDE.md reflects new architecture
- ✅ All documentation accurate

Fix mock pollution in shell integration tests by using cache-busting dynamic imports.

Issue: Agent unit tests use mock.module(), which creates persistent module mocks. Even after mock.restore(), Bun's module cache still serves the mocked version to subsequent imports.

Solution:
- Dynamically import shell module in beforeAll hook after calling mock.restore()
- Add timestamp query parameter to force fresh import and bypass module cache
- Rename test file to zzz-shell.integration.test.ts for consistent ordering
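
The cache-busting import in the solution above might look roughly like this. The helper names (`cacheBustedSpecifier`, `freshImport`) and the `../shell` path are hypothetical; the sketch assumes, as the commit describes, that the runtime treats a specifier with a new query string as a module it has not loaded yet.

```typescript
// Appends a timestamp query parameter so the loader sees a new specifier.
function cacheBustedSpecifier(specifier: string, now: number = Date.now()): string {
  const separator = specifier.includes("?") ? "&" : "?";
  return `${specifier}${separator}t=${now}`;
}

// Dynamically imports a fresh copy of a module instead of the cached one.
// Note: a relative specifier resolves against the file that defines this
// helper, so in practice it would live in (or next to) the test file.
async function freshImport<T>(specifier: string): Promise<T> {
  return (await import(cacheBustedSpecifier(specifier))) as T;
}

console.log(cacheBustedSpecifier("../shell", 1730635020000));

// Usage inside a Bun test file (not executed here; path is hypothetical):
// import { beforeAll, mock } from "bun:test";
// let shell: typeof import("../shell");
// beforeAll(async () => {
//   mock.restore();                       // drop mocks left by earlier test files
//   shell = await freshImport("../shell"); // then load an unmocked copy
// });
```

Doing the import inside `beforeAll`, after `mock.restore()`, matters because a static top-level import would be resolved before the hook runs and could still pick up the mocked module.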

Result: All 14 failing tests now pass
- Before: 568 pass, 14 fail
- After: 582 pass, 0 fail

Tests verified: bun test completes successfully
arittr merged commit f8849ec into 750f25-task-1-extract-prompt-module Nov 3, 2025