
ci: enable agentic provider live tests (claude-code, codex, gemini-cli)#7088

Merged
codefromthecrypt merged 1 commit into main from ci-agentic-providers
Feb 10, 2026

@codefromthecrypt codefromthecrypt commented Feb 9, 2026

Summary

Enable agentic provider live tests in CI. These providers were conditionally included in test_providers.sh but always skipped because the CLI tools weren't installed.

This sets up claude-code, codex, and gemini-cli because we already have credentials for them. It does not set up cursor-agent, which needs authentication we don't have yet. Someone could follow up on that.

Here are the main changes:

  • Install @anthropic-ai/claude-code, @openai/codex, @google/gemini-cli via npm in both smoke test jobs
  • Map existing creds to CODEX_API_KEY and GEMINI_API_KEY env vars
  • Agentic providers use a file-read prompt with known content marker (test-content-abc123) instead of the shell/ls prompt, since they handle tools internally and can't produce shell | developer log patterns
  • Skip agentic providers in --code-exec mode, as we currently don't export our extensions to agents as MCP.
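
The agentic-provider handling listed above could be sketched in shell roughly as follows. This is a minimal illustration, not the actual contents of scripts/test_providers.sh: the function names (`is_agentic`, `should_skip`, `output_has_marker`, `make_marker_file`) and the `CODE_EXEC` variable are hypothetical. Only the provider names and the marker string come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; these names are illustrative and not taken from test_providers.sh.

MARKER="test-content-abc123"   # known content marker from the PR description

# Succeeds when the provider is one of the agentic CLIs enabled by this PR.
is_agentic() {
  case "$1" in
    claude-code|codex|gemini-cli) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when the provider should be skipped: agentic providers are not
# run in code-exec mode (assumed here to be signalled by CODE_EXEC=1).
should_skip() {
  [ "${CODE_EXEC:-0}" = "1" ] && is_agentic "$1"
}

# Succeeds when the captured provider output contains the marker.
output_has_marker() {
  printf '%s\n' "$1" | grep -qF -- "$MARKER"
}

# Writes the marker to a temp file and prints the path, so that a
# file-read prompt can point the agent at it.
make_marker_file() {
  marker_path="$(mktemp)"
  printf '%s\n' "$MARKER" > "$marker_path"
  printf '%s\n' "$marker_path"
}
```

A caller would create the marker file, ask the agent to read it, and pass the captured output to `output_has_marker`. Non-agentic providers would keep the existing shell/ls prompt and log-pattern check.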

Type of Change

  • Tests
  • Build / Release

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

Verified all three agentic providers pass in CI: https://github.com/block/goose/actions/runs/21844636599

2026-02-09T23:20:41.0261178Z ✓ claude-code: claude-sonnet-4-20250514
2026-02-09T23:20:41.0261740Z ✓ codex: gpt-5.2-codex
2026-02-09T23:20:41.0262283Z ✓ gemini-cli: gemini-2.5-pro

Install CLI tools via npm in smoke test jobs and add CODEX_API_KEY and
GEMINI_API_KEY env var mappings. Agentic providers use a file-read prompt
with a known content marker instead of the shell/ls prompt, since they
handle tools internally. Skip agentic providers in --code-exec mode.

Signed-off-by: Adrian Cole <adrian@tetrate.io>
@codefromthecrypt codefromthecrypt marked this pull request as ready for review February 9, 2026 23:45
Copilot AI review requested due to automatic review settings February 9, 2026 23:45

Copilot AI left a comment

Pull request overview

Enables live CI coverage for agentic CLI-based providers by installing their CLIs in smoke-test jobs and adapting the provider test script to validate agentic behavior via a deterministic file-read prompt instead of tool-call log patterns.

Changes:

  • Install @anthropic-ai/claude-code, @openai/codex, and @google/gemini-cli in PR smoke-test workflows and map existing secrets to CODEX_API_KEY / GEMINI_API_KEY.
  • Update scripts/test_providers.sh to detect agentic providers and validate success by checking for known file content output.
  • Skip agentic providers when running in --code-exec mode.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scripts/test_providers.sh | Adds agentic-provider detection and switches agentic verification to a file-read/content-marker check; skips agentic providers in code-exec mode. |
| .github/workflows/pr-smoke-test.yml | Installs Node + agentic provider CLIs in smoke-test jobs and exports additional env vars for Codex/Gemini CLIs. |
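
As a rough sketch of what the workflow setup amounts to, the install and credential mapping might look like the shell below. The package names and the `CODEX_API_KEY` / `GEMINI_API_KEY` targets come from this PR. The source variables `OPENAI_API_KEY` and `GOOGLE_API_KEY` and both function names are assumptions for illustration, since the PR does not say which existing secrets are mapped.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the smoke-test setup; not copied from pr-smoke-test.yml.

# Installs the three agentic CLIs globally. Package names are the ones
# listed in this PR. Not invoked automatically by this sketch.
install_agentic_clis() {
  npm install -g @anthropic-ai/claude-code @openai/codex @google/gemini-cli
}

# Maps existing credentials onto the env vars exported for the Codex and Gemini CLIs.
# Assumption: the existing credentials live in OPENAI_API_KEY and
# GOOGLE_API_KEY; the PR only says "existing creds" are mapped.
map_agentic_creds() {
  export CODEX_API_KEY="${CODEX_API_KEY:-${OPENAI_API_KEY:-}}"
  export GEMINI_API_KEY="${GEMINI_API_KEY:-${GOOGLE_API_KEY:-}}"
}
```

In a GitHub Actions job the same mapping would more likely be written as `env:` entries that reference repository secrets. The function form is used here only to keep the example self-contained.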


@michaelneale michaelneale left a comment

good to have these - not sure if copilot feedback is helpful or gaslighting in this case.

@codefromthecrypt

yeah copilot was pretty horrible this time. complete hallucination!

@codefromthecrypt codefromthecrypt added this pull request to the merge queue Feb 10, 2026
Merged via the queue into main with commit 3a304c6 Feb 10, 2026
25 checks passed
@codefromthecrypt codefromthecrypt deleted the ci-agentic-providers branch February 10, 2026 03:06
zanesq added a commit that referenced this pull request Feb 10, 2026
…tensions-deeplinks

* 'main' of github.com:block/goose:
  [docs] update authors.yaml file (#7114)
  Implement manpage generation for goose-cli (#6980)
  docs: tool output optimization (#7109)
  Fix duplicated output in Code Mode by filtering content by audience (#7117)
  Enable tom (Top Of Mind) platform extension by default (#7111)
  chore: added notification for canary build failure (#7106)
  fix: fix windows bundle random failure and optimise canary build (#7105)
  feat(acp): add model selection support for session/new and session/set_model (#7112)
  fix: isolate claude-code sessions via stream-json session_id (#7108)
  ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088)
  docs: codex subscription support (#7104)
  chore: add a new scenario (#7107)
  fix: Goose Desktop missing Calendar and Reminders entitlements (#7100)
  Fix 'Edit In Place' and 'Fork Session' features (#6970)
  Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082)
  Docs: require auth optional for custom providers (#7098)
  fix: improve text-muted contrast for better readability (#7095)
  Always sync bundled extensions (#7057)
tlongwell-block added a commit that referenced this pull request Feb 10, 2026
* origin/main:
  feat: add AGENT=goose environment variable for cross-tool compatibility (#7017)
  fix: strip empty extensions array when deeplink also (#7096)
  [docs] update authors.yaml file (#7114)
  Implement manpage generation for goose-cli (#6980)
  docs: tool output optimization (#7109)
  Fix duplicated output in Code Mode by filtering content by audience (#7117)
  Enable tom (Top Of Mind) platform extension by default (#7111)
  chore: added notification for canary build failure (#7106)
  fix: fix windows bundle random failure and optimise canary build (#7105)
  feat(acp): add model selection support for session/new and session/set_model (#7112)
  fix: isolate claude-code sessions via stream-json session_id (#7108)
  ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088)
  docs: codex subscription support (#7104)
  chore: add a new scenario (#7107)
  fix: Goose Desktop missing Calendar and Reminders entitlements (#7100)
  Fix 'Edit In Place' and 'Fork Session' features (#6970)
  Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082)

# Conflicts:
#	crates/goose/src/agents/extension.rs
jh-block added a commit that referenced this pull request Feb 10, 2026
* origin/main: (30 commits)
  docs: GCP Vertex AI org policy filtering & update OnboardingProviderSetup component (#7125)
  feat: replace subagent and skills with unified summon extension (#6964)
  feat: add AGENT=goose environment variable for cross-tool compatibility (#7017)
  fix: strip empty extensions array when deeplink also (#7096)
  [docs] update authors.yaml file (#7114)
  Implement manpage generation for goose-cli (#6980)
  docs: tool output optimization (#7109)
  Fix duplicated output in Code Mode by filtering content by audience (#7117)
  Enable tom (Top Of Mind) platform extension by default (#7111)
  chore: added notification for canary build failure (#7106)
  fix: fix windows bundle random failure and optimise canary build (#7105)
  feat(acp): add model selection support for session/new and session/set_model (#7112)
  fix: isolate claude-code sessions via stream-json session_id (#7108)
  ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088)
  docs: codex subscription support (#7104)
  chore: add a new scenario (#7107)
  fix: Goose Desktop missing Calendar and Reminders entitlements (#7100)
  Fix 'Edit In Place' and 'Fork Session' features (#6970)
  Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082)
  Docs: require auth optional for custom providers (#7098)
  ...
Tyler-Hardin pushed a commit to Tyler-Hardin/goose that referenced this pull request Feb 11, 2026
…i) (block#7088)

Signed-off-by: Adrian Cole <adrian@tetrate.io>