FIRE-665 | CICD | Fix rogue CI/CD + Documentation #23
Walkthrough
Refactored the GitHub Actions workflow to run the tshirt agent and Rogue server via consolidated inline bash (background processes with PID traps and curl readiness polling), switched judge model to …
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant GH as GitHub Actions Runner
participant AG as tshirt_store_agent (localhost:10001)
participant RS as Rogue server (localhost:8000)
participant RG as Rogue CLI
participant OA as LLM (openai/o4-mini)
GH->>GH: Start agent in background (capture AGENT_PID, trap on EXIT)
GH->>AG: Poll `/.well-known/agent.json` until ready (curl & retries)
GH->>GH: Start Rogue server in background (capture ROGUE_PID, trap on EXIT)
GH->>RS: Poll `/api/v1/health` until ready (curl & retries)
GH->>RG: Run `uv run -m rogue cli --evaluated-agent-url http://localhost:10001 --judge-llm openai/o4-mini --workdir ...`
RG->>AG: Send evaluation requests
RG->>OA: Request judgments
OA-->>RG: Return judgments/results
RG-->>GH: Exit (artifacts/exit code)
GH->>AG: Trap: terminate agent (cleanup)
GH->>RS: Trap: terminate Rogue server
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks (3 passed)
✅ Passed checks (3 passed)
…m:qualifire-dev/rogue-private into feature/gh-action-docs-and-fixes-FIRE-665
7c16d63 to 112d1dc
Actionable comments posted: 3
🧹 Nitpick comments (5)
rogue/github_action_docs.md (3)
3-4: Minor grammar fix for clearer reading
“a GitHub Actions” → “a GitHub Actions workflow”.
-In this example, we use a GitHub Actions, but this should be easily adaptable to other CI/CD platforms.
+In this example, we use a GitHub Actions workflow, but this should be easily adaptable to other CI/CD platforms.
27-30: Tighten wording & remove provider-specific noise
Listing every provider clutters the doc; a generic reminder is enough.
-2. Don't forget to set the llm provider api key (OPENAI_API_KEY / ANTHROPIC_API_KEY / GEMINI_API_KEY / gcloud login / etc) in your CI/CD environment.
+2. Ensure the required LLM provider credentials (e.g., `OPENAI_API_KEY`) are available in your CI/CD environment.
84-86: Consider bounding the readiness wait to avoid endless loops
`--retry 10 --retry-delay 5` is good, but add `--max-time` or `--retry-max-time` so the step times out deterministically.
.github/workflows/rogue.yml (2)
34-42: Agent bind address vs readiness URL
The agent is started with `--host 0.0.0.0` but probed on `localhost`. That works under normal Docker-less runners, yet in container-based jobs `localhost` may not map. Prefer probing the same address you bind.
46-49: Pass `--exit-on-failure` (or equivalent) so CI fails on Rogue test errors
If Rogue’s CLI returns non-zero only on internal errors, consider an explicit flag that forces failure on unmet assertions to ensure the pipeline blocks regressions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
📒 Files selected for processing (3)
.github/actions/rogue/action.yml (0 hunks)
.github/workflows/rogue.yml (2 hunks)
rogue/github_action_docs.md (1 hunks)
💤 Files with no reviewable changes (1)
- .github/actions/rogue/action.yml
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
PR: qualifire-dev/qualifire#0
File: .cursor/rules/python.mdc:0-0
Timestamp: 2025-07-01T11:54:30.386Z
Learning: Use proper CI/CD practices
🔇 Additional comments (1)
.github/workflows/rogue.yml (1)
12-14: Timeout may be too aggressive
The docs recommend 60 minutes; the workflow sets 15 minutes. Rogue runs can legitimately exceed 15 minutes, causing false CI failures. Align the two values or justify the shorter window.
Actionable comments posted: 0
♻️ Duplicate comments (1)
.github/workflows/rogue.yml (1)
39-45: Enable strict bash mode and ensure the agent process is reliably reaped (include wait in trap)
Fail fast on errors/undef vars and guarantee the background agent is terminated and waited for to avoid orphans. This also addresses a prior suggestion.
   OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
 run: |
-  echo "🚀 Starting AI agent..."
+  set -Eeuo pipefail
+  : "${OPENAI_API_KEY:?OPENAI_API_KEY secret is not set}"
+  echo "🚀 Starting AI agent..."
   uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
   AGENT_PID=$!
   echo "Agent started with PID: $AGENT_PID"
-  trap 'echo "🛑 Stopping agent..."; kill $AGENT_PID' EXIT
+  trap 'echo "🛑 Stopping agent..."; kill -TERM "$AGENT_PID" 2>/dev/null || true; wait "$AGENT_PID" 2>/dev/null || true' INT TERM EXIT
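The effect of adding `wait` to the trap can be tried in isolation. The following is a minimal sketch, not the workflow itself: `sleep` stands in for the agent process and the function name `run_step` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: `sleep` stands in for the real agent, `run_step` is a hypothetical name.
run_step() (
  set -Eeuo pipefail

  sleep 300 >/dev/null 2>&1 &   # placeholder for the agent start command
  AGENT_PID=$!

  cleanup() {
    # Ask the process to stop, then wait so it is reaped before the step ends.
    kill -TERM "$AGENT_PID" 2>/dev/null || true
    wait "$AGENT_PID" 2>/dev/null || true
  }
  trap cleanup EXIT

  # The readiness probe and the evaluation would run here.
  echo "$AGENT_PID"
)
```

Because the function body is a subshell, the EXIT trap fires when `run_step` returns, and the background process should no longer exist afterwards.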
🧹 Nitpick comments (3)
.github/workflows/rogue.yml (3)
46-49: Harden readiness probe for flakiness
Use 127.0.0.1 (avoids IPv6 edge cases), show errors, and broaden retry criteria with modest timeouts.
-  curl --retry 10 --retry-delay 5 --retry-connrefused -s --fail -o /dev/null \
-    http://localhost:10001/.well-known/agent.json
+  curl --retry 18 --retry-delay 3 --retry-all-errors --connect-timeout 2 --max-time 5 -sS -f -o /dev/null \
+    http://127.0.0.1:10001/.well-known/agent.json
51-55: Optional: add a final explicit wait for symmetry
Not strictly required if the trap includes wait (above), but placing a final wait after Rogue exits makes teardown order explicit and keeps logs tidy.
     --judge-llm-model openai/o4-mini \
     --workdir './examples/tshirt_store_agent/.rogue'
+
+  # Ensure background agent is fully reaped before step exits
+  wait "$AGENT_PID" 2>/dev/null || true
3-9: Tighten workflow security and ergonomics: least-privilege permissions + PR concurrency
Helps reduce token scope and avoids piling up old runs when pushing iteratively to a PR.
 on:
   pull_request:
   push:
     branches:
       - main
   workflow_dispatch:
+permissions:
+  contents: read
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
📒 Files selected for processing (2)
.github/workflows/rogue.yml (2 hunks)
examples/tshirt_store_agent/tshirt_store_agent.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/tshirt_store_agent/tshirt_store_agent.py
🧰 Additional context used
📓 Path-based instructions (1)
.github/**
📄 CodeRabbit inference engine (AGENTS.md)
Keep CI/CD workflow files under .github/
Files:
.github/workflows/rogue.yml
🔇 Additional comments (1)
.github/workflows/rogue.yml (1)
11-13: Job rename looks good
The new job id `run-rogue` is clear and scoped to the single workflow. No issues here.
Actionable comments posted: 0
♻️ Duplicate comments (1)
.github/workflows/rogue.yml (1)
39-45: Enable strict bash mode and make trap reap the background agent
Add strict mode and a cleanup function that waits for the agent to exit; prevents orphaned/zombie processes and makes failures fail fast.
Apply:
-  run: |
-    echo "🚀 Starting AI agent..."
-    uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
-    AGENT_PID=$!
-    echo "Agent started with PID: $AGENT_PID"
-    trap 'echo "🛑 Stopping agent..."; kill $AGENT_PID' EXIT
+  run: |
+    set -Eeuo pipefail
+    echo "🚀 Starting AI agent..."
+    uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
+    AGENT_PID=$!
+    echo "Agent started with PID: $AGENT_PID"
+    cleanup() {
+      echo "🛑 Stopping agent..."
+      if kill -0 "$AGENT_PID" 2>/dev/null; then
+        kill "$AGENT_PID" || true
+        wait "$AGENT_PID" 2>/dev/null || true
+      fi
+    }
+    trap cleanup EXIT
🧹 Nitpick comments (4)
.github/workflows/rogue.yml (4)
41-41: Bind to loopback, not 0.0.0.0
No need to expose beyond the runner. Slightly safer and avoids IPv6/iptables surprises.
-    uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
+    uv run examples/tshirt_store_agent --host 127.0.0.1 --port 10001 &
47-49: Harden readiness probe
Broaden retriable errors, cap total wait, and surface response bodies on failure for easier debugging.
-    curl --retry 10 --retry-delay 5 --retry-connrefused -s --fail -o /dev/null \
-      http://localhost:10001/.well-known/agent.json
+    curl --retry 12 --retry-delay 5 --retry-all-errors --retry-max-time 75 \
+      -s --fail-with-body -o /dev/null \
+      http://127.0.0.1:10001/.well-known/agent.json
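Capping the total wait can also be done in the shell, independent of which curl retry flags the runner's curl version supports. This is a minimal sketch under that assumption; the helper name `wait_until` and its argument order are hypothetical.

```bash
# Sketch only: `wait_until` is a hypothetical helper, not part of the workflow.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  # SECONDS is a bash builtin counting seconds since the shell started.
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Example probe using the agent URL from the workflow; --max-time bounds each attempt:
# wait_until 75 5 curl -sS -f -o /dev/null --max-time 5 http://127.0.0.1:10001/.well-known/agent.json
```

The loop re-runs the probe until it succeeds or the deadline passes, so the step fails at a predictable time whatever error the probe hits.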
51-55: Make judge model configurable via env with a sane default
Avoid hardcoding the model; allow overriding without editing the workflow.
     echo "🚀 Running rogue..."
+    JUDGE_LLM="${JUDGE_LLM:-openai/o4-mini}"
     uv run -m rogue cli \
       --debug \
       --evaluated-agent-url http://localhost:10001 \
-      --judge-llm openai/o4-mini \
+      --judge-llm "${JUDGE_LLM}" \
       --workdir './examples/tshirt_store_agent/.rogue'
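The `${VAR:-default}` expansion used in this suggestion falls back both when the variable is unset and when it is set but empty. A small sketch of that behavior follows; the function name `resolve_judge_llm` is hypothetical.

```bash
# Sketch only: `resolve_judge_llm` is a hypothetical helper illustrating ${VAR:-default}.
resolve_judge_llm() {
  # Use the caller's JUDGE_LLM if it is set and non-empty; otherwise use the default.
  echo "${JUDGE_LLM:-openai/o4-mini}"
}
```

The empty-string case matters when the value is injected from a workflow expression that evaluates to nothing, since the step would then still use the default instead of passing an empty model name.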
1-9: Tighten default permissions and cancel superseded runs
Small hardening + cost saver for busy branches.
Add near the top-level:
permissions:
  contents: read
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
📒 Files selected for processing (1)
.github/workflows/rogue.yml (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
.github/**
📄 CodeRabbit inference engine (AGENTS.md)
Keep CI/CD workflow files under .github/
Files:
.github/workflows/rogue.yml
🔇 Additional comments (2)
.github/workflows/rogue.yml (2)
11-13: Rename + timeout look good
Job name and 15-minute cap are sensible for CI stability.
18-34: Add CI debugging step to verify `uv` uses the `.venv`
In `.github/workflows/rogue.yml`, immediately after the “Install rogue server” step, add:
- name: Debug uv venv resolution
  run: |
    uv --version
    uv run python -c 'import sys; print(sys.executable)'
    uv run python -c 'import importlib.util; print(bool(importlib.util.find_spec("rogue")))'
This will confirm that `uv run` is pointing at the project’s `.venv` and that `rogue` is importable.
Actionable comments posted: 0
♻️ Duplicate comments (2)
.github/workflows/rogue.yml (2)
39-41: Enable strict bash mode and activate the venv up front.
Prevents silent failures/unset var issues and ensures uv uses the intended venv.
 run: |
+  set -euo pipefail
+  source .venv/bin/activate
   echo "🚀 Starting AI agent..."
44-55: Second trap overrides the first; consolidate and wait to reap both PIDs.
Using two EXIT traps means the second one replaces the first; also neither waits. Consolidate and wait to avoid orphans and flakiness.
-  trap 'echo "🛑 Stopping agent..."; kill $AGENT_PID' EXIT
+  # single trap for both; terminate and wait
+  cleanup() {
+    echo "🛑 Stopping processes..."
+    kill -TERM "$ROGUE_PID" "$AGENT_PID" 2>/dev/null || true
+    wait "$ROGUE_PID" "$AGENT_PID" 2>/dev/null || true
+  }
+  trap cleanup EXIT INT TERM
 ...
-  trap 'echo "🛑 Stopping rogue server..."; kill $ROGUE_PID' EXIT
+  # (trap already set above)
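The override behavior described here is easy to reproduce. In this minimal sketch the function names `two_traps` and `one_trap` are hypothetical and `echo` stands in for the real kill commands.

```bash
# Sketch only: `echo` stands in for the kill commands; function names are hypothetical.

# Two EXIT traps: the second replaces the first, so only "stop server" is printed.
two_traps() (
  trap 'echo "stop agent"' EXIT
  trap 'echo "stop server"' EXIT
)

# One trap that calls a single cleanup function covers both processes.
one_trap() (
  cleanup() {
    echo "stop server"
    echo "stop agent"
  }
  trap cleanup EXIT
)
```

Running `two_traps` shows that the agent handler is silently dropped, which is the leak the comment points out.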
🧹 Nitpick comments (2)
.github/workflows/rogue.yml (2)
41-41: Bind to localhost in CI to reduce exposure surface.
No need for 0.0.0.0 on GitHub runners; localhost is sufficient.
-  uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
+  uv run examples/tshirt_store_agent --host 127.0.0.1 --port 10001 &
 ...
-  uv run -m rogue server --host 0.0.0.0 --port 8000 &
+  uv run -m rogue server --host 127.0.0.1 --port 8000 &
Also applies to: 51-51, 64-66
35-38: Skip the step cleanly when OPENAI_API_KEY is unavailable (e.g., forked PRs).
Prevents red CI on external PRs while keeping security posture intact.
-  - name: Run rogue
+  - name: Run rogue
+    if: ${{ secrets.OPENAI_API_KEY != '' }}
     shell: bash
     env:
       OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
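GitHub's documentation notes that secrets cannot be referenced directly in `if:` conditionals, so the condition above may not evaluate as intended. One alternative is to check inside the step script, since the secret is already mapped to an environment variable there. A minimal sketch, with the hypothetical helper name `should_run_rogue`:

```bash
# Sketch only: `should_run_rogue` is a hypothetical helper; the step's `env:` block
# is assumed to map the secret to OPENAI_API_KEY as in the workflow.
should_run_rogue() {
  if [[ -z "${OPENAI_API_KEY:-}" ]]; then
    echo "OPENAI_API_KEY is not available; skipping Rogue evaluation." >&2
    return 1
  fi
}

# At the top of the step script:
# should_run_rogue || exit 0
```

Exiting with status 0 keeps the check green on forked PRs, at the cost of the step appearing as passed rather than skipped.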
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
📒 Files selected for processing (1)
.github/workflows/rogue.yml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
.github/**
📄 CodeRabbit inference engine (AGENTS.md)
Keep CI/CD workflow files under .github/
Files:
.github/workflows/rogue.yml
🔇 Additional comments (1)
.github/workflows/rogue.yml (1)
39-66: The workflow file was not found in the sandbox environment, so I can’t confirm how Python and `uv` are set up in that CI job. Please verify in your repository that the GitHub Actions job:
- Creates or activates the intended virtual environment (e.g. runs `python -m venv .venv` / `pipenv shell` / `poetry install`) before invoking `uv run`.
- Uses the correct interpreter (print `which python` and `which uv` just before `uv run`) to ensure editable installs are picked up.
Add these checks or activate the venv explicitly in the workflow.
Actionable comments posted: 0
♻️ Duplicate comments (1)
rogue/github_action_docs.md (1)
80-116: Single trap is overwritten; leaked background processes + missing strict mode.
Two EXIT traps (Lines 88–90 and 101–102) override each other, so only the Rogue server gets cleaned up; the agent can be orphaned. Add strict mode, consolidate cleanup into one trap, tighten curl readiness probes, and prefer localhost binding.
Apply:
-  run: |
-    echo "🚀 Starting AI agent..."
+  run: |
+    set -Eeuo pipefail
+    IFS=$'\n\t'
+    echo "🚀 Starting AI agent..."
     # Command to start your specific AI agent
     # ----> Notice! Replace with your own agent start command <----
-    uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
+    uv run examples/tshirt_store_agent --host 127.0.0.1 --port 10001 &
     AGENT_PID=$!
     echo "Agent started with PID: $AGENT_PID"
-    # This trap ensures the agent process is killed when the script exits, even if an error occurs.
-    trap 'echo "🛑 Stopping agent..."; kill $AGENT_PID' EXIT
+    cleanup() {
+      echo "🛑 Cleaning up..."
+      if [[ -n "${ROGUE_PID:-}" ]]; then kill -TERM "$ROGUE_PID" 2>/dev/null || true; fi
+      if [[ -n "${AGENT_PID:-}" ]]; then kill -TERM "$AGENT_PID" 2>/dev/null || true; fi
+      # Wait for any remaining background jobs to finish or be reaped
+      wait 2>/dev/null || true
+    }
+    trap cleanup EXIT
     echo "⏳ Waiting for agent to be ready..."
     # ----> Update this URL if needed. <----
-    curl --retry 10 --retry-delay 5 --retry-connrefused -s --fail -o /dev/null \
+    curl --retry 20 --retry-delay 3 --retry-connrefused --connect-timeout 2 --max-time 120 -s --fail -o /dev/null \
       http://localhost:10001/.well-known/agent.json
     echo "🚀 Starting rogue server..."
     # ----> Rogue server runs on port 8000 by default <----
-    uv run -m rogue server --host 0.0.0.0 --port 8000 &
+    uv run -m rogue server --host 127.0.0.1 --port 8000 &
     ROGUE_PID=$!
     echo "Rogue server started with PID: $ROGUE_PID"
-    trap 'echo "🛑 Stopping rogue server..."; kill $ROGUE_PID' EXIT
     echo "⏳ Waiting for rogue server to be ready..."
     # ----> Update this URL if needed. <----
-    curl --retry 10 --retry-delay 5 --retry-connrefused -s --fail -o /dev/null \
+    curl --retry 20 --retry-delay 3 --retry-connrefused --connect-timeout 2 --max-time 120 -s --fail -o /dev/null \
       http://localhost:8000/api/v1/health
     echo "🚀 Rogue server ready"
     echo "🚀 Running rogue..."
     # Rogue CLI command.
     # ----> Adjust the cli arguments based on your needs. See the CLI documentation for more details. <----
     uv run -m rogue cli \
       --evaluated-agent-url http://localhost:10001 \
       --judge-llm openai/o4-mini \
       --workdir './examples/tshirt_store_agent/.rogue'
+
+    echo "✅ Rogue finished – shutting down services"
+    cleanup
+    trap - EXIT
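The closing `cleanup` followed by `trap - EXIT` resets the handler so it does not run a second time when the script exits. A minimal sketch of that ordering, with `echo` standing in for the real shutdown and the hypothetical function name `run_once`:

```bash
# Sketch only: `echo` stands in for the real shutdown; `run_once` is a hypothetical name.
run_once() (
  cleanup() { echo "cleanup"; }
  trap cleanup EXIT

  echo "work"    # placeholder for the evaluation run
  cleanup        # explicit shutdown once the work is done
  trap - EXIT    # reset the EXIT trap so cleanup is not invoked again on exit
)
```

If the work fails before the explicit call, the trap is still armed and performs the shutdown instead.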
🧹 Nitpick comments (4)
rogue/github_action_docs.md (4)
67-75: Streamline uv usage; activation is unnecessary with uv and installs can be combined.
Reduce steps and speed up by letting uv target the project venv directly.
-  - name: Create venv
-    run: uv venv
-
-  - name: Install rogue sdk
-    run: source .venv/bin/activate && uv pip install -e sdks/python
-
-  - name: Install rogue server
-    run: source .venv/bin/activate && uv pip install -e .
+  - name: Prepare env and install
+    run: |
+      uv venv
+      uv pip install -e sdks/python -e .
43-55: Guard secrets on forked PRs; limit when the workflow runs with secrets.
This job requires OPENAI_API_KEY, which isn’t exposed to forked PRs. Add a job-level condition to skip on forks (or use pull_request_target with care).
 jobs:
   run-rogue:
     runs-on: ubuntu-latest
+    if: ${{ github.event_name != 'pull_request' || github.event.pull_request.head.repo.fork == false }}
     timeout-minutes: 15
30-34: Fix markdownlint MD026: remove trailing colon from heading.
Minor lint clean-up.
-## Notes:
+## Notes
3-4: Tighten wording.
Minor grammar/clarity improvement.
-You can use Rogue as part of your CI/CD pipeline to test your AI agent's functionality. \
-In this example, we use a GitHub Actions, but this should be easily adaptable to other CI/CD platforms.
+You can use Rogue as part of your CI/CD pipeline to test your AI agent's functionality. \
+In this example, we use a GitHub Actions workflow, but this should be easily adaptable to other CI/CD platforms.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
rogue/github_action_docs.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
rogue/github_action_docs.md
30-30: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: codestyle
- GitHub Check: rogue_sanity
🔇 Additional comments (1)
rogue/github_action_docs.md (1)
112-115: Verified: `--judge-llm` is the canonical CLI flag
The CLI defines `--judge-llm` in rogue/run_cli.py (lines 60–63). The docs snippet is correct; replace any occurrences of `--judge-llm-model` with `--judge-llm`.
Actionable comments posted: 0
♻️ Duplicate comments (1)
.github/workflows/rogue.yml (1)
39-66: Consolidate traps, add strict bash mode, and ensure clean shutdown; also activate the venv used for installs
Two EXIT traps override each other, so only the last one runs; background processes may be left running and not reaped. Add strict mode, a single cleanup trap that kills and waits for both PIDs, add curl timeouts, and source the created venv so uv run uses the same environment.
Apply:
@@
-  run: |
-    echo "🚀 Starting AI agent..."
-    uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
+  run: |
+    set -euo pipefail
+    IFS=$'\n\t'
+    # Ensure we use the same venv where deps were installed
+    source .venv/bin/activate
+
+    cleanup() {
+      local ec=$?
+      echo "🧹 Cleanup (exit code: $ec)"
+      set +e
+      if [[ -n "${ROGUE_PID:-}" ]] && kill -0 "$ROGUE_PID" 2>/dev/null; then
+        echo "🛑 Stopping rogue server..."
+        kill "$ROGUE_PID"
+        wait "$ROGUE_PID" || true
+      fi
+      if [[ -n "${AGENT_PID:-}" ]] && kill -0 "$AGENT_PID" 2>/dev/null; then
+        echo "🛑 Stopping agent..."
+        kill "$AGENT_PID"
+        wait "$AGENT_PID" || true
+      fi
+      exit $ec
+    }
+    trap cleanup EXIT INT TERM
+
+    echo "🚀 Starting AI agent..."
+    uv run examples/tshirt_store_agent --host 0.0.0.0 --port 10001 &
     AGENT_PID=$!
     echo "Agent started with PID: $AGENT_PID"
-    trap 'echo "🛑 Stopping agent..."; kill $AGENT_PID' EXIT
     echo "⏳ Waiting for agent to be ready..."
-    curl --retry 10 --retry-delay 5 --retry-connrefused -s --fail -o /dev/null \
+    curl --retry 10 --retry-delay 5 --retry-connrefused --connect-timeout 2 --max-time 5 -s --fail -o /dev/null \
       http://localhost:10001/.well-known/agent.json
     echo "🚀 Starting rogue server..."
     uv run -m rogue server --host 0.0.0.0 --port 8000 &
     ROGUE_PID=$!
     echo "Rogue server started with PID: $ROGUE_PID"
-    trap 'echo "🛑 Stopping rogue server..."; kill $ROGUE_PID' EXIT
     echo "⏳ Waiting for rogue server to be ready..."
-    curl --retry 10 --retry-delay 5 --retry-connrefused -s --fail -o /dev/null \
+    curl --retry 10 --retry-delay 5 --retry-connrefused --connect-timeout 2 --max-time 5 -s --fail -o /dev/null \
       http://localhost:8000/api/v1/health
     echo "🚀 Rogue server ready"
-
+
     echo "🚀 Running rogue..."
     uv run -m rogue cli \
       --evaluated-agent-url http://localhost:10001 \
       --judge-llm openai/o4-mini \
       --workdir './examples/tshirt_store_agent/.rogue'
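The `local ec=$?` / `exit $ec` pair in this suggestion is what keeps a failing evaluation red after cleanup has run. A minimal sketch of that mechanism, where `exit 3` stands in for a failing Rogue run and the function name `failing_step` is hypothetical:

```bash
# Sketch only: `exit 3` stands in for a failing evaluation; `failing_step` is hypothetical.
failing_step() (
  cleanup() {
    local ec=$?                      # status the shell is exiting with
    echo "cleanup (exit code: $ec)"
    exit "$ec"                       # re-raise it so the step still fails
  }
  trap cleanup EXIT

  exit 3
)
```

Capturing `$?` on the first line of the handler matters, because any later command in `cleanup` would overwrite it.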
🧹 Nitpick comments (1)
.github/workflows/rogue.yml (1)
60-60: Fix trailing whitespace to satisfy yamllint
Remove the trailing-spaces-only line after the “Rogue server ready” echo.
-
+
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/rogue.yml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
.github/**
📄 CodeRabbit inference engine (AGENTS.md)
Keep CI/CD workflow files under .github/
Files:
.github/workflows/rogue.yml
🪛 YAMLlint (1.37.1)
.github/workflows/rogue.yml
[error] 60-60: trailing spaces
(trailing-spaces)
🔇 Additional comments (1)
.github/workflows/rogue.yml (1)
64-65: Verified: `openai/o4-mini` is a valid `--judge-llm` value
Rogue CLI accepts LiteLLM-format model identifiers and lists openai/o4-mini as supported.
File: .github/workflows/rogue.yml (lines 64–65)
    --judge-llm openai/o4-mini \
    --workdir './examples/tshirt_store_agent/.rogue'
Summary by CodeRabbit
New Features
Documentation
Chores
Refactor