Skip to content

FIRE-812 | Rogue-TUI | MCP Support#115

Merged
yuval-qf merged 4 commits intomainfrom
feature/FIRE-812-mcp-tui
Oct 23, 2025
Merged

FIRE-812 | Rogue-TUI | MCP Support#115
yuval-qf merged 4 commits intomainfrom
feature/FIRE-812-mcp-tui

Conversation

@yuval-qf
Copy link
Collaborator

@yuval-qf yuval-qf commented Oct 22, 2025

Description

Motivation and Context

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • 🎨 Code style/refactoring (no functional changes)
  • 🧪 Test updates
  • 🔧 Configuration/build changes

Changes Made

Screenshots/Examples (if applicable)

Checklist

  • I have read the CONTRIBUTING.md guide
  • My code follows the code style of this project (PEP 8, type hints, docstrings)
  • I have run uv run black . to format my code
  • I have run uv run flake8 . and fixed all issues
  • I have run uv run mypy --config-file .mypy.ini . and addressed type checking issues
  • I have run uv run bandit -c .bandit.yaml -r . for security checks
  • I have added tests that prove my fix is effective or that my feature works
  • I have run uv run pytest and all tests pass
  • I have manually tested my changes
  • I have updated the documentation accordingly
  • I have added/updated type hints for new/modified functions
  • My changes generate no new warnings
  • I have checked my code for security issues
  • Any dependent changes have been merged and published

Testing

Test Configuration:

  • Python version:
  • OS:
  • Other relevant details:

Test Steps:
1.
2.
3.

Additional Notes

Related Issues/PRs

  • Fixes #
  • Related to #
  • Depends on #

@yuval-qf yuval-qf self-assigned this Oct 22, 2025
@yuval-qf yuval-qf requested a review from drorIvry as a code owner October 22, 2025 14:21
@yuval-qf yuval-qf had a problem deploying to rogue-sanity-ci-secrets October 22, 2025 14:21 — with GitHub Actions Failure
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 22, 2025

Summary by CodeRabbit

  • New Features
    • Added Protocol and Transport configuration options to the evaluation form
    • Added Deep Test toggle field with visual checkmark indicator when enabled
    • Enhanced form field styling with improved visual feedback for active fields and clearer navigation guidance

Walkthrough

The pull request extends the TUI evaluation form from four to six fields by introducing Protocol and Transport configuration dropdowns, adding specialized field renderers for text inputs, dropdowns, and toggles, and updating navigation and state management throughout the form controller layer to accommodate the expanded field layout.

Changes

Cohort / File(s) Summary
TUI Configuration & Debugging
\.gitignore, \.vscode/launch\.json
Adds Go TUI debugging artifact ignore rule and new "Rogue TUI" launch configuration for VSCode debugging of the main.go entry point.
TUI Form Rendering
packages/tui/internal/tui/eval_form\.go
Replaces single field-render helper with three specialized renderers (renderTextField, renderDropdownField, renderToggleField); adds Protocol, Transport, and Deep Test field support; introduces inline cursor visualization and enhanced active-field styling; expands form to six fields with updated button positioning.
TUI State & Types
packages/tui/internal/tui/eval_ui\.go, packages/tui/internal/tui/evaluation\.go
Introduces Protocol and Transport types with constants; extends EvaluationViewState with AgentProtocol and AgentTransport fields; updates StartEvaluation signature to accept protocol/transport parameters; adds enumeration and cycling helpers for protocol/transport navigation.
TUI Form Navigation & Input
packages/tui/internal/tui/common_controller\.go, packages/tui/internal/tui/eval_form_controller\.go, packages/tui/internal/tui/keyboard_controller\.go
Remaps field indexing from 0–3 to 0–5; updates up/down navigation to cycle through all six fields; refactors left/right movement to support dropdown cycling for Protocol/Transport; remaps tab, space, and backspace handlers to new field positions; initializes AgentProtocol and AgentTransport in evaluation state; adjusts paste handling to field-based logic.
Backend & CLI
rogue/evaluator_agent/base_evaluator_agent\.py, examples/mcp/tshirt_store_langgraph_mcp/__main__\.py
Updates _log_evaluation signature to accept \*\*kwargs for flexible logging context; refactors CLI transport option to use click.Choice for centralized validation.

Sequence Diagram(s)

sequenceDiagram
    participant User as User Input
    participant Form as Eval Form
    participant State as EvalViewState
    participant Renderer as Form Renderer

    User->>Form: Navigate Up/Down
    Form->>Form: Cycle currentField 0–5
    Form->>State: Update currentField

    User->>Form: Navigate Left/Right on Dropdown
    rect rgb(220, 240, 255)
        Note over Form,State: Protocol/Transport Fields
        Form->>Form: cycleProtocol() or cycleTransport()
        Form->>State: Update AgentProtocol/AgentTransport
    end

    User->>Form: Type in Text Field
    rect rgb(240, 220, 255)
        Note over Form,State: AgentURL/JudgeModel Fields
        Form->>Form: Append char to field text
        Form->>State: Update cursorPos
    end

    User->>Form: Space on Field 4
    rect rgb(255, 240, 220)
        Note over Form,State: Toggle Deep Test
        Form->>Form: Toggle DeepTest flag
        Form->>State: Update DeepTest
    end

    State->>Renderer: Render form with updated state
    Renderer->>User: Display six fields with active styling
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The diff spans multiple interrelated files with significant navigation and control-flow changes. Field indexing is remapped across three controller files, new types and helpers are introduced, form rendering logic is specialized, and state initialization is extended. While individual changes follow consistent patterns, the interconnected nature of form navigation, field routing, and index management across multiple files requires careful verification of correctness at each layer.

Possibly related PRs

  • hotfix/paste agent url #102: Modifies paste handling and cursor placement for form fields on NewEvaluationScreen, directly related to changes in common_controller.go and keyboard_controller.go.
  • FIRE-813 | Gradio UI | MCP Support #110: Adds Protocol and Transport support to the Gradio UI, parallel feature implementation across different UI frameworks.
  • FIRE-751 - report to BFF #37: Updates evaluation.go Model.StartEvaluation signature and EvaluationRequest fields, overlapping with the evaluation pipeline changes.

Suggested reviewers

  • osher-qualifire
  • drorIvry

Poem

🐰 Six fields now dance where once were four,
With Protocol and Transport at the door,
Dropdowns spin and text fields glow,
The form expands—watch the cursor flow! ✨
A toggle for testing deep and true,
Navigation reborn—see what TUI can do! 🎭

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description is largely incomplete and fails to meet the template requirements. While the author correctly marked "New feature" as the type of change and checked all checklist items, critical sections remain entirely blank: the Description section contains only placeholder comments with no actual content, Motivation and Context is empty, Changes Made lists no specific modifications, Testing provides no test configuration or steps, and Related Issues/PRs lacks any actual issue links. The description appears to be a template submission with checklist items marked but without the substantive information needed to understand the PR's purpose, scope, or how it was tested.
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "FIRE-812 | Rogue-TUI | MCP Support" is clearly related to the main changes in the changeset. The modifications across multiple TUI files introduce Protocol and Transport configuration fields, which are MCP (Model Context Protocol) concepts, directly supporting the stated goal of adding MCP support. The title is concise, includes the ticket reference, and avoids vague terminology. While the title could be more specific about the Protocol/Transport configuration aspect of the implementation, it successfully communicates the primary intent and a teammate scanning the history would understand this PR adds MCP capabilities to the Rogue TUI.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/FIRE-812-mcp-tui

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
packages/tui/internal/tui/eval_form.go (1)

61-67: Clamp cursor position to avoid negative slice panics

If m.evalState.cursorPos becomes negative, runes[:cursorPos] panics. Clamp to [0,len].

-            cursorPos := m.evalState.cursorPos
+            cursorPos := m.evalState.cursorPos
+            if cursorPos < 0 {
+                cursorPos = 0
+            }
             if cursorPos > len(runes) {
                 cursorPos = len(runes)
             }
packages/tui/internal/tui/evaluation.go (1)

27-32: ScenarioType "custom" will not be accepted by the server

Python SDK/Server supports POLICY and PROMPT_INJECTION. Sending "custom" will fail validation.

 type ScenarioType string
 
 const (
     ScenarioTypePolicy ScenarioType = "policy"
-    ScenarioTypeCustom ScenarioType = "custom"
+    ScenarioTypePromptInjection ScenarioType = "prompt_injection"
 )

Optionally map any legacy "custom" to "policy" or "prompt_injection" in your UI before posting.

packages/tui/internal/tui/app.go (1)

968-985: JudgeModel "/" input not handled due to wrong condition.

Condition restricts to currentField <= 1, so case 3 (JudgeModel) never executes.

Apply:

-            if m.currentScreen == NewEvaluationScreen && m.evalState != nil && m.evalState.currentField <= 1 {
+            if m.currentScreen == NewEvaluationScreen && m.evalState != nil &&
+               (m.evalState.currentField == 0 || m.evalState.currentField == 3) {
sdks/python/rogue_sdk/sdk.py (1)

239-248: Pass agent_url as string; Pydantic will auto-coerce to HttpUrl. Add validation error handling for clearer SDK exceptions.

The code at line 258 currently constructs HttpUrl directly: evaluated_agent_url=HttpUrl(agent_url),. This should be changed to pass the string directly, letting Pydantic handle the coercion. This approach is cleaner, avoids version brittleness concerns, and prevents mypy/flake8 complications.

Apply this change at line 258:

-            evaluated_agent_url=HttpUrl(agent_url),
+            evaluated_agent_url=agent_url,

Consider wrapping the AgentConfig construction (lines 254-269) in try/except to provide clearer SDK exceptions:

+        try:
             agent_config = AgentConfig(
                 protocol=protocol,
                 transport=transport,
                 evaluated_agent_url=agent_url,
                 evaluated_agent_auth_type=auth_type,
                 evaluated_agent_credentials=auth_credentials,
                 judge_llm=judge_model,
                 judge_llm_api_key=judge_llm_api_key,
                 deep_test_mode=deep_test,
                 interview_mode=True,
                 parallel_runs=1,
                 business_context=business_context,
             )
+        except ValueError as e:
+            raise ValueError(f"Invalid agent configuration: {e}") from e
rogue/evaluator_agent/protocols/base_evaluator_agent.py (2)

109-116: Tool signature mismatch: docs vs actual function order (can break tool calls).

Docs advertise _send_message_to_evaluated_agent(message: str, context_id: str) while the actual signature is (context_id: str, message: str). This will confuse the LLM/tooling and can swap args. Align the docs with the function (or vice‑versa), but keep it consistent everywhere (including A2A/MCP classes).

Apply this doc fix to match the current code signature:

-2. `_send_message_to_evaluated_agent(message: str, context_id: str)`
+2. `_send_message_to_evaluated_agent(context_id: str, message: str)`
@@
-- `message`: The text to send to the other agent
-- `context_id`: The context ID for this conversation
+- `context_id`: The context ID for this conversation
+- `message`: The text to send to the other agent

As per coding guidelines.

Also applies to: 482-499


334-343: Guard judge LLM call with try/except to prevent hard failures.

evaluate_policy(...) can raise (network, auth, provider errors). Wrap it and return a safe failure with a reason to keep the run resilient.

-            policy_evaluation_result = evaluate_policy(
-                conversation=conversation,
-                policy=scenario.scenario,
-                model=self._judge_llm,
-                business_context=self._business_context,
-                expected_outcome=scenario.expected_outcome,
-                api_key=self._judge_llm_auth,
-            )
+            try:
+                policy_evaluation_result = evaluate_policy(
+                    conversation=conversation,
+                    policy=scenario.scenario,
+                    model=self._judge_llm,
+                    business_context=self._business_context,
+                    expected_outcome=scenario.expected_outcome,
+                    api_key=self._judge_llm_auth,
+                )
+            except Exception as e:
+                logger.exception("❌ Policy evaluation failed", extra={"scenario": scenario.scenario})
+                return False, f"Policy evaluation error: {type(e).__name__}: {e}"

As per coding guidelines.

🧹 Nitpick comments (14)
.gitignore (1)

93-96: Minor: Reduce excessive blank line spacing.

Lines 93–94 contain two consecutive blank lines for separation. Consider reducing to a single blank line to maintain consistency with the rest of the file's formatting conventions.

 bin/
-
-
 # Go TUI
 packages/tui/cmd/rogue/__debug*

The ignore pattern itself for Go TUI debug artifacts (packages/tui/cmd/rogue/__debug*) is appropriate and well-scoped to the TUI package.

examples/mcp/tshirt_store_langgraph_mcp/shirtify_agent.py (1)

68-74: Consider adding error handling for agent initialization.

The agent creation could fail due to invalid model names, missing API keys, or network issues. Per the coding guidelines, consider wrapping exception-prone code in try/except blocks.

 def __init__(self, model: str = "openai:gpt-5-mini") -> None:
     self.memory = MemorySaver()
 
+    try:
         self.graph: CompiledStateGraph = create_agent(
             model=model,
             system_prompt=AGENT_INSTRUCTIONS,
             tools=[ShirtifyAgent._inventory_tool, ShirtifyAgent._send_email_tool],
             response_format=ResponseFormat,
             checkpointer=self.memory,
         )
+    except Exception as e:
+        raise RuntimeError(f"Failed to initialize ShirtifyAgent: {e}") from e
rogue/evaluator_agent/evaluator_agent_factory.py (1)

37-50: MCP branch wiring is correct; consider defaulting transport in signature

Creation passes through all required params; MCPEvaluatorAgent handles None by defaulting to STREAMABLE_HTTP. To make this explicit at the call site, set a default in the function signature.

-def get_evaluator_agent(
-    protocol: Protocol,
-    transport: Transport | None,
+def get_evaluator_agent(
+    protocol: Protocol,
+    transport: Transport | None = None,

As per coding guidelines.

packages/tui/internal/tui/evaluation.go (3)

21-25: AuthType constant mismatch with server ("bearer_token")

Server enum is bearer_token. Align now to avoid future auth breakage when you add auth.

 const (
     AuthTypeNoAuth AuthType = "no_auth"
-    AuthTypeBearer AuthType = "bearer"
+    AuthTypeBearer AuthType = "bearer_token"
     AuthTypeBasic  AuthType = "basic"
 )

435-441: Strengthen URL validation (scheme/host) to prevent later WS/HTTP failures

url.Parse accepts many invalid inputs. Enforce http/https and non-empty host.

-    if _, err := url.Parse(serverURL); err != nil {
+    if u, err := url.Parse(serverURL); err != nil || (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
         return nil, nil, fmt.Errorf("invalid server url: %w", err)
     }
-    if _, err := url.Parse(agentURL); err != nil {
+    if u, err := url.Parse(agentURL); err != nil || (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
         return nil, nil, fmt.Errorf("invalid agent url: %w", err)
     }

209-216: Build WS URL via net/url instead of string replace

String replace can corrupt URLs (ports, prefixes, embedded substrings). Recompose safely.

-    wsURL := strings.Replace(sdk.baseURL, "http://", "ws://", 1)
-    wsURL = strings.Replace(wsURL, "https://", "wss://", 1)
-    wsURL += "/api/v1/ws/" + url.PathEscape(jobID)
+    u, err := url.Parse(sdk.baseURL)
+    if err != nil {
+        return fmt.Errorf("invalid base url: %w", err)
+    }
+    switch u.Scheme {
+    case "http":
+        u.Scheme = "ws"
+    case "https":
+        u.Scheme = "wss"
+    default:
+        return fmt.Errorf("unsupported scheme for websocket: %s", u.Scheme)
+    }
+    u.Path = strings.TrimRight(u.Path, "/") + "/api/v1/ws/" + url.PathEscape(jobID)
+    wsURL := u.String()
rogue/server/core/evaluation_orchestrator.py (1)

58-66: Log protocol/transport for observability

Include protocol and transport in the starting log to ease debugging mismatches.

         self.logger.info(
             "🎯 EvaluationOrchestrator starting evaluation",
             extra={
                 "scenario_count": len(self.scenarios.scenarios),
                 "agent_url": self.evaluated_agent_url,
                 "judge_llm": self.judge_llm,
                 "deep_test_mode": self.deep_test_mode,
+                "protocol": self.protocol.value if hasattr(self.protocol, "value") else str(self.protocol),
+                "transport": (self.transport.value if hasattr(self.transport, "value") else str(self.transport)) if self.transport else None,
             },
         )
packages/tui/internal/tui/app.go (2)

1250-1274: Dead code: references to ParallelRuns at field index 6.

UI defines fields 0–5; index 6 is unreachable. Remove or implement a field for ParallelRuns to keep indices consistent.

Apply:

-                    case 6: // ParallelRuns (special handling for numbers)
-                        if m.evalState.ParallelRuns >= 10 {
-                            m.evalState.ParallelRuns /= 10
-                            m.evalState.cursorPos--
-                        } else if m.evalState.ParallelRuns > 0 {
-                            m.evalState.ParallelRuns = 1 // Don't allow 0
-                            m.evalState.cursorPos = 0
-                        }

and

-                    case 6: // ParallelRuns (numeric only)
-                        if s[0] >= '0' && s[0] <= '9' {
-                            numStr := fmt.Sprintf("%d", m.evalState.ParallelRuns)
-                            runes := []rune(numStr)
-                            newNumStr := string(runes[:m.evalState.cursorPos]) + s + string(runes[m.evalState.cursorPos:])
-                            m.evalState.ParallelRuns = clampToInt(newNumStr)
-                            m.evalState.cursorPos++
-                        }

Also applies to: 1280-1297


1220-1234: Use rune-aware length for cursor movement.

Cursor is tracked as rune index; use len([]rune(...)) to avoid off-by with non-ASCII.

Apply:

-                case 0: // AgentURL text field
-                    fieldLen := len(m.evalState.AgentURL)
+                case 0: // AgentURL text field
+                    fieldLen := len([]rune(m.evalState.AgentURL))
...
-                case 3: // JudgeModel text field
-                    fieldLen := len(m.evalState.JudgeModel)
+                case 3: // JudgeModel text field
+                    fieldLen := len([]rune(m.evalState.JudgeModel))
rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (1)

67-108: Minor: reduce debug verbosity to avoid large logs.

_get_text_from_response logs full response; consider lowering to trace or truncating.

rogue/evaluator_agent/protocols/base_evaluator_agent.py (2)

533-537: Don’t let chat_update_callback exceptions break evaluation.

A user‑provided callback can throw and abort the run. Guard it.

-        if self._chat_update_callback:
-            self._chat_update_callback(
-                {"role": callback_role, "content": message},
-            )
+        if self._chat_update_callback:
+            try:
+                self._chat_update_callback({"role": callback_role, "content": message})
+            except Exception:
+                logger.exception("chat_update_callback failed", extra={"context_id": context_id})

As per coding guidelines.


353-361: scenario_type param in _log_evaluation is unused (other than logging).

Either use it to influence parsing/validation or drop it to reduce surface area. Keeping only **kwargs (for forward‑compat) is fine.

As per coding guidelines.

rogue/evaluator_agent/run_evaluator_agent.py (2)

248-248: Avoid asyncio.run when already in an event loop.

Calling this from an async context will raise “asyncio.run() cannot be called from a running event loop”. Add a guard with a helpful error.

-    return asyncio.run(run_evaluator_agent_task())
+    try:
+        asyncio.get_running_loop()
+        in_loop = True
+    except RuntimeError:
+        in_loop = False
+    if in_loop:
+        raise RuntimeError(
+            "run_evaluator_agent() called from within an event loop. "
+            "Use arun_evaluator_agent() instead."
+        )
+    return asyncio.run(run_evaluator_agent_task())

As per coding guidelines.


108-110: Optional: parameterize queues for stronger typing.

If you run mypy, consider Queue[dict[str, str]] for update_queue and Queue[EvaluationResults] for results_queue to tighten types.

As per coding guidelines.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec51dc and 873c4da.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
  • .github/actions/run-tests/action.yml (1 hunks)
  • .github/workflows/rogue.yml (1 hunks)
  • .github/workflows/test.yml (1 hunks)
  • .gitignore (1 hunks)
  • .vscode/launch.json (2 hunks)
  • examples/mcp/tshirt_store_langgraph_mcp/shirtify_agent.py (2 hunks)
  • lefthook.yaml (1 hunks)
  • packages/tui/internal/tui/app.go (7 hunks)
  • packages/tui/internal/tui/eval_form.go (5 hunks)
  • packages/tui/internal/tui/eval_ui.go (4 hunks)
  • packages/tui/internal/tui/evaluation.go (3 hunks)
  • pyproject.toml (1 hunks)
  • rogue/evaluator_agent/__init__.py (1 hunks)
  • rogue/evaluator_agent/evaluator_agent_factory.py (1 hunks)
  • rogue/evaluator_agent/protocols/__init__.py (1 hunks)
  • rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (1 hunks)
  • rogue/evaluator_agent/protocols/base_evaluator_agent.py (8 hunks)
  • rogue/evaluator_agent/protocols/mcp_evaluator_agent.py (1 hunks)
  • rogue/evaluator_agent/run_evaluator_agent.py (7 hunks)
  • rogue/server/core/evaluation_orchestrator.py (4 hunks)
  • rogue/server/services/evaluation_library.py (2 hunks)
  • rogue/server/services/evaluation_service.py (1 hunks)
  • rogue/server/services/scenario_evaluation_service.py (3 hunks)
  • sdks/python/rogue_sdk/__init__.py (1 hunks)
  • sdks/python/rogue_sdk/sdk.py (2 hunks)
  • sdks/python/rogue_sdk/types.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format Python code with Black
Ensure code passes flake8 linting
Run mypy with the repository configuration for static typing
Run Bandit security checks using .bandit.yaml configuration
Use isort import conventions for import ordering
Add type hints to all function signatures
Follow PEP 8 naming (snake_case for variables/functions, PascalCase for classes)
Use try/except around code that may raise exceptions

Files:

  • rogue/evaluator_agent/evaluator_agent_factory.py
  • rogue/server/services/scenario_evaluation_service.py
  • rogue/evaluator_agent/protocols/__init__.py
  • examples/mcp/tshirt_store_langgraph_mcp/shirtify_agent.py
  • rogue/server/services/evaluation_service.py
  • rogue/evaluator_agent/protocols/mcp_evaluator_agent.py
  • rogue/server/services/evaluation_library.py
  • sdks/python/rogue_sdk/sdk.py
  • rogue/evaluator_agent/__init__.py
  • sdks/python/rogue_sdk/__init__.py
  • rogue/server/core/evaluation_orchestrator.py
  • rogue/evaluator_agent/protocols/base_evaluator_agent.py
  • rogue/evaluator_agent/run_evaluator_agent.py
  • rogue/evaluator_agent/protocols/a2a_evaluator_agent.py
  • sdks/python/rogue_sdk/types.py
pyproject.toml

📄 CodeRabbit inference engine (AGENTS.md)

Manage dependencies with uv and declare them in pyproject.toml

Files:

  • pyproject.toml
🧠 Learnings (1)
📚 Learning: 2025-10-05T15:28:00.385Z
Learnt from: CR
PR: qualifire-dev/rogue#0
File: AGENTS.md:0-0
Timestamp: 2025-10-05T15:28:00.385Z
Learning: Applies to **/*.py : Format Python code with Black

Applied to files:

  • lefthook.yaml
🧬 Code graph analysis (15)
rogue/evaluator_agent/evaluator_agent_factory.py (4)
sdks/python/rogue_sdk/types.py (3)
  • Protocol (68-72)
  • Scenarios (163-181)
  • Transport (75-80)
rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (1)
  • A2AEvaluatorAgent (19-209)
rogue/evaluator_agent/protocols/base_evaluator_agent.py (1)
  • BaseEvaluatorAgent (154-548)
rogue/evaluator_agent/protocols/mcp_evaluator_agent.py (1)
  • MCPEvaluatorAgent (10-143)
rogue/server/services/scenario_evaluation_service.py (2)
sdks/python/rogue_sdk/types.py (5)
  • AuthType (23-48)
  • EvaluationResults (219-235)
  • Protocol (68-72)
  • Scenarios (163-181)
  • Transport (75-80)
rogue/evaluator_agent/run_evaluator_agent.py (2)
  • run_evaluator_agent (220-248)
  • arun_evaluator_agent (75-217)
rogue/evaluator_agent/protocols/__init__.py (2)
rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (1)
  • A2AEvaluatorAgent (19-209)
rogue/evaluator_agent/protocols/mcp_evaluator_agent.py (1)
  • MCPEvaluatorAgent (10-143)
packages/tui/internal/tui/eval_ui.go (2)
sdks/python/rogue_sdk/types.py (2)
  • Protocol (68-72)
  • Transport (75-80)
packages/tui/internal/tui/evaluation.go (1)
  • EvalScenario (46-50)
packages/tui/internal/tui/app.go (2)
packages/tui/internal/tui/eval_ui.go (2)
  • ProtocolA2A (13-13)
  • TransportHTTP (25-25)
packages/tui/internal/components/llm_config_dialog.go (1)
  • NewLLMConfigDialog (183-270)
rogue/evaluator_agent/protocols/mcp_evaluator_agent.py (3)
sdks/python/rogue_sdk/types.py (2)
  • Scenarios (163-181)
  • Transport (75-80)
rogue/evaluator_agent/protocols/base_evaluator_agent.py (3)
  • BaseEvaluatorAgent (154-548)
  • _send_message_to_evaluated_agent (483-500)
  • _add_message_to_chat_history (511-537)
rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (1)
  • _send_message_to_evaluated_agent (110-196)
sdks/python/rogue_sdk/sdk.py (1)
sdks/python/rogue_sdk/types.py (5)
  • Protocol (68-72)
  • RogueClientConfig (511-525)
  • Transport (75-80)
  • AuthType (23-48)
  • AgentConfig (86-118)
rogue/evaluator_agent/__init__.py (1)
rogue/evaluator_agent/run_evaluator_agent.py (1)
  • run_evaluator_agent (220-248)
rogue/server/core/evaluation_orchestrator.py (1)
sdks/python/rogue_sdk/types.py (2)
  • Protocol (68-72)
  • Transport (75-80)
rogue/evaluator_agent/protocols/base_evaluator_agent.py (5)
rogue/common/agent_model_wrapper.py (1)
  • get_llm_from_model (11-35)
rogue/evaluator_agent/policy_evaluation.py (1)
  • evaluate_policy (136-204)
sdks/python/rogue_sdk/types.py (5)
  • Scenarios (163-181)
  • EvaluationResults (219-235)
  • ChatHistory (192-200)
  • add_message (197-200)
  • ChatMessage (184-189)
rogue/evaluator_agent/protocols/mcp_evaluator_agent.py (1)
  • _send_message_to_evaluated_agent (74-107)
rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (1)
  • _send_message_to_evaluated_agent (110-196)
packages/tui/internal/tui/evaluation.go (2)
packages/tui/internal/tui/eval_ui.go (2)
  • Protocol (10-10)
  • Transport (17-17)
sdks/python/rogue_sdk/types.py (3)
  • Protocol (68-72)
  • Transport (75-80)
  • AuthType (23-48)
rogue/evaluator_agent/run_evaluator_agent.py (3)
sdks/python/rogue_sdk/types.py (5)
  • AuthType (23-48)
  • EvaluationResults (219-235)
  • Protocol (68-72)
  • Scenarios (163-181)
  • Transport (75-80)
rogue/common/agent_sessions.py (1)
  • create_session (8-18)
rogue/evaluator_agent/evaluator_agent_factory.py (1)
  • get_evaluator_agent (10-52)
rogue/evaluator_agent/protocols/a2a_evaluator_agent.py (4)
sdks/python/rogue_sdk/types.py (1)
  • Scenarios (163-181)
rogue/common/remote_agent_connection.py (1)
  • RemoteAgentConnections (52-130)
rogue/evaluator_agent/protocols/base_evaluator_agent.py (3)
  • BaseEvaluatorAgent (154-548)
  • _send_message_to_evaluated_agent (483-500)
  • _add_message_to_chat_history (511-537)
rogue/evaluator_agent/protocols/mcp_evaluator_agent.py (1)
  • _send_message_to_evaluated_agent (74-107)
sdks/python/rogue_sdk/types.py (4)
packages/tui/internal/tui/eval_ui.go (2)
  • Protocol (10-10)
  • Transport (17-17)
packages/tui/internal/tui/evaluation.go (1)
  • AgentConfig (34-44)
packages/sdk/src/types.ts (1)
  • AgentConfig (29-39)
rogue/models/cli_input.py (1)
  • model_post_init (67-75)
packages/tui/internal/tui/eval_form.go (1)
packages/tui/internal/styles/styles.go (3)
  • NewStyle (45-47)
  • TextMuted (22-22)
  • Primary (13-13)
🔇 Additional comments (28)
.github/workflows/rogue.yml (1)

44-44: Venv activation approach aligns with SDK installation intent.

The shift from uv run to direct venv activation ensures the locally-installed SDK (installed at lines 31-34) is used rather than triggering a PyPI reinstall. The pattern is consistent with the explicit --force-reinstall strategy in test.yml and the error-handling model (using && for command chaining).

.github/actions/run-tests/action.yml (1)

14-14: Direct pytest invocation is appropriate; relies on prior installation.

The removal of uv run --dev is correct since dev dependencies are already installed via uv sync --dev in the calling workflow (test.yml line 39). Pytest and coverage should be available in the activated venv. The && chaining ensures test failures prevent artifact uploads.

.github/workflows/test.yml (1)

36-42: Installation order correctly prioritizes local SDK over PyPI.

The consolidated installation sequence is well-structured: uv sync --dev establishes dependencies, then root + local SDK are installed after removing the PyPI version. The --force-reinstall flag ensures the local SDK takes precedence. Single run block ensures atomic execution with proper fail-fast semantics (bash exits on first error). This pattern aligns with rogue.yml's rationale for avoiding uv-driven PyPI reinstalls.

.vscode/launch.json (2)

47-54: Both launch configurations verified—paths are valid and properly configured.

The Rogue TUI path (packages/tui/cmd/rogue/main.go) exists and contains a valid Go entry point. The renamed MCP directory (examples/mcp/tshirt_store_langgraph_mcp) exists with correct configuration files, and the rename has been applied consistently throughout the codebase.


77-78: The MCP debug configuration paths are correctly configured.

The renamed directory path ./examples/mcp/tshirt_store_langgraph_mcp exists and is properly set up. The envFile path is also correctly configured—.env files are intentionally excluded from the repository (per .gitignore) and are meant to be created locally by developers based on the .env.example template. This follows the same pattern used for the tshirt_store_agent example, confirming consistency across the codebase. No issues found.

examples/mcp/tshirt_store_langgraph_mcp/shirtify_agent.py (1)

4-4: No issues found — create_agent import and API usage are correct.

The import change from create_react_agent to create_agent is valid. The function exists in langchain.agents, and the code correctly uses all parameters (model, system_prompt, tools, response_format, checkpointer) as documented.

lefthook.yaml (1)

33-33: LGTM! Black-compatible import sorting.

This ensures isort uses Black-compatible sorting, maintaining consistent formatting across the codebase. Based on coding guidelines.

rogue/server/services/evaluation_library.py (2)

58-59: LGTM! Protocol and transport properly propagated.

The protocol and transport configuration from agent_config is correctly passed to ScenarioEvaluationService, maintaining consistency with the new protocol/transport architecture.


151-152: LGTM! Consistent with evaluate_agent.

The streaming method correctly mirrors the protocol and transport propagation pattern used in the non-streaming evaluate_agent method.

rogue/server/services/scenario_evaluation_service.py (2)

12-13: LGTM! Protocol and transport properly integrated.

The service correctly accepts and stores protocol and transport configuration as private attributes, following the established naming pattern.

Also applies to: 23-24


62-63: LGTM! Consistent propagation to evaluator agent.

Protocol and transport are correctly passed to arun_evaluator_agent, completing the configuration chain from the evaluation request through to agent instantiation.

rogue/server/services/evaluation_service.py (1)

99-100: LGTM! Protocol and transport wired into orchestrator.

The configuration is correctly propagated to EvaluationOrchestrator, maintaining consistency with the library-based evaluation path in evaluation_library.py.

pyproject.toml (1)

13-13: fastmcp version 2.12.5 verified—no issues found.

Version 2.12.5 exists on PyPI and has no known security vulnerabilities according to GitHub's security advisory database. The dependency specification is valid.

rogue/evaluator_agent/__init__.py (1)

1-8: Verification passed—no breaking imports detected.

The rg searches found zero matches, confirming that no code directly imports the removed evaluator_agent module. The refactoring to protocol-based agent selection via evaluator_agent_factory and protocols exports is safe and complete.

rogue/evaluator_agent/evaluator_agent_factory.py (1)

24-36: Factory A2A branch looks correct

Arguments map cleanly to A2AEvaluatorAgent; no issues spotted.

packages/tui/internal/tui/eval_form.go (1)

174-174: Confirm field index for Start button (5) stays in sync with form fields

You now render 5 fields (0..4). Ensure upstream navigation logic caps at 5 and updates if fields change.

Would you like a quick PR guard (test) to fail if currentField exceeds the last control index?

rogue/server/core/evaluation_orchestrator.py (1)

85-96: Forwarding protocol/transport looks correct

arun_evaluator_agent now receives both; no further issues here.

sdks/python/rogue_sdk/types.py (3)

31-49: Auth header now accepts SecretStr — good change

Secret normalization avoids leaking secrets and keeps Bandit happy. LGTM.


68-81: Protocol/Transport enums added — values align with client

Lowercase values (a2a, mcp, streamable_http, sse) are consistent; no issues.


89-91: Sane defaults for MCP transport

Defaulting transport to STREAMABLE_HTTP when protocol is MCP is helpful; minimal and explicit.

Also applies to: 116-119

packages/tui/internal/tui/evaluation.go (1)

421-432: No issues found—Protocol/Transport constants are correctly defined.

Verification confirms that all Protocol and Transport string constants in packages/tui/internal/tui/eval_ui.go are properly cased (lowercase) and match server enum expectations:

  • Protocol: "a2a", "mcp"
  • Transport: "sse", "streamable_http", "http"

No casing mismatches exist.

packages/tui/internal/tui/app.go (1)

526-535: Defaults for new evaluations look sensible (A2A/HTTP).

Good initialization of AgentProtocol/AgentTransport and cursor position.

Ensure eval_form rendering reflects six fields and indices 0–5 consistently.

Also applies to: 917-927

packages/tui/internal/tui/eval_ui.go (1)

10-26: Verification confirmed: StartEvaluation signature matches perfectly. All Protocol/Transport wiring, constants, helper functions, and cycling logic are correct and properly integrated. The review comment's verification request has been satisfied.

rogue/evaluator_agent/protocols/base_evaluator_agent.py (1)

226-229: Confirm FunctionTool supports async callables for tools.

FunctionTool(func=self._send_message_to_evaluated_agent) points to an async def. If FunctionTool doesn’t await coroutines, this will return a coroutine object at runtime. Please confirm ADK supports async tool functions or wrap with a small sync shim that runs the coroutine appropriately.

As per coding guidelines.

rogue/evaluator_agent/run_evaluator_agent.py (4)

76-78: Protocol/transport surfaced in public API: good abstraction boundary.

Threading protocol and transport into the async entrypoint is clean and future‑proof.


111-124: Factory‑based evaluator creation looks correct and extensible.

Passing headers/auth, deep_test_mode, and the chat callback through the factory keeps concerns separated and enables new protocols/transports later.


126-136: Async context management for evaluator lifecycle is appropriate.

Wrapping the evaluator in async with pairs nicely with protocol‑specific resources.


21-32: Deferred import and Content/Part usage LGTM.

Local import reduces startup time and the content construction matches ADK expectations.

@yuval-qf yuval-qf force-pushed the feature/FIRE-812-mcp-tui branch from 873c4da to 8d25c5f Compare October 23, 2025 09:24
@yuval-qf yuval-qf had a problem deploying to rogue-sanity-ci-secrets October 23, 2025 09:24 — with GitHub Actions Failure
@yuval-qf yuval-qf temporarily deployed to rogue-sanity-ci-secrets October 23, 2025 09:27 — with GitHub Actions Inactive
@yuval-qf
Copy link
Collaborator Author

@coderabbitai review

@yuval-qf yuval-qf temporarily deployed to rogue-sanity-ci-secrets October 23, 2025 09:47 — with GitHub Actions Inactive
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
packages/tui/internal/tui/keyboard_controller.go (2)

131-151: Bug: “/” not insertable in JudgeModel field.

Gate uses currentField <= 1, which excludes field 3 (JudgeModel). Result: pressing “/” opens command input instead of inserting.

- if m.currentScreen == NewEvaluationScreen && m.evalState != nil && m.evalState.currentField <= 1 {
+ if m.currentScreen == NewEvaluationScreen && m.evalState != nil &&
+   (m.evalState.currentField == 0 || m.evalState.currentField == 3) {

98-124: Backspace blocked on AgentURL and unsafe at cursor 0.

The guard if m.evalState.currentField > 0 skips AgentURL (0). Also deletion when cursorPos == 0 will slice with -1.

- if m.evalState.currentField > 0 {
+ if m.evalState.currentField >= 0 {
     switch m.evalState.currentField {
     case 0: // AgentURL
-        runes := []rune(m.evalState.AgentURL)
-        if m.evalState.cursorPos <= len(runes) {
+        runes := []rune(m.evalState.AgentURL)
+        if m.evalState.cursorPos > 0 && m.evalState.cursorPos <= len(runes) {
             m.evalState.AgentURL = string(runes[:m.evalState.cursorPos-1]) + string(runes[m.evalState.cursorPos:])
             m.evalState.cursorPos--
         }
     case 3: // JudgeModel
-        runes := []rune(m.evalState.JudgeModel)
-        if m.evalState.cursorPos <= len(runes) {
+        runes := []rune(m.evalState.JudgeModel)
+        if m.evalState.cursorPos > 0 && m.evalState.cursorPos <= len(runes) {
             m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos-1]) + string(runes[m.evalState.cursorPos:])
             m.evalState.cursorPos--
         }

Also remove unreachable case 6 (ParallelRuns) or update its index to match the form.

packages/tui/internal/tui/evaluation.go (1)

351-356: Cancellation can hang; also close WebSocket.

ReadJSON blocks; calling CancelEvaluation may not unblock the reader. Close the WS in cancel to force exit and prevent goroutine leak.

- cancel := func() error {
-   return sdk.CancelEvaluation(context.Background(), jobID)
- }
+ cancel := func() error {
+   // Close WS first to unblock ReadJSON
+   _ = sdk.CloseWebSocket()
+   return sdk.CancelEvaluation(context.Background(), jobID)
+ }

Optionally set read deadlines and ping/pong handlers after connecting to improve robustness.

packages/tui/internal/tui/eval_form_controller.go (1)

98-124: Same backspace defects as keyboard controller.

  • Guard > 0 blocks AgentURL.
  • Slicing without checking cursorPos > 0 can panic.

Apply the same fix as suggested in keyboard_controller.go and delete the unreachable case 6.

- if m.evalState.currentField > 0 {
+ if m.evalState.currentField >= 0 {
   switch m.evalState.currentField {
   case 0:
-    runes := []rune(m.evalState.AgentURL)
-    if m.evalState.cursorPos <= len(runes) {
+    runes := []rune(m.evalState.AgentURL)
+    if m.evalState.cursorPos > 0 && m.evalState.cursorPos <= len(runes) {
       m.evalState.AgentURL = string(runes[:m.evalState.cursorPos-1]) + string(runes[m.evalState.cursorPos:])
       m.evalState.cursorPos--
     }
   case 3:
-    runes := []rune(m.evalState.JudgeModel)
-    if m.evalState.cursorPos <= len(runes) {
+    runes := []rune(m.evalState.JudgeModel)
+    if m.evalState.cursorPos > 0 && m.evalState.cursorPos <= len(runes) {
       m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos-1]) + string(runes[m.evalState.cursorPos:])
       m.evalState.cursorPos--
     }
-  case 6:
-    ...
   }
🧹 Nitpick comments (10)
examples/mcp/tshirt_store_langgraph_mcp/__main__.py (1)

14-20: Consider using choices= for more idiomatic Click code.

While type=click.Choice() is valid, the choices parameter is the more conventional approach for @click.option() when defining simple choice constraints.

Apply this diff for a more idiomatic implementation:

 @click.option(
     "--transport",
     "transport",
     default="streamable-http",
-    type=click.Choice(["streamable-http", "sse"]),
+    choices=["streamable-http", "sse"],
     help="Transport to use for the mcp server",
 )
rogue/evaluator_agent/base_evaluator_agent.py (2)

362-370: Tool signature vs instructions mismatch; consider updating docs.

You added scenario_type: Optional[str] and **kwargs to _log_evaluation, but AGENT_INSTRUCTIONS still list 4 params. This won’t break (arg is optional), but updating the instructions avoids confusion. Also consider logging kwargs in debug for traceability.


386-411: Type flexibility for scenario payload.

scenario: dict[str, str] is strict; you already normalize and validate with Pydantic. Consider dict[str, Any] to reflect reality and quiet type checkers when extra fields are present. No behavior change.

-    def _log_evaluation(
-        self,
-        scenario: dict[str, str],
+    def _log_evaluation(
+        self,
+        scenario: dict[str, Any],

Also applies to: 412-433

packages/tui/internal/tui/common_controller.go (1)

48-59: Paste routing: good; also strip CR and dedupe cleaner.

Currently you strip only "\n". Replace CRLF and reuse a small helper to avoid duplicating the cleanup logic in multiple branches.

- cleanText := strings.TrimSpace(strings.ReplaceAll(string(msg), "\n", ""))
+ raw := string(msg)
+ raw = strings.ReplaceAll(raw, "\r\n", "\n")
+ cleanText := strings.TrimSpace(strings.ReplaceAll(raw, "\n", ""))

Optionally extract a cleanClipboard(string) string helper and call it from both dialog and new‑evaluation branches.

packages/tui/internal/tui/keyboard_controller.go (2)

65-81: Use rune length for cursor movement; avoid byte/rune mismatch.

Right-arrow uses len(string) (bytes). Use rune slices to match insertion/deletion logic.

- case 0: // AgentURL text field
-   fieldLen := len(m.evalState.AgentURL)
+ case 0: // AgentURL text field
+   fieldLen := len([]rune(m.evalState.AgentURL))
    if m.evalState.cursorPos < fieldLen {
      m.evalState.cursorPos++
    }
...
- case 3: // JudgeModel text field
-   fieldLen := len(m.evalState.JudgeModel)
+ case 3: // JudgeModel text field
+   fieldLen := len([]rune(m.evalState.JudgeModel))
    if m.evalState.cursorPos < fieldLen {
      m.evalState.cursorPos++
    }

80-91: Defaults wiring OK. Consider centralizing field indices.

Initialization matches the new 6‑field layout. To prevent future index drift across files, define constants (e.g., FieldAgentURL = 0, FieldProtocol = 1, …) in one place and reuse.

packages/tui/internal/tui/eval_form.go (1)

230-237: Help text: add Tab hint for LLM config.

Since Tab opens the Judge LLM dialog, surface it in the footer to aid discovery.

- helpText := helpStyle.Render("t Test Server   ↑/↓ switch fields   ←/→ move cursor/cycle dropdown    Space toggle   Enter activate   Esc Back")
+ helpText := helpStyle.Render("t Test Server   ↑/↓ switch fields   ←/→ move cursor/cycle dropdown   Space toggle   Tab configure LLM   Enter activate   Esc Back")

Also applies to: 251-252

packages/tui/internal/tui/eval_form_controller.go (3)

52-63: Arrow handling: good, but switch to rune length for Right.

Left correctly guards >0. For Right, use rune length for both text fields to match insertion/deletion behavior.

- fieldLen := len(m.evalState.AgentURL)
+ fieldLen := len([]rune(m.evalState.AgentURL))
...
- fieldLen := len(m.evalState.JudgeModel)
+ fieldLen := len([]rune(m.evalState.JudgeModel))

Also applies to: 65-81


126-149: Character insertion: fine; consider rejecting control runes.

len(s)==1 lets in any printable, but might include control characters in some terminals. Optionally guard by checking unicode.IsPrint.


16-21: Avoid magic indices across files.

Define shared field index constants and reuse to reduce drift between controller, keyboard, and renderer.

Also applies to: 37-50

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 873c4da and 1595501.

📒 Files selected for processing (10)
  • .gitignore (1 hunks)
  • .vscode/launch.json (1 hunks)
  • examples/mcp/tshirt_store_langgraph_mcp/__main__.py (1 hunks)
  • packages/tui/internal/tui/common_controller.go (2 hunks)
  • packages/tui/internal/tui/eval_form.go (5 hunks)
  • packages/tui/internal/tui/eval_form_controller.go (4 hunks)
  • packages/tui/internal/tui/eval_ui.go (4 hunks)
  • packages/tui/internal/tui/evaluation.go (3 hunks)
  • packages/tui/internal/tui/keyboard_controller.go (3 hunks)
  • rogue/evaluator_agent/base_evaluator_agent.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/tui/internal/tui/eval_ui.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format Python code with Black
Ensure code passes flake8 linting
Run mypy with the repository configuration for static typing
Run Bandit security checks using .bandit.yaml configuration
Use isort import conventions for import ordering
Add type hints to all function signatures
Follow PEP 8 naming (snake_case for variables/functions, PascalCase for classes)
Use try/except around code that may raise exceptions

Files:

  • examples/mcp/tshirt_store_langgraph_mcp/__main__.py
  • rogue/evaluator_agent/base_evaluator_agent.py
🧬 Code graph analysis (5)
packages/tui/internal/tui/evaluation.go (2)
packages/tui/internal/tui/eval_ui.go (2)
  • Protocol (10-10)
  • Transport (17-17)
sdks/python/rogue_sdk/types.py (3)
  • Protocol (68-79)
  • Transport (82-93)
  • AuthType (23-48)
packages/tui/internal/tui/eval_form.go (1)
packages/tui/internal/styles/styles.go (3)
  • NewStyle (45-47)
  • TextMuted (22-22)
  • Primary (13-13)
packages/tui/internal/tui/eval_form_controller.go (1)
packages/tui/internal/components/llm_config_dialog.go (1)
  • NewLLMConfigDialog (183-270)
packages/tui/internal/tui/keyboard_controller.go (1)
packages/tui/internal/tui/eval_ui.go (3)
  • EvaluationViewState (29-56)
  • ProtocolA2A (13-13)
  • TransportHTTP (25-25)
packages/tui/internal/tui/common_controller.go (1)
packages/tui/internal/tui/eval_ui.go (3)
  • EvaluationViewState (29-56)
  • ProtocolA2A (13-13)
  • TransportHTTP (25-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: codestyle
  • GitHub Check: rogue_sanity
🔇 Additional comments (10)
.gitignore (1)

94-95: Ignore Go TUI debug artifacts

The new ignore rule cleanly keeps local Go TUI debugger files out of source control. Looks good.

.vscode/launch.json (1)

66-73: Go launch config looks good; verify Delve/cwd behavior.

Add "cwd": "${workspaceFolder}" if you see path issues, and confirm Go extension uses dlv with mode "auto" on your setup. Otherwise LGTM.

packages/tui/internal/tui/common_controller.go (1)

203-214: Defaults for Protocol/Transport: sensible.

Initializing AgentProtocol=ProtocolA2A and AgentTransport=TransportHTTP aligns with server defaults. Keep the TODO to load from user config.

packages/tui/internal/tui/keyboard_controller.go (1)

271-283: Enter handling aligns with new Start/Judge fields.

Start on 5 and open LLM config on 3 looks correct.

packages/tui/internal/tui/evaluation.go (2)

35-44: AgentConfig extension: protocol/transport JSON keys look right.

Matches server expectations (“protocol”, “transport”). Good addition.


421-469: StartEvaluation propagation: correct.

Validates URLs and populates the new fields in the request. LGTM.

packages/tui/internal/tui/eval_form.go (2)

40-84: Text field renderer: solid and rune‑safe.

Inline cursor with rune slicing is correct. Nice.


86-124: Dropdown and toggle renderers look good.

Clear focus treatment and indicators. No issues.

Also applies to: 126-161

packages/tui/internal/tui/eval_form_controller.go (2)

22-35: Field navigation: OK.

Up/Down remap to 0–5 with cursor placement at end for text fields looks correct.

Also applies to: 37-50


10-21: Health check shortcut is handy.

Spinner + async health check wiring is neat.

Comment on lines 84 to 88
case "space":
if m.evalState.currentField == 2 { // DeepTest field is now index 2
if m.evalState.currentField == 4 { // DeepTest field is now index 2
m.evalState.DeepTest = !m.evalState.DeepTest
return m, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update comment for DeepTest index.

Keep comment consistent with actual index 4.

- if m.evalState.currentField == 4 { // DeepTest field is now index 2
+ if m.evalState.currentField == 4 { // DeepTest field index
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case "space":
if m.evalState.currentField == 2 { // DeepTest field is now index 2
if m.evalState.currentField == 4 { // DeepTest field is now index 2
m.evalState.DeepTest = !m.evalState.DeepTest
return m, nil
}
case "space":
if m.evalState.currentField == 4 { // DeepTest field index
m.evalState.DeepTest = !m.evalState.DeepTest
return m, nil
}
🤖 Prompt for AI Agents
In packages/tui/internal/tui/eval_form_controller.go around lines 84 to 88, the
inline comment stating "DeepTest field is now index 2" is incorrect and should
reflect the actual index 4; update the comment to "DeepTest field is now index
4" (or remove the outdated comment) so it matches the code and avoids confusion.

Comment on lines +84 to +88
AgentProtocol: ProtocolA2A,
AgentTransport: TransportHTTP,
JudgeModel: judgeModel,
ParallelRuns: 1,
DeepTest: false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Stale comment.

Comment says “DeepTest field is now index 2” but code uses index 4. Update the comment to avoid confusion.

- if m.evalState.currentField == 4 { // DeepTest field is now index 2
+ if m.evalState.currentField == 4 { // DeepTest field index

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/tui/internal/tui/keyboard_controller.go around lines 84 to 88, the
inline comment referring to the field index for DeepTest is stale (it says
"DeepTest field is now index 2" while the code shows it at index 4); update that
comment to reflect the current index (4) or remove the misleading index mention
so it no longer conflicts with the actual struct field order.

@yuval-qf yuval-qf merged commit 419d3c5 into main Oct 23, 2025
9 checks passed
@yuval-qf yuval-qf deleted the feature/FIRE-812-mcp-tui branch October 23, 2025 21:19
ignorant05 pushed a commit to ignorant05/rogue that referenced this pull request Oct 24, 2025
* Add protocol and transport settings to tui

* Fix mcp example command

* Fix log_evaluation kwargs
@coderabbitai coderabbitai bot mentioned this pull request Nov 10, 2025
21 tasks
@drorIvry drorIvry mentioned this pull request Nov 10, 2025
21 tasks
@coderabbitai coderabbitai bot mentioned this pull request Nov 20, 2025
21 tasks
This was referenced Dec 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants