-
Notifications
You must be signed in to change notification settings - Fork 146
feature/python ts entrypoint #144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds a Python protocol that runs user-supplied Python entrypoint modules (call_agent), with CLI/TUI/SDK/server wiring, new Python evaluator and red-team agents, config persistence, UI form changes, and protocol-aware request/evaluation flow. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant CLI_TUI as CLI / TUI
participant Validator as ping_python_entrypoint
participant SDK as RogueSDK
participant Orchestrator as EvaluationOrchestrator
participant Agent as PythonEvaluatorAgent
participant Module as User Python Module
User->>CLI_TUI: provide python_entrypoint_file
CLI_TUI->>Validator: validate file & check call_agent
Validator-->>CLI_TUI: validation result
CLI_TUI->>SDK: run_evaluation(python_entrypoint_file)
SDK->>Orchestrator: create orchestrator (python_entrypoint_file)
Orchestrator->>Agent: arun_evaluator_agent(..., python_entrypoint_file)
Agent->>Agent: __aenter__ → _load_python_module()
Agent->>Module: import module, bind call_agent
Module-->>Agent: call_agent available
loop interactions
Agent->>Agent: _send_message_to_evaluated_agent(messages)
Agent->>Module: invoke call_agent(messages[, context_id])
Module-->>Agent: string response (sync/async)
Agent-->>Orchestrator: evaluation event / result
end
Agent->>Agent: __aexit__ cleanup
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks❌ Failed checks (2 warnings, 1 inconclusive)
📜 Recent review detailsConfiguration used: Repository UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
rogue/server/websocket/manager.py (1)
46-91: Replace WebSocket ping implementation or clarify it as application-level heartbeat.FastAPI/Starlette WebSocket API does not provide a method to send protocol-level ping/pong frames. The
send_bytes(b"ping")at line 69 sends binary application data, not a WebSocket control frame. This means the keepalive mechanism won't prevent timeout errors at the protocol level as intended. Either:
- Document this as an application-level heartbeat and verify the gorilla/websocket client expects this pattern
- Switch to a WebSocket library that supports real ping/pong frames (e.g.,
websocketslibrary)- Update the docstring and comments to remove references to "WebSocket ping frames" and "protocol-level" ping/pong
Also remove or correct the misleading comment at line 66: "FastAPI/Starlette WebSocket uses send() with bytes for ping" — this is not accurate.
rogue/evaluator_agent/run_evaluator_agent.py (1)
270-300: Addpython_entrypoint_fileparameter to the synchronous wrapper.The synchronous wrapper
run_evaluator_agentdoesn't accept thepython_entrypoint_fileparameter, making it unusable for Python protocol evaluations. This creates an inconsistency in the API surface where the async version supports Python protocol but the sync version does not.🔎 Proposed fix
def run_evaluator_agent( protocol: Protocol, transport: Optional[Transport], evaluated_agent_url: str, auth_type: AuthType, auth_credentials: Optional[str], judge_llm: str, judge_llm_api_key: Optional[str], scenarios: Scenarios, business_context: str, deep_test_mode: bool, + python_entrypoint_file: Optional[str] = None, ) -> EvaluationResults: """Synchronous wrapper for arun_evaluator_agent.""" async def run_evaluator_agent_task(): async for update_type, data in arun_evaluator_agent( protocol=protocol, transport=transport, evaluated_agent_url=evaluated_agent_url, auth_type=auth_type, auth_credentials=auth_credentials, judge_llm=judge_llm, judge_llm_api_key=judge_llm_api_key, scenarios=scenarios, business_context=business_context, deep_test_mode=deep_test_mode, + python_entrypoint_file=python_entrypoint_file, ): if update_type == "results": return datarogue/run_cli.py (1)
597-600: Error message may be confusing for Python protocol.When using Python protocol,
cli_input.evaluated_agent_urlcould beNone, producing a confusing error message. Consider usingevaluated_agent_identifierinstead.🔎 Suggested fix
if not results: raise ValueError( - f"No scenarios were evaluated for {cli_input.evaluated_agent_url}", + f"No scenarios were evaluated for {evaluated_agent_identifier}", )
🧹 Nitpick comments (12)
README.md (1)
84-98: Document error handling and invocation contract forcall_agent.The function signature and basic example are clear, but the documentation should clarify:
- Error handling: What happens if
call_agentraises an exception?- Invocation contract: Is
call_agentcalled once per user message or once per evaluation session?- State management: Should the function maintain state between calls, or is each call independent?
- Timeout/performance: Are there timeout constraints or expected response times?
Consider adding these details to prevent misunderstandings when users implement their agents.
rogue/server/websocket/manager.py (1)
111-116: Simplify the connection check.The length check is redundant because
disconnect()(line 104) removes thejob_idkey when the list becomes empty. If the key exists in the dictionary, it will always have at least one connection.🔎 Simplified implementation
def has_connections(self, job_id: str) -> bool: """Check if there are any active WebSocket connections for a job.""" - return ( - job_id in self.active_connections - and len(self.active_connections[job_id]) > 0 - ) + return job_id in self.active_connectionsexamples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py (1)
51-52: Consider defensive access for message content.If a malformed message dict without a
"content"key is passed, this will raise aKeyError. While the contract specifies messages should have'role'and'content'keys, defensive access could provide a clearer error:🔎 Optional defensive fix
# Get the latest user message - latest_message = messages[-1]["content"] + latest_message = messages[-1].get("content", "")rogue/evaluator_agent/red_team/base_red_team_attacker_agent.py (1)
202-218: Consider log level for message content.The
logger.infocalls at lines 202 and 213 log message content previews. While helpful for debugging, INFO-level logging of attack messages and responses could be verbose in production and might inadvertently log sensitive content.🔎 Optional: Use DEBUG level for message content
# Queue outgoing message (attack) - use "Rogue" role for TUI display - logger.info(f"📤 Queueing outgoing attack message: {message[:50]}...") + logger.debug(f"📤 Queueing outgoing attack message: {message[:50]}...") await update_queue.put( ChatUpdate(role="Rogue", content=message), ) # Yield control to allow queue consumer to process await asyncio.sleep(0) # Send the actual message response = await self._send_message_to_evaluated_agent(message, session_id) # Queue response - use "Agent Under Test" role for TUI display - logger.info(f"📥 Queueing agent response: {response[:50]}...") + logger.debug(f"📥 Queueing agent response: {response[:50]}...")The
asyncio.sleep(0)calls are a good pattern for yielding control to allow the queue consumer to process updates in a timely manner.packages/tui/internal/tui/keyboard_controller.go (1)
162-187: Consider refactoring to reduce code duplication.The cursor clamping and string insertion logic is duplicated between the Python and non-Python branches. Extracting this into a helper function would improve maintainability.
🔎 Suggested refactor
// Helper function to insert character at cursor position func insertCharAtCursor(s string, cursorPos *int, char string) string { runes := []rune(s) // Clamp cursor position to valid range if *cursorPos > len(runes) { *cursorPos = len(runes) } result := string(runes[:*cursorPos]) + char + string(runes[*cursorPos:]) (*cursorPos)++ return result } // Then in the switch case: case EvalFieldAgentURL: // Use Python file or Agent URL based on protocol if m.evalState.AgentProtocol == ProtocolPython { m.evalState.PythonEntrypointFile = insertCharAtCursor( m.evalState.PythonEntrypointFile, &m.evalState.cursorPos, s, ) } else { m.evalState.AgentURL = insertCharAtCursor( m.evalState.AgentURL, &m.evalState.cursorPos, s, ) } // Save config after text change go saveUserConfig(...)packages/tui/internal/screens/evaluations/form_view.go (1)
283-299: Consider using constants for field indices.The hardcoded field indices (0, 1, 2, 3, 4, 5) make the code fragile and harder to maintain. If field ordering changes or new fields are added, these magic numbers would need to be updated in multiple places.
Consider defining constants or using the
FormFieldenum values referenced elsewhere in the codebase.Example approach
// Define constants at package level or reuse existing FormField enum const ( FieldIndexProtocol = 0 FieldIndexAgentURLOrPython = 1 FieldIndexTransport = 2 FieldIndexJudgeLLM = 3 FieldIndexDeepTest = 4 FieldIndexMode = 5 ) // Then use them in rendering: renderDropdownField(FieldIndexProtocol, "Protocol:", protocol), renderTextField(FieldIndexAgentURLOrPython, "Python File:", pythonFile), // ...rogue/server/services/red_team_service.py (1)
107-124: Consider exponential backoff for WebSocket connection wait.The WebSocket wait logic uses a fixed 0.1s interval in a busy-wait loop. While functional, exponential backoff would be more efficient and reduce CPU usage during the wait period.
🔎 Proposed refactor with exponential backoff
- # Wait for WebSocket client to connect before sending updates - # This ensures chat messages aren't lost - max_wait = 5.0 # seconds - wait_interval = 0.1 - waited = 0.0 - while waited < max_wait: - if self.websocket_manager.has_connections(job_id): - logger.info( - f"WebSocket client connected for job {job_id}, starting scan", - ) - break - await asyncio.sleep(wait_interval) - waited += wait_interval + # Wait for WebSocket client with exponential backoff + max_wait = 5.0 # seconds + wait_interval = 0.05 + max_interval = 0.5 + waited = 0.0 + while waited < max_wait: + if self.websocket_manager.has_connections(job_id): + logger.info( + f"WebSocket client connected for job {job_id}, starting scan", + ) + break + await asyncio.sleep(wait_interval) + waited += wait_interval + wait_interval = min(wait_interval * 1.5, max_interval) else: logger.warning( f"No WebSocket client connected after {max_wait}s, " f"proceeding anyway for job {job_id}", )packages/tui/internal/tui/form_controller.go (1)
220-258: Consider extracting text field handling into a helper.The backspace handling has duplicated logic for
PythonEntrypointFileandAgentURLwith identical patterns (cursor clamping, string manipulation, cursor decrement, config save). The same pattern appears in the default case for character insertion (lines 279-303).🔎 Potential helper function extraction
A helper function could reduce duplication:
// Helper to handle text field editing func (m *Model) editTextField(field *string, cursorPos *int, op func(runes []rune, pos int) ([]rune, int)) { runes := []rune(*field) if *cursorPos > len(runes) { *cursorPos = len(runes) } newRunes, newPos := op(runes, *cursorPos) *field = string(newRunes) *cursorPos = newPos }This is optional and can be deferred to a later refactor.
rogue/evaluator_agent/evaluator_agent_factory.py (1)
76-101: LGTM! Python protocol handling in factory.The factory correctly:
- Validates
python_entrypoint_fileis provided for PYTHON protocol- Creates
PythonEvaluatorAgentwith the correct parameters- Uses debug logging for troubleshooting
Minor suggestion: The
from loguru import loggerimport at line 78 could be moved to module level (line 11) for consistency with other factory files, but this is a stylistic preference.packages/tui/internal/tui/utils.go (1)
594-607: Variadic parameter for optional value is unconventional.Using
pythonEntrypointFile ...stringfor a single optional parameter works but is unusual in Go. More idiomatic approaches would be:
- Using a pointer:
pythonEntrypointFile *string- Using an options struct
- Using empty string as "not provided" indicator
However, the current approach is functional and doesn't cause issues.
packages/tui/internal/tui/eval_types.go (1)
142-144: Consider logging errors for debuggability.The
os.Getwd()error is silently ignored. While returning an empty config is acceptable graceful degradation, logging this failure would help users debug config loading issues.🔎 Suggested improvement
func loadUserConfigFromWorkdir() UserConfigFromFile { - wd, _ := os.Getwd() + wd, err := os.Getwd() + if err != nil { + return UserConfigFromFile{} + } dir := wdrogue/run_cli.py (1)
481-486: Consider movingastimport to module level.Per PEP 8 and isort conventions, imports should typically be at module level. Since
astis a standard library module with no overhead, it can be moved to the top.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
README.mdexamples/python_entrypoint_stub.pyexamples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.pypackages/tui/internal/screens/evaluations/form_view.gopackages/tui/internal/screens/evaluations/types.gopackages/tui/internal/tui/common_controller.gopackages/tui/internal/tui/eval_form_view.gopackages/tui/internal/tui/eval_types.gopackages/tui/internal/tui/form_controller.gopackages/tui/internal/tui/keyboard_controller.gopackages/tui/internal/tui/utils.gorogue/evaluator_agent/base_evaluator_agent.pyrogue/evaluator_agent/evaluator_agent_factory.pyrogue/evaluator_agent/python/__init__.pyrogue/evaluator_agent/python/python_evaluator_agent.pyrogue/evaluator_agent/red_team/base_red_team_attacker_agent.pyrogue/evaluator_agent/red_team/factory.pyrogue/evaluator_agent/red_team/python_red_team_attacker_agent.pyrogue/evaluator_agent/run_evaluator_agent.pyrogue/models/cli_input.pyrogue/run_cli.pyrogue/server/core/evaluation_orchestrator.pyrogue/server/core/red_team_orchestrator.pyrogue/server/services/evaluation_service.pyrogue/server/services/red_team_service.pyrogue/server/websocket/manager.pysdks/python/rogue_sdk/sdk.pysdks/python/rogue_sdk/types.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Useblackfor code formatting
Use type hints for all function signatures
Follow PEP 8 naming conventions (snake_case for variables and functions, PascalCase for classes)
Use try/except blocks for code that may raise exceptions
Useisortconventions for import organization
Files:
rogue/evaluator_agent/python/__init__.pyexamples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.pyrogue/evaluator_agent/run_evaluator_agent.pysdks/python/rogue_sdk/sdk.pyrogue/server/services/evaluation_service.pyrogue/evaluator_agent/base_evaluator_agent.pyrogue/server/core/evaluation_orchestrator.pysdks/python/rogue_sdk/types.pyrogue/evaluator_agent/red_team/base_red_team_attacker_agent.pyrogue/evaluator_agent/red_team/python_red_team_attacker_agent.pyrogue/run_cli.pyexamples/python_entrypoint_stub.pyrogue/models/cli_input.pyrogue/server/websocket/manager.pyrogue/server/core/red_team_orchestrator.pyrogue/evaluator_agent/evaluator_agent_factory.pyrogue/evaluator_agent/python/python_evaluator_agent.pyrogue/evaluator_agent/red_team/factory.pyrogue/server/services/red_team_service.py
🧬 Code graph analysis (16)
rogue/evaluator_agent/python/__init__.py (1)
rogue/evaluator_agent/python/python_evaluator_agent.py (1)
PythonEvaluatorAgent(20-283)
packages/tui/internal/tui/form_controller.go (2)
packages/tui/internal/tui/eval_types.go (9)
ProtocolPython(17-17)EvalFieldTransport(48-48)EvalFieldAgentURL(47-47)EvaluationMode(31-31)EvalFieldEvaluationMode(51-51)EvalFieldScanType(59-59)EvaluationModeRedTeam(35-35)EvalFieldProtocol(46-46)EvalFieldJudgeModel(49-49)sdks/python/rogue_sdk/types.py (1)
EvaluationMode(58-62)
packages/tui/internal/tui/keyboard_controller.go (2)
packages/tui/internal/tui/eval_types.go (2)
ProtocolPython(17-17)EvaluationMode(31-31)sdks/python/rogue_sdk/types.py (1)
EvaluationMode(58-62)
examples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py (1)
examples/python_entrypoint_stub.py (1)
call_agent(22-85)
sdks/python/rogue_sdk/sdk.py (1)
sdks/python/rogue_sdk/types.py (4)
Scenarios(393-411)AgentConfig(284-348)get_default_transport(82-90)EvaluationRequest(743-770)
rogue/evaluator_agent/base_evaluator_agent.py (1)
sdks/python/rogue_sdk/types.py (1)
is_valid_for_protocol(103-104)
sdks/python/rogue_sdk/types.py (1)
rogue/models/cli_input.py (1)
check_protocol_requirements(68-92)
rogue/evaluator_agent/red_team/base_red_team_attacker_agent.py (2)
rogue/evaluator_agent/red_team/models.py (1)
ChatUpdate(33-40)rogue/evaluator_agent/red_team/python_red_team_attacker_agent.py (1)
_send_message_to_evaluated_agent(160-218)
packages/tui/internal/tui/utils.go (2)
packages/tui/internal/tui/eval_types.go (5)
Protocol(12-12)Transport(20-20)ProtocolPython(17-17)EvaluationMode(31-31)RedTeamConfig(71-78)sdks/python/rogue_sdk/types.py (6)
Protocol(75-90)Transport(93-104)AuthType(23-48)AgentConfig(284-348)EvaluationMode(58-62)RedTeamConfig(151-179)
rogue/run_cli.py (1)
sdks/python/rogue_sdk/types.py (3)
Transport(93-104)Protocol(75-90)AuthType(23-48)
examples/python_entrypoint_stub.py (1)
examples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py (1)
call_agent(24-57)
packages/tui/internal/tui/eval_form_view.go (2)
packages/tui/internal/tui/eval_types.go (4)
Protocol(12-12)Transport(20-20)EvaluationMode(31-31)ScanType(62-62)sdks/python/rogue_sdk/types.py (5)
Protocol(75-90)Transport(93-104)Scenarios(393-411)EvaluationMode(58-62)ScanType(143-148)
rogue/evaluator_agent/evaluator_agent_factory.py (2)
rogue/evaluator_agent/python/python_evaluator_agent.py (1)
PythonEvaluatorAgent(20-283)sdks/python/rogue_sdk/types.py (3)
Protocol(75-90)Transport(93-104)Scenarios(393-411)
packages/tui/internal/tui/eval_types.go (2)
sdks/python/rogue_sdk/types.py (2)
Protocol(75-90)Transport(93-104)packages/tui/internal/screens/redteam/catalog.go (4)
GetBasicScanVulnerabilities(387-402)GetBasicScanAttacks(406-414)GetFreeVulnerabilities(417-425)GetFreeAttacks(428-436)
rogue/server/services/red_team_service.py (2)
rogue/server/websocket/manager.py (1)
has_connections(111-116)rogue/server/core/red_team_orchestrator.py (1)
RedTeamOrchestrator(29-164)
packages/tui/internal/screens/evaluations/types.go (2)
packages/tui/internal/tui/eval_types.go (4)
Protocol(12-12)Transport(20-20)EvaluationMode(31-31)ScanType(62-62)sdks/python/rogue_sdk/types.py (4)
Protocol(75-90)Transport(93-104)EvaluationMode(58-62)ScanType(143-148)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: rogue_sanity
🔇 Additional comments (63)
README.md (1)
74-74: All documented Python entrypoint features are correctly implemented.Verification confirms:
- Example file exists at
./examples/python_entrypoint_stub.pywith correct usage documentation- CLI flag
--python-entrypoint-fileis properly implemented inrogue/run_cli.pyand integrated throughout CLI, SDK, and service layers- Function signature
call_agent(messages: list[dict[str, Any]]) -> strmatches implementation in all example files- PYTHON protocol is properly integrated with validation and support across evaluation and red team agents
No issues found. Documentation is accurate and complete.
rogue/evaluator_agent/python/__init__.py (1)
1-5: LGTM!Clean package initialization following Python conventions. The
__all__declaration properly controls the public API.rogue/evaluator_agent/base_evaluator_agent.py (1)
180-184: LGTM!The guard correctly handles the Python protocol case where
transportisNone. This prevents anAttributeErrorfrom callingis_valid_for_protocol()onNonewhile maintaining validation for protocols that do use transports.examples/python_entrypoint_stub.py (1)
1-117: LGTM!Well-documented example file with clear usage instructions and integration examples. The code handles the empty messages case correctly with the ternary operator and follows Python type hint conventions.
rogue/evaluator_agent/python/python_evaluator_agent.py (5)
31-75: LGTM!Constructor properly initializes the agent with Python-specific settings, passing
transport=NoneandProtocol.PYTHONto the base class. The initialization of internal state (_module,_call_agent_fn,_session_histories,_added_sys_path) is well-organized.
104-112: Thread-safety consideration for sys.path modification.Modifying
sys.pathis not thread-safe. If multiplePythonEvaluatorAgentinstances run concurrently with different entrypoint directories, they could interfere with each other's import resolution.For this use case (evaluation tooling, likely single-threaded), this is acceptable. If concurrent evaluation becomes a requirement, consider using
importlib.util.spec_from_file_locationwith explicit submodule handling or process isolation.
147-171: LGTM!Proper async context management with cleanup of
sys.pathand session histories. Thetry/except ValueErrorforsys.path.removehandles the edge case where the path was already removed.
173-258: LGTM!The dual history tracking serves distinct purposes:
_context_id_to_chat_history(inherited) for evaluation logging and_session_historiesfor the call_agent function. The error handling properly catches exceptions and returns a structured error response while logging details.
260-283: LGTM!Clean handling of both synchronous and asynchronous
call_agentfunctions usingasyncio.iscoroutine. The string coercion at line 281 ensures a consistent return type regardless of what the user's function returns.packages/tui/internal/screens/evaluations/types.go (2)
9-10: LGTM!The dual-purpose comment for
FormFieldAgentURLclearly documents its use for both the agent URL and Python entrypoint file path depending on the protocol. This is a pragmatic design that avoids proliferating form field indices.
31-45: LGTM!The
FormStatestruct cleanly separatesAgentURLandPythonEntrypointFileas distinct fields while the editing state comments (lines 43-45) properly document how field navigation changes based on protocol and evaluation mode.rogue/evaluator_agent/red_team/base_red_team_attacker_agent.py (1)
172-173: LGTM!Debug-level logging for yielded updates helps trace the message flow during red team execution without being too verbose.
packages/tui/internal/tui/eval_form_view.go (1)
28-51: LGTM!The addition of
PythonEntrypointFileto theFormStateis consistent with the Python protocol support being introduced. The field mapping is straightforward and correct.rogue/server/services/evaluation_service.py (2)
119-136: LGTM! Python protocol correctly handles missing URL.The change to return an empty string when
evaluated_agent_urlis None aligns with the Python protocol workflow, where no network URL is required (using the entrypoint file instead). The threading ofpython_entrypoint_filethrough to the orchestrator is implemented correctly.
145-149: Remove unnecessary None check on protocol field.The
agent_config.protocolfield is not optional and always has a value (defaults toProtocol.A2A). TheAgentConfigclass validates protocol requirements via@model_validator, ensuring protocol is neverNone. The ternary expressionagent_config.protocol.value if agent_config.protocol else Noneis unnecessary—simplify toagent_config.protocol.value.Likely an incorrect or invalid review comment.
rogue/server/core/evaluation_orchestrator.py (2)
31-61: LGTM! Clean parameter threading.The addition of
python_entrypoint_fileparameter is well-typed and consistently threaded through the orchestrator. The type hintstr | None = Nonecorrectly indicates this is an optional parameter for Python protocol workflows.
82-83: Good defensive coding for protocol logging.Using
self.protocol.value if self.protocol else Noneprevents potentialAttributeErrorif protocol is somehowNone. This is consistent with the defensive coding pattern seen in the evaluation service.packages/tui/internal/screens/evaluations/form_view.go (1)
275-300: No action needed—form navigation correctly handles the field index gap.The navigation logic in
form_controller.goexplicitly detects and skips the Transport field (index 2) when Python protocol is selected, with matching skip conditions in both up/down navigation (lines 23–28 and 49–55). The implementation uses enum-based field identification (EvalFormField) rather than relying on contiguous indices, making the skip logic type-safe and maintainable. The form rendering inform_view.goalso correctly omits Transport for Python protocol. This design is intentional and working as documented.Likely an incorrect or invalid review comment.
rogue/server/core/red_team_orchestrator.py (2)
59-59: LGTM! Python entrypoint support properly threaded through.The
python_entrypoint_fileparameter is correctly added to the constructor, stored as an instance variable, and passed to the attacker agent factory. This enables Python protocol support for red team scans.Also applies to: 79-79, 102-102
144-147: LGTM! Improved chat update visibility.Elevating ChatUpdate logging from debug to info level with structured messages improves observability during red team scans.
rogue/server/services/red_team_service.py (1)
150-150: LGTM! Python entrypoint support and enhanced logging.The
python_entrypoint_fileparameter is correctly passed to the orchestrator, and the chat update logging enhancement improves debugging capabilities.Also applies to: 162-165
rogue/models/cli_input.py (3)
15-19: LGTM! Field types correctly updated for Python protocol support.The field type changes appropriately make
transportandevaluated_agent_urloptional (None for PYTHON protocol) and introduce the newpython_entrypoint_filefield. This aligns with the protocol-specific requirements enforced by the validator below.
67-92: LGTM! Comprehensive protocol-specific validation.The
check_protocol_requirementsvalidator properly enforces protocol-specific constraints:
- For PYTHON protocol: requires
python_entrypoint_fileand validates file existence- For other protocols: requires
evaluated_agent_urlandtransportThe validation runs after auth and red-team validators, which is the correct order. File existence checks prevent runtime errors downstream.
107-107: LGTM! PartialCLIInput updated consistently.The
python_entrypoint_filefield is correctly added toPartialCLIInputwith proper typing and default value.sdks/python/rogue_sdk/sdk.py (3)
242-244: LGTM! API signature correctly updated for Python protocol support.The parameter changes appropriately make
agent_urlandscenariosoptional while addingpython_entrypoint_file. This enables the SDK to support Python protocol evaluations where no agent URL is needed.Also applies to: 257-257
259-282: LGTM! Comprehensive docstring documentation.The docstring thoroughly documents all parameters including the new
python_entrypoint_file, clearly explaining when each parameter is required based on the protocol type. This provides excellent guidance for SDK users.
286-286: LGTM! Proper conditional handling for Python protocol.The conditional logic correctly handles:
- Line 286: Creates
HttpUrlonly whenagent_urlis provided, otherwise None- Line 296: Passes
python_entrypoint_filetoAgentConfig- Line 301: Safely accesses
scenarios.scenarioswith fallback to empty listThe
AgentConfigvalidation will catch any protocol requirement violations, making this implementation safe.Also applies to: 296-296, 301-301
packages/tui/internal/tui/common_controller.go (2)
41-77: LGTM! Protocol-aware paste handling.The paste handling correctly distinguishes between Python protocol (routing to
PythonEntrypointFile) and other protocols (routing toAgentURL). The cursor advancement and config persistence after paste are properly implemented.
250-365: LGTM! Comprehensive NewEvaluationScreen initialization with config loading.The initialization flow properly:
- Loads user config and applies defaults
- Handles Python entrypoint file persistence
- Loads red team config state and applies presets when no selections exist
- Caches the redTeamConfigState for later use
The preset application logic for scan types (basic/full) is well-structured.
sdks/python/rogue_sdk/types.py (3)
80-90: LGTM! Python protocol enum and default transport handling.The
Protocol.PYTHONenum value and theget_default_transport()returningNonefor Python protocol are correctly implemented. The# type: ignorecomment is appropriate sinceNoneis not aTransporttype but is semantically correct for this protocol.
331-344: Validator doesn't check file existence (by design).Unlike the
CLIInput.check_protocol_requirementsvalidator inrogue/models/cli_input.py(lines 67-91), this validator only checks for the presence ofpython_entrypoint_filebut doesn't validate that the file exists or is a valid file. This is acceptable since:
- The SDK types are for API serialization/deserialization
- File existence validation happens at runtime in the evaluator agent classes
This design is consistent with separation of concerns.
836-849: LGTM! RedTeamRequest protocol validation.The validator correctly enforces that
python_entrypoint_fileis required for PYTHON protocol andevaluated_agent_urlis required for non-PYTHON protocols, mirroring the AgentConfig validation logic.packages/tui/internal/tui/form_controller.go (2)
25-28: LGTM! Transport field skipping for Python protocol.The navigation logic correctly skips the Transport field when Python protocol is selected, since Python protocol doesn't use network transport.
Also applies to: 52-55
82-127: LGTM! Config persistence on field changes.The asynchronous
saveUserConfigcalls after protocol, transport, evaluation mode, and scan type changes ensure user preferences are persisted without blocking the UI.rogue/evaluator_agent/evaluator_agent_factory.py (1)
103-113: LGTM! A2A/MCP protocol validation.The validation ensures
evaluated_agent_addressis provided for non-Python protocols, andscenariosdefaults to an emptyScenarios()if not provided.rogue/evaluator_agent/red_team/factory.py (2)
15-36: LGTM! TypedDict for type safety.The
_CommonKwargsTypedDict provides clear documentation of the expected keyword arguments and improves type checking for the factory function.
135-151: LGTM! Protocol-specific agent creation.The factory correctly:
- Validates URL requirement for A2A and MCP protocols
- Validates
python_entrypoint_filerequirement for PYTHON protocol- Passes
python_entrypoint_fileas additional kwarg toPythonRedTeamAttackerAgentNote:
common_kwargsincludesevaluated_agent_addresswhich will beNonefor PYTHON protocol, butPythonRedTeamAttackerAgenthandles this correctly by usingpython_entrypoint_fileas the file path (see lines 52-57 inpython_red_team_attacker_agent.py).rogue/evaluator_agent/red_team/python_red_team_attacker_agent.py (5)
17-84: LGTM! Well-structured constructor with validation.The constructor correctly:
- Accepts either
python_entrypoint_fileor falls back toevaluated_agent_address- Validates that at least one path is provided
- Initializes instance state for module loading and session history tracking
86-142: Dynamic module loading with proper validation.The module loading implementation:
- Validates file existence and type
- Adds parent directory to
sys.pathfor local imports (tracks for cleanup)- Uses
importlib.utilfor safe dynamic loading- Validates
call_agentfunction exists and is callableSecurity note: Dynamic module loading executes arbitrary code from the user-provided file. This is by design for the Python entrypoint feature, but ensure documentation warns users to only use trusted entrypoint files.
144-158: LGTM! Proper async context management.The
__aenter__loads the module and__aexit__properly cleans up:
- Clears session histories
- Removes the added path from
sys.path- Calls parent's
__aexit__for base class cleanup
160-218: Error handling returns string instead of raising exception.The
_send_message_to_evaluated_agentmethod catches all exceptions and returnsf"Error: {str(e)}"instead of propagating the exception. This is intentional for the red team flow where attack attempts should not crash the evaluation, but be aware that:
- Error details are visible in the response
- The red team orchestrator needs to handle these error responses appropriately
220-243: LGTM! Handles both sync and async call_agent.The
_invoke_call_agentmethod correctly handles both synchronous and asynchronouscall_agentimplementations by checkingasyncio.iscoroutine()and awaiting if necessary. Result is coerced to string for consistent return type.packages/tui/internal/tui/utils.go (3)
34-40: LGTM! AgentConfig struct updates for Python protocol.The struct correctly adds:
omitemptytoEvaluatedAgentURLandEvaluatedAgentTransport(optional for Python protocol)- New
PythonEntrypointFilefield withomitemptyThis aligns with the Python SDK's
AgentConfigwhere these fields are optional based on protocol.
161-170: LGTM! Protocol-aware red team request construction.The CreateEvaluation correctly populates either:
python_entrypoint_filefor Python protocolevaluated_agent_url,evaluated_agent_transport, and credentials for other protocolsThis ensures the API request matches the server's expectations based on protocol.
644-670: LGTM! Protocol-specific AgentConfig population.The code correctly:
- Sets
PythonEntrypointFilefor Python protocol (lines 663-665)- Omits transport for Python protocol (line 666 comment)
- Sets
EvaluatedAgentURLandEvaluatedAgentTransportfor other protocols (lines 668-669)This ensures the request payload matches what the server expects for each protocol type.
packages/tui/internal/tui/eval_types.go (10)
14-18: LGTM!Protocol constants are well-aligned with the Python SDK's
Protocolenum values (a2a, mcp, python).
41-48: LGTM!Comments clearly document the dual-use of
EvalFieldAgentURLfor both agent URL and Python entrypoint file based on protocol context.
80-123: LGTM!The
EvaluationViewStatestruct is well-documented with clear comments explaining the form field layout for both Policy and Red Team modes.
131-139: LGTM!The
UserConfigFromFilestruct provides clear JSON serialization tags for config persistence.
282-296: LGTM!The
PythonEntrypointFileis correctly threaded through to theStartEvaluationcall.
346-349: LGTM!
ProtocolPythonis correctly added to the available protocols list.
358-359: LGTM!Correctly returns an empty transport list for Python protocol, aligning with the SDK's behavior where Python doesn't use network transport.
388-404: LGTM!Good state management when cycling protocols - correctly resets transport and cursor position based on the target protocol.
524-530: LGTM!Safe nil checks with a sensible default to
ScanTypeBasic.
532-581: LGTM!The preset application logic correctly handles all scan types:
- Basic and Full apply their respective presets
- Custom preserves user selections
- State is properly synchronized between config objects
rogue/run_cli.py (8)
144-150: LGTM!The CLI argument is well-documented with clear help text explaining when it's required.
164-181: LGTM!Function signature correctly updated with optional types for
transportandpython_entrypoint_file.
214-231: LGTM!Signature changes are consistent with the outer
run_scenariosfunction.
246-263: LGTM!Correctly passes
Noneforagent_urlwhen using Python protocol and converts the entrypoint path to string for SDK consumption.
506-527: LGTM!Clean separation of Python protocol handling from network-based protocols with clear validation error messages.
546-561: LGTM!Correctly handles
agent_urlas optional for Python protocol and passes through the entrypoint file to validation.
571-577: LGTM!Good abstraction to create a consistent identifier across protocols for tracking/logging.
579-596: LGTM!Parameters are correctly threaded through to
run_scenarios.
| # Use hash of conversation history length as session key for consistency | ||
| session_key = len(messages) | ||
| if session_key not in _session_cache: | ||
| _session_cache[session_key] = uuid4().hex | ||
|
|
||
| session_id = _session_cache[session_key] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Session key collision risk with parallel conversations.
Using len(messages) as the session key can cause different conversations to share the same session_id if they happen to have the same message count. In parallel evaluation scenarios, this could lead to cross-conversation state pollution.
Consider using a hash of the actual message content or passing a conversation identifier:
🔎 Proposed fix using content-based session key
- # Use hash of conversation history length as session key for consistency
- session_key = len(messages)
+ # Use hash of first message content + length for better session isolation
+ first_msg = messages[0]["content"] if messages else ""
+ session_key = hash((first_msg, len(messages)))
if session_key not in _session_cache:
_session_cache[session_key] = uuid4().hexAlternatively, if the ShirtifyAgent's invoke method doesn't actually need session persistence across calls (since full message history is passed), you could simplify to generate a new session_id per call or derive it deterministically from the full conversation.
🤖 Prompt for AI Agents
In examples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py around
lines 44 to 49, using len(messages) as the session key risks collisions across
parallel conversations; replace it with a robust, content-derived or per-call
ID: either compute a deterministic hash (e.g., SHA-256) of the serialized
message contents/ids to derive session_key and store that in _session_cache, or
skip caching entirely and generate a fresh session_id per invocation using
uuid4().hex if persistence across calls is unnecessary; ensure you serialize
messages consistently before hashing (order and fields) and use the hash hex
string as the dictionary key.
| // findUserConfigPath finds the .rogue/user_config.json path, creating it if needed | ||
| func findUserConfigPath() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misleading comment: function doesn't create the directory.
The comment says "creating it if needed" but the function only finds an existing .rogue directory or returns a default path. It doesn't actually create anything.
🔎 Suggested fix
-// findUserConfigPath finds the .rogue/user_config.json path, creating it if needed
+// findUserConfigPath finds the .rogue/user_config.json path, defaulting to CWD if not found
func findUserConfigPath() string {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // findUserConfigPath finds the .rogue/user_config.json path, creating it if needed | |
| func findUserConfigPath() string { | |
| // findUserConfigPath finds the .rogue/user_config.json path, defaulting to CWD if not found | |
| func findUserConfigPath() string { |
🤖 Prompt for AI Agents
In packages/tui/internal/tui/eval_types.go around lines 162-163, the comment
"finds the .rogue/user_config.json path, creating it if needed" is misleading
because the function does not create the directory or file; update the comment
to accurately state that it only locates/returns the expected path (or returns a
default path) without creating anything, or alternatively modify the function to
create the .rogue directory (use os.MkdirAll with appropriate permissions) and
ensure the user_config.json file exists before returning; pick one approach and
make the code and comment consistent.
| // Marshal and save | ||
| data, err := json.MarshalIndent(existingData, "", " ") | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| return os.WriteFile(configPath, data, 0644) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure parent directory exists before writing.
If the .rogue directory doesn't exist (e.g., fresh project), os.WriteFile will fail. Consider creating the parent directory.
🔎 Suggested fix
// Marshal and save
data, err := json.MarshalIndent(existingData, "", " ")
if err != nil {
return err
}
+ // Ensure parent directory exists
+ if err := os.MkdirAll(filepath.Dir(configPath), 0755); err != nil {
+ return err
+ }
+
return os.WriteFile(configPath, data, 0644)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Marshal and save | |
| data, err := json.MarshalIndent(existingData, "", " ") | |
| if err != nil { | |
| return err | |
| } | |
| return os.WriteFile(configPath, data, 0644) | |
| // Marshal and save | |
| data, err := json.MarshalIndent(existingData, "", " ") | |
| if err != nil { | |
| return err | |
| } | |
| // Ensure parent directory exists | |
| if err := os.MkdirAll(filepath.Dir(configPath), 0755); err != nil { | |
| return err | |
| } | |
| return os.WriteFile(configPath, data, 0644) |
🤖 Prompt for AI Agents
In packages/tui/internal/tui/eval_types.go around lines 208 to 214, the code
calls os.WriteFile(configPath, ...) without ensuring the parent directory
exists, causing failures on fresh projects; before writing, create the parent
directory for configPath (use os.MkdirAll on filepath.Dir(configPath) with
appropriate permissions), handle and return any error from MkdirAll, then
proceed to write the file as before.
| go saveUserConfig( | ||
| m.evalState.AgentProtocol, | ||
| m.evalState.AgentTransport, | ||
| m.evalState.AgentURL, | ||
| m.evalState.PythonEntrypointFile, | ||
| m.evalState.EvaluationMode, | ||
| m.getScanType(), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cd packages/tui/internal/tui && pwd && ls -la keyboard_controller.go 2>/dev/null | head -20Repository: qualifire-dev/rogue
Length of output: 180
🏁 Script executed:
#!/bin/bash
# Search for mutex/lock usage around evalState field access
rg -n -C3 'evalState\.(AgentURL|PythonEntrypointFile|cursorPos)' --type=goRepository: qualifire-dev/rogue
Length of output: 23883
🏁 Script executed:
# First, find the evalState struct definition
rg -n 'type.*evalState|struct.*evalState' --type=go packages/tui/internal/tui/Repository: qualifire-dev/rogue
Length of output: 45
🏁 Script executed:
# Search for EvalState struct (capitalized)
rg -n 'type EvalState struct' --type=go packages/tui/internal/tui/Repository: qualifire-dev/rogue
Length of output: 45
🏁 Script executed:
# Look at the model struct and how evalState is defined
rg -n 'evalState' --type=go packages/tui/internal/tui/keyboard_controller.go | head -5Repository: qualifire-dev/rogue
Length of output: 377
🏁 Script executed:
# Get the first part of keyboard_controller.go to see struct definition
head -100 packages/tui/internal/tui/keyboard_controller.goRepository: qualifire-dev/rogue
Length of output: 2725
🏁 Script executed:
# Find EvaluationViewState struct definition
rg -n 'type EvaluationViewState struct' --type=go packages/tui/internal/tui/Repository: qualifire-dev/rogue
Length of output: 139
🏁 Script executed:
# Search for EvaluationViewState in all go files
fd --type f -name "*.go" packages/tui/internal/tui/ -exec grep -l "EvaluationViewState" {} \;Repository: qualifire-dev/rogue
Length of output: 293
🏁 Script executed:
# Look for saveUserConfig function definition
rg -n -A10 'func saveUserConfig' --type=go packages/tui/internal/tui/Repository: qualifire-dev/rogue
Length of output: 1008
🏁 Script executed:
# Read the EvaluationViewState struct and saveUserConfig function
sed -n '81,200p' packages/tui/internal/tui/eval_types.goRepository: qualifire-dev/rogue
Length of output: 4243
🏁 Script executed:
# Also check for any sync.Mutex in the model or related structs
rg -n 'sync\.Mutex|sync\.RWMutex' --type=go packages/tui/internal/tui/Repository: qualifire-dev/rogue
Length of output: 45
🏁 Script executed:
# Get the rest of saveUserConfig function
sed -n '182,230p' packages/tui/internal/tui/eval_types.goRepository: qualifire-dev/rogue
Length of output: 1750
🏁 Script executed:
# Check the Model type and its Update method
rg -n 'type Model struct' --type=go packages/tui/internal/tui/ | head -5Repository: qualifire-dev/rogue
Length of output: 120
🏁 Script executed:
# Check if bubble tea is being used and what the Update pattern looks like
rg -n 'func \(m Model\) Update' --type=go packages/tui/internal/tui/ | head -10Repository: qualifire-dev/rogue
Length of output: 156
🏁 Script executed:
# Read the Model struct definition and Update method
sed -n '61,150p' packages/tui/internal/tui/types.goRepository: qualifire-dev/rogue
Length of output: 2568
🏁 Script executed:
# Check the Update method and how it routes events
sed -n '92,120p' packages/tui/internal/tui/app.goRepository: qualifire-dev/rogue
Length of output: 760
🏁 Script executed:
# Look for where saveUserConfig is called and check the context
rg -n 'go saveUserConfig' --type=go packages/tui/internal/tui/ | wc -lRepository: qualifire-dev/rogue
Length of output: 65
🏁 Script executed:
# Check how many times saveUserConfig is spawned in keyboard_controller.go
rg -n 'go saveUserConfig' packages/tui/internal/tui/keyboard_controller.goRepository: qualifire-dev/rogue
Length of output: 88
🏁 Script executed:
# Check the full keyboard_controller.go Update method to understand the flow
sed -n '155,200p' packages/tui/internal/tui/keyboard_controller.goRepository: qualifire-dev/rogue
Length of output: 2056
🏁 Script executed:
# Check if there's any debouncing or synchronization mechanism elsewhere
rg -n 'debounce|time\.After|Debounce' --type=go packages/tui/internal/tui/Repository: qualifire-dev/rogue
Length of output: 45
Add file-level synchronization or debounce saveUserConfig calls.
Multiple saveUserConfig goroutines can execute concurrently when keys are pressed rapidly. Since each performs an unprotected read-modify-write on user_config.json, concurrent writes risk configuration loss. Either protect file access with a mutex, use a single-writer pattern, or implement debouncing to ensure only the latest changes are saved.
🤖 Prompt for AI Agents
In packages/tui/internal/tui/keyboard_controller.go around lines 180 to 187,
multiple goroutines call saveUserConfig concurrently which can cause race
conditions and lost writes; protect file access by serializing saves: either add
a package-level sync.Mutex that saveUserConfig locks/unlocks around the
read-modify-write of user_config.json, or implement a single-writer goroutine
pattern (publish config changes to a buffered channel and have one goroutine
perform writes), or add a short debounce (coalesce rapid calls into one write).
Ensure whichever approach you pick actually serializes file I/O and handles
errors, and remove the fire-and-forget goroutine calls in favor of the
serialized/debounced mechanism.
| judge_llm_aws_secret_access_key: Optional[str] = None, | ||
| judge_llm_aws_region: Optional[str] = None, | ||
| evaluation_mode: EvaluationMode = EvaluationMode.POLICY, | ||
| python_entrypoint_file: Optional[str] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document the new python_entrypoint_file parameter in the docstring.
The python_entrypoint_file parameter was added to the function signature but is not documented in the Args section of the docstring (lines 113-128). This makes the API harder to use and understand.
🔎 Proposed fix
Add the parameter documentation in the Args section around line 127:
judge_llm_aws_region: AWS region for judge LLM
evaluation_mode: Should be POLICY for this function
+ python_entrypoint_file: Path to Python file with call_agent function (for PYTHON protocol)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In rogue/evaluator_agent/run_evaluator_agent.py around line 104 (docstring spans
~113-128), the new python_entrypoint_file parameter is missing from the Args
section; update the function docstring by adding a concise Args entry for
python_entrypoint_file describing its type (Optional[str]), purpose (path to a
Python entrypoint file to run or None), and default/behavior when None, placing
it near line 127 so the Args list remains ordered and complete.
| # Check for call_agent function | ||
| has_call_agent = False | ||
| for node in ast.walk(tree): | ||
| if isinstance(node, ast.FunctionDef) and node.name == "call_agent": | ||
| has_call_agent = True | ||
| break | ||
|
|
||
| if not has_call_agent: | ||
| raise ValueError( | ||
| f"Python entrypoint file must contain a 'call_agent' function: {python_file}", # noqa: E501 | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Async function definitions are not detected.
The AST check only looks for ast.FunctionDef, but if the user defines async def call_agent(...), it won't be found (it would be ast.AsyncFunctionDef).
🔎 Suggested fix
# Check for call_agent function
has_call_agent = False
for node in ast.walk(tree):
- if isinstance(node, ast.FunctionDef) and node.name == "call_agent":
+ if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == "call_agent":
has_call_agent = True
break📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Check for call_agent function | |
| has_call_agent = False | |
| for node in ast.walk(tree): | |
| if isinstance(node, ast.FunctionDef) and node.name == "call_agent": | |
| has_call_agent = True | |
| break | |
| if not has_call_agent: | |
| raise ValueError( | |
| f"Python entrypoint file must contain a 'call_agent' function: {python_file}", # noqa: E501 | |
| ) | |
| # Check for call_agent function | |
| has_call_agent = False | |
| for node in ast.walk(tree): | |
| if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == "call_agent": | |
| has_call_agent = True | |
| break | |
| if not has_call_agent: | |
| raise ValueError( | |
| f"Python entrypoint file must contain a 'call_agent' function: {python_file}", # noqa: E501 | |
| ) |
🤖 Prompt for AI Agents
In rogue/run_cli.py around lines 488 to 498, the AST check only looks for
ast.FunctionDef so async functions like "async def call_agent"
(ast.AsyncFunctionDef) aren't detected; update the check to consider both
ast.FunctionDef and ast.AsyncFunctionDef (e.g., isinstance(node,
(ast.FunctionDef, ast.AsyncFunctionDef))) so synchronous and asynchronous
call_agent definitions are recognized, leaving the rest of the logic (setting
has_call_agent and raising the ValueError) unchanged.
| evaluated_agent_url=( | ||
| str(job.request.evaluated_agent_url) | ||
| if job.request.evaluated_agent_url | ||
| else "" | ||
| ), # noqa: E501 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for evaluated_agent_url usage across red team components
echo "=== Searching for evaluated_agent_url in red_team directory ==="
rg -nP --type=py 'evaluated_agent_url' rogue/evaluator_agent/red_team/ 2>/dev/null | head -50
echo -e "\n=== Searching for create_red_team_attacker_agent function ==="
rg -nP --type=py 'def create_red_team_attacker_agent' rogue/ 2>/dev/null
echo -e "\n=== Searching for RedTeamOrchestrator class and initialization ==="
rg -nP --type=py 'class RedTeamOrchestrator' rogue/ 2>/dev/null
echo -e "\n=== Looking for evaluated_agent_url in red_team_orchestrator.py ==="
rg -nP --type=py 'evaluated_agent_url' rogue/server/core/red_team_orchestrator.py 2>/dev/null | head -30Repository: qualifire-dev/rogue
Length of output: 1544
🏁 Script executed:
#!/bin/bash
# Look for Python red team agent implementations
echo "=== Finding python_red_team files ==="
fd -t f 'python.*red.*team.*agent' rogue/ 2>/dev/null
echo -e "\n=== Searching for attacker agent factory ==="
fd -t f '.*attacker.*agent.*factory' rogue/ 2>/dev/null
echo -e "\n=== Checking red team directory structure ==="
ls -la rogue/evaluator_agent/red_team/ 2>/dev/null | head -20Repository: qualifire-dev/rogue
Length of output: 883
🏁 Script executed:
#!/bin/bash
# Check how evaluated_agent_url is validated or used in orchestrator
echo "=== Checking RedTeamOrchestrator signature and usage ==="
rg -nP --type=py 'RedTeamOrchestrator\(' rogue/server/services/red_team_service.py -A5 -B5
echo -e "\n=== Checking how evaluated_agent_url flows through orchestrator ==="
rg -nP --type=py 'create_red_team_attacker_agent' rogue/server/core/red_team_orchestrator.py -A10 -B2Repository: qualifire-dev/rogue
Length of output: 1735
🏁 Script executed:
#!/bin/bash
# Read factory.py to see complete validation logic
cat -n rogue/evaluator_agent/red_team/factory.pyRepository: qualifire-dev/rogue
Length of output: 7682
🏁 Script executed:
#!/bin/bash
# Check python_red_team_attacker_agent.py for evaluated_agent_url usage
cat -n rogue/evaluator_agent/red_team/python_red_team_attacker_agent.py | head -100Repository: qualifire-dev/rogue
Length of output: 4891
🏁 Script executed:
#!/bin/bash
# Check what protocol values are being passed and if Python protocol is used
rg -nP --type=py 'evaluated_agent_protocol' rogue/server/services/red_team_service.py -B5 -A5Repository: qualifire-dev/rogue
Length of output: 580
Pass None instead of an empty string for evaluated_agent_url when falsy.
The code converts falsy job.request.evaluated_agent_url to an empty string, but the factory's validation logic uses truthiness checks (if not evaluated_agent_url:) for A2A and MCP protocols. An empty string will trigger the same ValueError as None, making the conversion semantically confusing. For clarity and consistency with the API's Optional[str] type signature, pass None directly when the value is falsy rather than converting it to an empty string.
🤖 Prompt for AI Agents
In rogue/server/services/red_team_service.py around lines 130 to 134, the code
coerces a falsy job.request.evaluated_agent_url to an empty string before
passing it to the factory, but the factory expects an Optional[str] and checks
truthiness (treating "" like None). Change the call to pass None when
evaluated_agent_url is falsy (i.e., use job.request.evaluated_agent_url if
truthy, otherwise None) so the value remains an actual None for clarity and
correct semantics.
examples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Fix all issues with AI Agents 🤖
In @rogue/evaluator_agent/python/python_evaluator_agent.py:
- Line 105: Remove the redundant local "import sys" inside
python_evaluator_agent.py (the duplicate import at the location shown) since sys
is already imported at the module top; delete that line and ensure imports
follow isort-style grouping (standard, third-party, local) so the file has a
single top-level import for sys and no in-function/module-body duplicates.
🧹 Nitpick comments (4)
rogue/evaluator_agent/python/python_evaluator_agent.py (4)
128-140: Consider validating the call_agent signature.While the code checks that
call_agentis callable, it doesn't validate the expected signaturecall_agent(messages: list[dict]) -> str. Usinginspect.signature()could provide earlier, clearer error messages if the user's function has an incompatible signature.🔎 Example signature validation
import inspect # After line 135 sig = inspect.signature(self._call_agent_fn) params = list(sig.parameters.values()) if len(params) != 1: raise TypeError( f"call_agent must accept exactly 1 parameter (messages), " f"but has {len(params)} parameters" )
146-149: Consider potential event loop blocking during module load.
_load_python_module()performs synchronous file I/O and module loading, which could briefly block the event loop. For typical use cases this is likely acceptable since initialization happens once, but if loading large modules or working with slow file systems, consider whether this warrants investigation.
204-212: Clarify the fallback logic.The fallback to
[{"role": "user", "content": message}]appears unnecessary since_add_message_to_chat_historywas just called on line 201, which creates theChatHistoryif it doesn't exist. While this defensive programming doesn't cause issues, it may indicate uncertainty about the state management flow.
246-269: Consider refining the type hint for _call_agent_fn.The current type hint
Optional[Callable[[list[dict[str, Any]]], str]]doesn't capture that the function can also be async. While the runtime handling is correct, a more precise type hint would be:from typing import Union, Coroutine _call_agent_fn: Optional[ Union[ Callable[[list[dict[str, Any]]], str], Callable[[list[dict[str, Any]]], Coroutine[Any, Any, str]] ] ]However, this may be overly complex. The current implementation handles both cases correctly at runtime.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
rogue/evaluator_agent/python/python_evaluator_agent.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Useblackfor code formatting
Use type hints for all function signatures
Follow PEP 8 naming conventions (snake_case for variables and functions, PascalCase for classes)
Use try/except blocks for code that may raise exceptions
Useisortconventions for import organization
Files:
rogue/evaluator_agent/python/python_evaluator_agent.py
🧬 Code graph analysis (1)
rogue/evaluator_agent/python/python_evaluator_agent.py (3)
sdks/python/rogue_sdk/types.py (1)
Protocol(75-90)rogue/evaluator_agent/base_evaluator_agent.py (2)
_send_message_to_evaluated_agent(555-572)_add_message_to_chat_history(583-617)rogue/evaluator_agent/red_team/python_red_team_attacker_agent.py (2)
_load_python_module(86-142)_send_message_to_evaluated_agent(160-218)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: codestyle
- GitHub Check: rogue_sanity
🔇 Additional comments (5)
rogue/evaluator_agent/python/python_evaluator_agent.py (5)
1-19: LGTM - Module structure and imports are well-organized.The module docstring clearly explains the purpose, and imports are properly organized following conventions (stdlib → external → local).
21-30: LGTM - Clear class documentation.The docstring effectively communicates the expected
call_agentfunction signature and message format.
32-74: LGTM - Initialization is well-structured.The constructor properly initializes the agent, converts the file path to a
Pathobject for robust path handling, and tracks necessary state variables for module loading and cleanup.
76-144: Verify security implications of arbitrary code execution.This method loads and executes user-provided Python code without sandboxing, timeouts, or resource limits. The loaded module runs with full privileges of the rogue process and can access files, network, environment variables, etc.
Confirm this security model is acceptable for the intended use case. Consider documenting security expectations in the module docstring or user-facing documentation.
151-167: LGTM - Proper cleanup implementation.The exit handler correctly cleans up the
sys.pathmodification and handles edge cases gracefully with the try-except block.
|
|
||
| # Add the entrypoint file's directory to sys.path so imports work | ||
| # This allows the entrypoint to import sibling modules | ||
| import sys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Remove redundant import statement.
sys is already imported at the top of the file (line 9). This local import is unnecessary and violates DRY principle.
🔎 Proposed fix
- import sys
-
entrypoint_dir = str(self._python_file_path.parent.resolve())Based on coding guidelines, use isort conventions for import organization.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import sys | |
| entrypoint_dir = str(self._python_file_path.parent.resolve()) |
🤖 Prompt for AI Agents
In @rogue/evaluator_agent/python/python_evaluator_agent.py around line 105,
Remove the redundant local "import sys" inside python_evaluator_agent.py (the
duplicate import at the location shown) since sys is already imported at the
module top; delete that line and ensure imports follow isort-style grouping
(standard, third-party, local) so the file has a single top-level import for sys
and no in-function/module-body duplicates.
| async def _send_message_to_evaluated_agent( | ||
| self, | ||
| context_id: str, | ||
| message: str, | ||
| ) -> dict[str, str]: | ||
| """ | ||
| Sends a message to the evaluated agent by calling the Python function. | ||
|
|
||
| :param message: the text to send to the agent. | ||
| :param context_id: The context ID of the conversation. | ||
| Each conversation has a unique context_id. All messages in the conversation | ||
| have the same context_id. | ||
| :return: A dictionary containing the response from the evaluated agent. | ||
| - "response": the response string. If there is no response | ||
| from the agent, the string is empty. | ||
| """ | ||
| if self._call_agent_fn is None: | ||
| raise RuntimeError( | ||
| "Python module not loaded. Call __aenter__ first or use async with.", | ||
| ) | ||
|
|
||
| try: | ||
| logger.info( | ||
| "🐍 Calling Python entrypoint function", | ||
| extra={ | ||
| "message": message[:100] + "..." if len(message) > 100 else message, | ||
| "context_id": context_id, | ||
| "python_file": str(self._python_file_path), | ||
| }, | ||
| ) | ||
|
|
||
| # Add the user message to chat history | ||
| self._add_message_to_chat_history(context_id, "user", message) | ||
|
|
||
| # Build messages list from chat history (single source of truth) | ||
| chat_history = self._context_id_to_chat_history.get(context_id) | ||
| messages = ( | ||
| [ | ||
| {"role": msg.role, "content": msg.content} | ||
| for msg in chat_history.messages | ||
| ] | ||
| if chat_history | ||
| else [{"role": "user", "content": message}] | ||
| ) | ||
|
|
||
| # Call the user's function | ||
| response = await self._invoke_call_agent(messages) | ||
|
|
||
| # Add the assistant response to chat history | ||
| self._add_message_to_chat_history(context_id, "assistant", response) | ||
|
|
||
| logger.info( | ||
| "✅ Python entrypoint call successful", | ||
| extra={ | ||
| "response_length": len(response), | ||
| "response_preview": ( | ||
| response[:100] + "..." if len(response) > 100 else response | ||
| ), | ||
| "context_id": context_id, | ||
| }, | ||
| ) | ||
|
|
||
| return {"response": response} | ||
|
|
||
| except Exception as e: | ||
| logger.exception( | ||
| "❌ Python entrypoint call failed", | ||
| extra={ | ||
| "message": message[:100] + "..." if len(message) > 100 else message, | ||
| "context_id": context_id, | ||
| "python_file": str(self._python_file_path), | ||
| "error": str(e), | ||
| "error_type": type(e).__name__, | ||
| }, | ||
| ) | ||
| return {"response": "", "error": str(e)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add timeout protection for user code execution.
There's no timeout when calling _invoke_call_agent (line 215), so a runaway or hanging user function could block indefinitely. Consider wrapping the invocation with asyncio.wait_for() to enforce a reasonable timeout.
🔎 Proposed fix with configurable timeout
+ # Call the user's function with timeout protection
+ timeout_seconds = 300 # 5 minutes default, consider making configurable
- # Call the user's function
- response = await self._invoke_call_agent(messages)
+ try:
+ response = await asyncio.wait_for(
+ self._invoke_call_agent(messages),
+ timeout=timeout_seconds
+ )
+ except asyncio.TimeoutError:
+ logger.error(
+ f"Python entrypoint call timed out after {timeout_seconds}s",
+ extra={"context_id": context_id}
+ )
+ return {"response": "", "error": f"Call timed out after {timeout_seconds}s"}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Fix all issues with AI Agents 🤖
In @examples/python_entrypoint_stub.py:
- Line 115: The line setting latest_message assumes messages[-1] has a "content"
key and can raise KeyError; update the assignment for latest_message to
defensively access the key by checking that messages is non-empty and that
"content" exists on messages[-1] (e.g., use dict.get or an "in" check) and fall
back to an empty string if missing so latest_message never raises.
- Around line 54-55: The current extraction of latest_message assumes
messages[-1] is a dict with a "content" key and can raise KeyError; update the
logic around latest_message to defensively access the key (e.g., check messages
is non-empty, ensure messages[-1] is a dict, and use dict.get("content", "") or
fallback to an empty string) so malformed entries won’t raise and the template
demonstrates safe access to the "content" field.
- Around line 1-15: The module docstring is missing the primary usage
instructions before the "Or via TUI:" section and contains an extra blank line;
update the top-level docstring (the usage block describing how to run this
example) to include a primary invocation example (e.g., a CLI command or minimal
Python snippet showing how to import and call call_agent) placed before the "Or
via TUI:" paragraph, remove the stray blank line, and ensure it reiterates that
the call_agent function receives the full conversation history and must return a
string.
In @packages/tui/internal/tui/common_controller.go:
- Around line 63-70: The saveUserConfig call is currently launched in a
goroutine (go saveUserConfig(...)) so any returned error is ignored; change this
to either call saveUserConfig synchronously and check its returned error, or
keep it async but capture errors via a channel and dispatch them to the UI
(e.g., call a controller method to show a dialog/status update). Locate the
invocation in common_controller.go and replace the fire-and-forget goroutine
with a direct call that handles the error or add an error channel and a
goroutine that reads the error and invokes your existing UI error/display method
to notify the user when persistence fails.
In @packages/tui/internal/tui/eval_types.go:
- Around line 189-190: The code currently ignores errors from json.Unmarshal
when reading an existing config (the os.ReadFile block that reads configPath
into b and unmarshals into existingData), which can silently overwrite user
data; update that block to capture and handle the unmarshal error: after
json.Unmarshal(b, &existingData) check the returned error, log or return the
error (or bail out from the save/merge operation) instead of proceeding with an
empty map so corrupted/invalid JSON is not silently overwritten. Ensure you
reference the same variables (b, err, existingData, configPath) and keep
existing behavior for successful unmarshalling.
♻️ Duplicate comments (4)
rogue/evaluator_agent/python/python_evaluator_agent.py (2)
107-107: Remove redundant local import.The
import sysstatement at line 107 is redundant sincesysis already imported at the top of the file (line 10). This violates the DRY principle.Based on coding guidelines, use isort conventions for import organization with all imports at the module top.
🔎 Proposed fix
- import sys - entrypoint_dir = str(self._python_file_path.parent.resolve())
216-217: Consider adding timeout protection for user code execution.The call to
_invoke_call_agentat line 217 has no timeout, so a hanging or runaway user function could block indefinitely. Consider wrapping the invocation withasyncio.wait_for()to enforce a reasonable timeout.This was previously flagged in past reviews and remains unaddressed.
packages/tui/internal/tui/eval_types.go (2)
164-181: Misleading comment: function doesn't create the directory.The comment says "creating it if needed" but the function only finds an existing
.roguedirectory or returns a default path. It doesn't actually create anything.🔎 Suggested fix
-// findUserConfigPath finds the .rogue/user_config.json path, creating it if needed +// findUserConfigPath finds the .rogue/user_config.json path, defaulting to CWD if not found func findUserConfigPath() string {
210-216: Ensure parent directory exists before writing.If the
.roguedirectory doesn't exist (e.g., fresh project),os.WriteFilewill fail. Consider creating the parent directory.🔎 Suggested fix
// Marshal and save data, err := json.MarshalIndent(existingData, "", " ") if err != nil { return err } + // Ensure parent directory exists + if err := os.MkdirAll(filepath.Dir(configPath), 0755); err != nil { + return err + } + return os.WriteFile(configPath, data, 0644)
🧹 Nitpick comments (2)
rogue/evaluator_agent/python/python_evaluator_agent.py (1)
265-265: Move import to the top of the file.The
import inspectstatement should be moved to the module-level imports at the top of the file (along with other standard library imports) rather than being imported locally within the function.Based on coding guidelines, use isort conventions for import organization.
🔎 Proposed fix
At the top of the file, add
inspectto the standard library imports:import asyncio import importlib.util +import inspect import sys from pathlib import PathThen remove the local import:
- # Try to call with context_id first (new signature), - # fall back to messages-only for backward compatibility - import inspect - sig = inspect.signature(self._call_agent_fn)examples/python_entrypoint_stub.py (1)
110-110: Consider moving import to module level.The
asyncioimport is placed inside the function. While functional, Python convention recommends placing imports at the module level for better readability and consistency.🔎 Suggested fix
from typing import Any, Optional +import asyncio def call_agent(Then remove the import from inside the function:
Args: messages: List of message dicts with 'role' and 'content' keys. context_id: Optional unique conversation ID for session tracking. """ - import asyncio - # Simulate async processing await asyncio.sleep(0.1)
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
examples/python_entrypoint_stub.pyexamples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.pypackages/tui/internal/tui/common_controller.gopackages/tui/internal/tui/eval_types.gorogue/evaluator_agent/python/python_evaluator_agent.py
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Useblackfor code formatting
Use type hints for all function signatures
Follow PEP 8 naming conventions (snake_case for variables and functions, PascalCase for classes)
Use try/except blocks for code that may raise exceptions
Useisortconventions for import organization
Files:
examples/python_entrypoint_stub.pyrogue/evaluator_agent/python/python_evaluator_agent.py
🧠 Learnings (1)
📚 Learning: 2026-01-04T16:10:56.611Z
Learnt from: CR
Repo: qualifire-dev/rogue PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-04T16:10:56.611Z
Learning: Project structure should organize code into `rogue/` main package, `tests/` for tests, `examples/` for example implementations, and `.github/` for CI/CD workflows
Applied to files:
packages/tui/internal/tui/eval_types.go
🧬 Code graph analysis (2)
examples/python_entrypoint_stub.py (1)
examples/tshirt_store_langgraph_agent/python_entrypoint_shirtify.py (1)
call_agent(20-54)
packages/tui/internal/tui/common_controller.go (2)
packages/tui/internal/tui/eval_types.go (3)
ProtocolPython(19-19)EvaluationMode(33-33)LoadEvaluationStateFromConfig(273-401)sdks/python/rogue_sdk/types.py (1)
EvaluationMode(58-62)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: rogue_sanity
- GitHub Check: codestyle
🔇 Additional comments (14)
rogue/evaluator_agent/python/python_evaluator_agent.py (6)
1-19: LGTM! Well-organized imports and clear module documentation.The imports follow isort conventions correctly (standard library, third-party, local), and the module docstring clearly explains the purpose.
21-30: LGTM! Clear class documentation.The class structure and docstring clearly communicate the purpose and expected function signature.
32-76: LGTM! Well-structured initialization.The
__init__method follows PEP 8 naming conventions, includes comprehensive type hints, and properly initializes all instance variables. Good use ofPathfor file path handling and informative debug logging.
148-151: LGTM! Proper async context manager entry.The
__aenter__method correctly loads the Python module and delegates to the parent class.
153-169: LGTM! Proper cleanup in async context exit.The
__aexit__method correctly cleans up thesys.pathmodification with defensive error handling to avoid issues if the path was already removed.
267-291: LGTM! Excellent backward compatibility and type handling.The signature inspection logic elegantly handles both old (messages-only) and new (messages + context_id) function signatures with appropriate fallback. The async detection and type coercion ensure robust handling of different implementations.
packages/tui/internal/tui/eval_types.go (6)
17-19: LGTM: Clean protocol constant addition.The addition of
ProtocolPythonfollows the existing pattern and integrates well with the protocol cycling and transport logic throughout the file.
44-58: LGTM: Clear field constant documentation.The updated comments and field constant structure clearly document the dual use of
EvalFieldAgentURLfor both Agent URL and Python entrypoint file, along with the protocol-specific field skipping behavior.
483-497: LGTM: Protocol-transport mapping correctly handles Python.The updated
getAllProtocolsandgetTransportsForProtocolfunctions correctly include Python protocol and return an empty transport list for it, since Python protocol doesn't use network transports.
533-539: LGTM: Cursor position correctly reset on protocol change.When the protocol changes, the cursor position is properly reset to match the appropriate text field (PythonEntrypointFile or AgentURL), preventing cursor position bugs.
659-716: LGTM: Preset application logic is sound.The
applyPresetForScanTypefunction correctly handles preset application for Basic and Full scan types while preserving custom selections for Custom scan type.
275-275: No action required—openai/gpt-4.1is a valid OpenAI model identifier.OpenAI introduced the GPT-4.1 model series in April 2025, which includes
gpt-4.1as a valid model identifier. The code's fallback default is correct.Likely an incorrect or invalid review comment.
packages/tui/internal/tui/common_controller.go (2)
41-78: LGTM: Protocol-aware paste handling works correctly.The paste logic correctly differentiates between Python entrypoint file and Agent URL based on the protocol, with appropriate cursor updates and config persistence.
238-239: LGTM: Clean config consolidation.Replacing hard-coded defaults with
LoadEvaluationStateFromConfigproperly consolidates config loading fromuser_config.json,scenarios.json, andredteam.yaml.
| """ | ||
| Example Python entrypoint for Rogue agent evaluation. | ||
|
|
||
| This file demonstrates how to create a Python entrypoint for testing | ||
| agents without A2A or MCP protocols. Rogue will dynamically import | ||
| this file and call the `call_agent` function. | ||
|
|
||
|
|
||
| Or via TUI: | ||
| 1. Select "Python" as the protocol | ||
| 2. Enter the path to your Python file | ||
|
|
||
| The call_agent function receives the full conversation history and | ||
| should return the agent's response as a string. | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete usage documentation.
The docstring has a blank line at Line 8 and Line 9 begins with "Or via TUI:", suggesting missing usage instructions for a primary method (e.g., CLI command) before the TUI alternative.
🔎 Suggested fix
This file demonstrates how to create a Python entrypoint for testing
agents without A2A or MCP protocols. Rogue will dynamically import
this file and call the `call_agent` function.
+Usage:
+ Via CLI:
+ rogue evaluate --protocol python --python-file path/to/this_file.py
Or via TUI:
1. Select "Python" as the protocol
2. Enter the path to your Python file🤖 Prompt for AI Agents
In @examples/python_entrypoint_stub.py around lines 1-15, The module docstring
is missing the primary usage instructions before the "Or via TUI:" section and
contains an extra blank line; update the top-level docstring (the usage block
describing how to run this example) to include a primary invocation example
(e.g., a CLI command or minimal Python snippet showing how to import and call
call_agent) placed before the "Or via TUI:" paragraph, remove the stray blank
line, and ensure it reiterates that the call_agent function receives the full
conversation history and must return a string.
| # Extract the latest user message | ||
| latest_message = messages[-1]["content"] if messages else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add defensive key access to prevent KeyError.
The code assumes messages[-1] contains a "content" key. If the message structure is malformed, this will raise a KeyError. Since this is a template file, consider demonstrating defensive coding practices.
🔎 Suggested fix
- # Extract the latest user message
- latest_message = messages[-1]["content"] if messages else ""
+ # Extract the latest user message
+ latest_message = messages[-1].get("content", "") if messages else ""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Extract the latest user message | |
| latest_message = messages[-1]["content"] if messages else "" | |
| # Extract the latest user message | |
| latest_message = messages[-1].get("content", "") if messages else "" |
🤖 Prompt for AI Agents
In @examples/python_entrypoint_stub.py around lines 54-55, The current
extraction of latest_message assumes messages[-1] is a dict with a "content" key
and can raise KeyError; update the logic around latest_message to defensively
access the key (e.g., check messages is non-empty, ensure messages[-1] is a
dict, and use dict.get("content", "") or fallback to an empty string) so
malformed entries won’t raise and the template demonstrates safe access to the
"content" field.
| # Simulate async processing | ||
| await asyncio.sleep(0.1) | ||
|
|
||
| latest_message = messages[-1]["content"] if messages else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add defensive key access to prevent KeyError.
Same issue as the sync version: the code assumes messages[-1] contains a "content" key, which may raise a KeyError if the structure is unexpected.
🔎 Suggested fix
- latest_message = messages[-1]["content"] if messages else ""
+ latest_message = messages[-1].get("content", "") if messages else ""🤖 Prompt for AI Agents
In @examples/python_entrypoint_stub.py around line 115, The line setting
latest_message assumes messages[-1] has a "content" key and can raise KeyError;
update the assignment for latest_message to defensively access the key by
checking that messages is non-empty and that "content" exists on messages[-1]
(e.g., use dict.get or an "in" check) and fall back to an empty string if
missing so latest_message never raises.
| go saveUserConfig( | ||
| m.evalState.AgentProtocol, | ||
| m.evalState.AgentTransport, | ||
| m.evalState.AgentURL, | ||
| m.evalState.PythonEntrypointFile, | ||
| m.evalState.EvaluationMode, | ||
| m.getScanType(), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent failure: saveUserConfig errors are not handled.
The saveUserConfig call runs in a goroutine without error handling. If config persistence fails (e.g., permission denied, disk full), the user won't be notified and their settings won't be saved.
Consider either:
- Removing the goroutine if the operation is fast enough (file I/O is typically quick)
- Adding error handling that notifies the user via a dialog or status message
🔎 Option 1: Synchronous save with error handling
m.evalState.cursorPos += len([]rune(cleanText))
- // Save config after paste
- go saveUserConfig(
- m.evalState.AgentProtocol,
- m.evalState.AgentTransport,
- m.evalState.AgentURL,
- m.evalState.PythonEntrypointFile,
- m.evalState.EvaluationMode,
- m.getScanType(),
- )
+ // Save config after paste
+ if err := saveUserConfig(
+ m.evalState.AgentProtocol,
+ m.evalState.AgentTransport,
+ m.evalState.AgentURL,
+ m.evalState.PythonEntrypointFile,
+ m.evalState.EvaluationMode,
+ m.getScanType(),
+ ); err != nil {
+ // Optionally notify user of save failure
+ // (may choose to silently ignore if non-critical)
+ }🤖 Prompt for AI Agents
In @packages/tui/internal/tui/common_controller.go around lines 63-70, The
saveUserConfig call is currently launched in a goroutine (go
saveUserConfig(...)) so any returned error is ignored; change this to either
call saveUserConfig synchronously and check its returned error, or keep it async
but capture errors via a channel and dispatch them to the UI (e.g., call a
controller method to show a dialog/status update). Locate the invocation in
common_controller.go and replace the fire-and-forget goroutine with a direct
call that handles the error or add an error channel and a goroutine that reads
the error and invokes your existing UI error/display method to notify the user
when persistence fails.
| if b, err := os.ReadFile(configPath); err == nil { | ||
| json.Unmarshal(b, &existingData) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle JSON unmarshal errors to avoid silent data corruption.
When reading existing config, json.Unmarshal errors are ignored. If the existing config file is corrupted or invalid JSON, this could lead to silently overwriting user data with an empty map.
🔎 Proposed fix
// Read existing config to preserve other fields
existingData := make(map[string]interface{})
if b, err := os.ReadFile(configPath); err == nil {
- json.Unmarshal(b, &existingData)
+ if err := json.Unmarshal(b, &existingData); err != nil {
+ // Log or return error instead of silently ignoring corrupted config
+ return fmt.Errorf("failed to parse existing config: %w", err)
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if b, err := os.ReadFile(configPath); err == nil { | |
| json.Unmarshal(b, &existingData) | |
| // Read existing config to preserve other fields | |
| existingData := make(map[string]interface{}) | |
| if b, err := os.ReadFile(configPath); err == nil { | |
| if err := json.Unmarshal(b, &existingData); err != nil { | |
| // Log or return error instead of silently ignoring corrupted config | |
| return fmt.Errorf("failed to parse existing config: %w", err) | |
| } | |
| } |
🤖 Prompt for AI Agents
In @packages/tui/internal/tui/eval_types.go around lines 189-190, The code
currently ignores errors from json.Unmarshal when reading an existing config
(the os.ReadFile block that reads configPath into b and unmarshals into
existingData), which can silently overwrite user data; update that block to
capture and handle the unmarshal error: after json.Unmarshal(b, &existingData)
check the returned error, log or return the error (or bail out from the
save/merge operation) instead of proceeding with an empty map so
corrupted/invalid JSON is not silently overwritten. Ensure you reference the
same variables (b, err, existingData, configPath) and keep existing behavior for
successful unmarshalling.
Description
Motivation and Context
Type of Change
Changes Made
Screenshots/Examples (if applicable)
Checklist
uv run black .to format my codeuv run flake8 .and fixed all issuesuv run mypy --config-file .mypy.ini .and addressed type checking issuesuv run bandit -c .bandit.yaml -r .for security checksuv run pytestand all tests passTesting
Test Configuration:
Test Steps:
1.
2.
3.
Additional Notes
Related Issues/PRs