RSPEED-2229: Add rlsapi v1 endpoint registration and CLA documentation #928
Hi @major. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Walkthrough

Registers a new rlsapi v1 router at /v1, adds an RHEL Lightspeed CLA YAML example, and introduces integration tests covering the rlsapi v1 /infer endpoint with mocked Llama Stack and authorization behavior.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/integration/endpoints/test_rlsapi_v1_integration.py (1)
69-113: Consider breaking down _create_mock_agent for improved clarity.

The helper function works correctly but handles multiple responsibilities (creating the client, agent, turn, and output message, and patching multiple modules). While this is acceptable for test code, consider extracting smaller helpers if this pattern is reused elsewhere.
For example, you could extract:
- _create_mock_output_message(mocker, content, mode="default")
- _create_mock_turn(mocker, output_message)
- _patch_llama_stack_components(mocker, mock_agent, mock_client)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- examples/lightspeed-stack-rlsapi-cla.yaml (1 hunks)
- src/app/routers.py (2 hunks)
- tests/integration/endpoints/test_rlsapi_v1_integration.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use the logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and the module logger pattern with standard log levels: debug, info, warning, error
Files:
src/app/routers.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests; do not use unittest framework
Unit tests must achieve 60% code coverage; integration tests must achieve 10% coverage
Files:
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use pytest-mock with AsyncMock objects for mocking in tests
Files:
tests/integration/endpoints/test_rlsapi_v1_integration.py
🧠 Learnings (1)
📚 Learning: 2025-10-29T13:05:22.438Z
Learnt from: luis5tb
Repo: lightspeed-core/lightspeed-stack PR: 727
File: src/app/endpoints/a2a.py:43-43
Timestamp: 2025-10-29T13:05:22.438Z
Learning: In the lightspeed-stack repository, endpoint files in src/app/endpoints/ intentionally use a shared logger name "app.endpoints.handlers" rather than __name__, allowing unified logging configuration across all endpoint handlers (query.py, streaming_query.py, a2a.py).
Applied to files:
src/app/routers.py
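A minimal sketch of the shared-logger pattern this learning describes:

```python
import logging

# Every endpoint handler module declares the same logger name instead of
# __name__, so one configuration entry governs all handlers at once.
logger = logging.getLogger("app.endpoints.handlers")

# A single call then adjusts logging for query.py, streaming_query.py,
# a2a.py, and any other handler sharing the name.
logging.getLogger("app.endpoints.handlers").setLevel(logging.DEBUG)
```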
🧬 Code graph analysis (2)
src/app/routers.py (1)
tests/unit/app/test_routers.py (1)
include_router(36-51)
tests/integration/endpoints/test_rlsapi_v1_integration.py (4)
src/app/endpoints/rlsapi_v1.py (1)
- infer_endpoint (125-183)
src/models/rlsapi/requests.py (6)
- RlsapiV1Attachment (8-25)
- RlsapiV1CLA (69-86)
- RlsapiV1Context (89-120)
- RlsapiV1InferRequest (123-200)
- RlsapiV1SystemInfo (42-66)
- RlsapiV1Terminal (28-39)
src/models/rlsapi/responses.py (1)
- RlsapiV1InferResponse (29-53)
src/utils/suid.py (1)
- check_suid (19-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: library mode / azure
- GitHub Check: E2E: library mode / vertexai
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: server mode / vertexai
- GitHub Check: E2E: server mode / azure
🔇 Additional comments (12)
src/app/routers.py (2)
23-24: LGTM! The import follows the established pattern and the comment clearly identifies the purpose of the new endpoint.
54-56: The router registration is correct and requires no changes. Verification confirms that the /infer endpoint defined in rlsapi_v1.py has no conflicts with existing /v1 routes, which all use distinct paths (/info, /models, /tools, /shields, /providers, /rags, /query, /config, /status, etc.).

examples/lightspeed-stack-rlsapi-cla.yaml (3)
1-13: LGTM! The header clearly identifies this as a CLA configuration example, and the service configuration follows standard patterns with an appropriate auth_enabled setting.
14-20: Model name is correct. The inference configuration is appropriate. The model identifier "granite-3.3-8b-instruct" is confirmed valid—it appears consistently across multiple test cases and is properly recognized by the GraniteToolParser in the codebase.
47-61: Action name "rlsapi_v1_infer" is correct and matches the authorization decorator.

The authorization configuration is correct. The action name in the YAML file matches Action.RLSAPI_V1_INFER defined in the codebase and used by the endpoint's @authorize decorator in src/app/endpoints/rlsapi_v1.py:124.

tests/integration/endpoints/test_rlsapi_v1_integration.py (7)
1-34: LGTM! The module docstring clearly describes the test scope, and the imports follow the coding guideline for absolute imports. The pylint disables are appropriate for test code complexity.
130-196: LGTM! The basic response tests provide good coverage of the minimal request case and various context configurations. The parametrized test efficiently validates different input combinations.
199-214: LGTM! The test effectively verifies that each request generates a unique request_id. The approach of using a set comprehension to check uniqueness is clean and efficient.
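The uniqueness check amounts to the following pattern (uuid4 stands in here for whatever ID generator the endpoint actually uses):

```python
import uuid

# Simulate several /infer calls, each minting a fresh request_id.
request_ids = [str(uuid.uuid4()) for _ in range(5)]

# A set drops duplicates, so equal sizes prove every ID was unique.
assert len({rid for rid in request_ids}) == len(request_ids)
```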
222-254: LGTM! The test properly validates error handling when the Llama Stack service is unavailable, including verification of the 503 status code and error message structure.

257-287: LGTM! The test validates fallback behavior for empty or None LLM responses, ensuring the system provides a meaningful fallback message to users.

295-323: LGTM! The test validates that multiple input sources are properly combined before being sent to the LLM, which is crucial for the CLA use case where context from various sources needs to be included.
331-349: Remove the unused skip_rag parameter or document its purpose.

The skip_rag parameter is accepted by RlsapiV1InferRequest but is never used in the endpoint implementation. The infer_endpoint function receives it but doesn't pass it to retrieve_simple_response(), and the response-building code never references it. The endpoint docstring explicitly states "No conversation persistence or tools," confirming RAG is not part of the rlsapi v1 design.

This creates an unused API surface that could confuse clients. Either:
- Remove the parameter entirely if RAG will never be supported in rlsapi v1
- Add a TODO comment with a timeline if RAG support is planned
- Explicitly document that it exists for API compatibility with external systems, if applicable
```yaml
# System prompt customization
# Note: These settings apply to /query endpoints. The rlsapi v1 /infer
# endpoint currently uses the default system prompt for stateless queries.
#
# Option 1: Inline system prompt
# customization:
#   system_prompt: |
#     You are a RHEL Linux assistant specializing in command-line tasks,
#     system administration, and troubleshooting. Provide clear, actionable
#     guidance for Red Hat Enterprise Linux environments.
#
# Option 2: System prompt from file
# customization:
#   system_prompt_path: /etc/lightspeed/system_prompt.txt
#
# Option 3: Custom profile module (advanced)
# customization:
#   profile_path: /etc/lightspeed/profiles/cla_profile.py
#   # Profile module must export PROFILE_CONFIG dict with "system_prompts" key
#   # containing "default", "topic_summary", and "validation" prompts.
#
# Option 4: Disable per-request prompt override (security)
# customization:
#   disable_query_system_prompt: true
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify system prompt customization scope for CLA deployments.
The note indicates these customization options apply to /query endpoints but not to the rlsapi v1 /infer endpoint. Including these options in a CLA-specific configuration example may confuse users about which settings apply to the CLA use case.
Consider either:
- Moving these examples to a general configuration file, or
- Adding clearer guidance about when users would want both /query and /infer endpoints in the same deployment
Apply this diff to clarify the scope:

```diff
 # System prompt customization
-# Note: These settings apply to /query endpoints. The rlsapi v1 /infer
-# endpoint currently uses the default system prompt for stateless queries.
+# Note: These settings apply to /query endpoints (NOT rlsapi v1 /infer).
+# Include this section only if you're deploying both /query and /infer endpoints.
+# The rlsapi v1 /infer endpoint uses the default system prompt and cannot be
+# customized per-request for stateless CLA queries.
 #
 # Option 1: Inline system prompt
```
🤖 Prompt for AI Agents
In examples/lightspeed-stack-rlsapi-cla.yaml around lines 22 to 45, the system
prompt customization block currently reads as if it applies to CLA deployments
but the comment states it only applies to /query endpoints and not rlsapi v1
/infer, which can confuse CLA users; update the file to clarify scope by either
moving these /query-specific examples to a general configuration example or by
adding a short header and inline note that explicitly states "These
customization options apply only to /query endpoints; rlsapi v1 /infer uses the
default system prompt — include both only if you are running both endpoints in
the same deployment," and optionally provide a brief sentence advising typical
CLA users which option to choose for CLA-only deployments.
Register the rlsapi v1 router in the application to expose the /v1/infer endpoint for stateless inference requests from the RHEL Lightspeed Command Line Assistant (CLA).

Integration tests cover:
- Successful inference with various input combinations
- Request ID generation and uniqueness
- Error handling for Llama Stack connection failures
- Empty response fallback behavior

Signed-off-by: Major Hayden <major@redhat.com>
Provide an example configuration for RHEL Lightspeed Command Line Assistant (CLA) deployments using the rlsapi v1 /infer endpoint.

The example includes:
- Service and Llama Stack configuration
- System prompt customization options (inline, file, profile module)
- Red Hat Identity authentication setup
- Authorization rules for the rlsapi_v1_infer action

Signed-off-by: Major Hayden <major@redhat.com>
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/integration/endpoints/test_rlsapi_v1_integration.py (3)
20-20: Consider using explicit import for consistency.

While the current import works, it's inconsistent with the other imports in the file and the coding guidelines' preference for explicit from imports.

Apply this diff for consistency:

```diff
-import constants
+from constants import UNABLE_TO_PROCESS_RESPONSE
```

Then update line 287:

```diff
-    assert response.data.text == constants.UNABLE_TO_PROCESS_RESPONSE
+    assert response.data.text == UNABLE_TO_PROCESS_RESPONSE
```
69-114: Consider using a proper sentinel value.

The use of "default" as a sentinel value at lines 79 and 91 is unconventional. Python best practice is to use None with explicit checks or a proper sentinel object.

However, given this is test helper code and the current implementation is clear and functional, this is a minor style suggestion.
295-324: Strengthen the terminal output assertion.

Lines 322-323 check for the substring "terminal", but the actual terminal output value is "terminal output" (line 312). The assertion would pass even if only part of the terminal output was included.

Apply this diff for more precise validation:

```diff
-    for expected in ["My question", "stdin content", "attachment content", "terminal"]:
+    for expected in ["My question", "stdin content", "attachment content", "terminal output"]:
         assert expected in message_content
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- examples/lightspeed-stack-rlsapi-cla.yaml (1 hunks)
- src/app/routers.py (2 hunks)
- tests/integration/endpoints/test_rlsapi_v1_integration.py (1 hunks)
- tests/unit/app/test_routers.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/lightspeed-stack-rlsapi-cla.yaml
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use the logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and the module logger pattern with standard log levels: debug, info, warning, error
Files:
src/app/routers.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests; do not use unittest framework
Unit tests must achieve 60% code coverage; integration tests must achieve 10% coverage
Files:
- tests/unit/app/test_routers.py
- tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use pytest-mock with AsyncMock objects for mocking in tests
Files:
- tests/unit/app/test_routers.py
- tests/integration/endpoints/test_rlsapi_v1_integration.py
🧠 Learnings (1)
📚 Learning: 2025-10-29T13:05:22.438Z
Learnt from: luis5tb
Repo: lightspeed-core/lightspeed-stack PR: 727
File: src/app/endpoints/a2a.py:43-43
Timestamp: 2025-10-29T13:05:22.438Z
Learning: In the lightspeed-stack repository, endpoint files in src/app/endpoints/ intentionally use a shared logger name "app.endpoints.handlers" rather than __name__, allowing unified logging configuration across all endpoint handlers (query.py, streaming_query.py, a2a.py).
Applied to files:
src/app/routers.py
🧬 Code graph analysis (2)
src/app/routers.py (1)
tests/unit/app/test_routers.py (1)
include_router(37-52)
tests/integration/endpoints/test_rlsapi_v1_integration.py (4)
src/app/endpoints/rlsapi_v1.py (1)
- infer_endpoint (125-183)
src/models/rlsapi/requests.py (6)
- RlsapiV1Attachment (8-25)
- RlsapiV1CLA (69-86)
- RlsapiV1Context (89-120)
- RlsapiV1InferRequest (123-200)
- RlsapiV1SystemInfo (42-66)
- RlsapiV1Terminal (28-39)
src/models/rlsapi/responses.py (1)
- RlsapiV1InferResponse (29-53)
src/utils/suid.py (1)
- check_suid (19-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: server mode / azure
- GitHub Check: E2E: server mode / vertexai
🔇 Additional comments (8)
src/app/routers.py (2)
23-24: LGTM: Clean import with helpful context. The import follows the existing pattern and the comment clearly identifies the purpose of this endpoint.

54-56: LGTM: Router registration follows established patterns. The router registration is consistent with other v1 endpoints, and the comment clearly explains this is for the stateless CLA endpoint.
tests/unit/app/test_routers.py (1)
26-26: LGTM: Test updates are thorough and consistent. All test changes properly account for the new rlsapi_v1 router:

- Import added
- Router count incremented from 16 to 17
- Presence validation added
- Prefix "/v1" validation added

Also applies to: 69-69, 88-88, 97-97, 117-117
tests/integration/endpoints/test_rlsapi_v1_integration.py (5)
37-67: LGTM: Well-structured test fixtures. The fixtures are properly documented and provide good separation of concerns for mocking different components of the system.

130-197: LGTM: Comprehensive coverage of request/response scenarios. The tests effectively cover minimal requests and various context configurations using pytest parameterization. The use of check_suid to validate request IDs is appropriate.

199-215: LGTM: Effective uniqueness validation. The test elegantly verifies request ID uniqueness using a set comprehension and validates format with check_suid.

222-288: LGTM: Robust error handling validation. Both tests properly validate error scenarios:
- Connection errors return an appropriate 503 status with structured error detail
- Empty/None responses trigger fallback messages

331-350: LGTM: Appropriate validation of the skip_rag parameter. The test correctly validates that the skip_rag parameter is accepted without breaking functionality. The comment at line 344 appropriately notes that RAG is not yet implemented in rlsapi v1.
/ok-to-test
tisnik left a comment:

LGTM, thank you
Description
Registers the rlsapi v1 router to expose the /v1/infer endpoint for stateless inference requests from the RHEL Lightspeed Command Line Assistant (CLA). Includes integration tests covering successful inference, error handling, and request ID generation. Also adds an example configuration file demonstrating CLA deployment with rh-identity authentication, authorization rules, and system prompt customization options.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
New tests should run using all of the existing testing commands.