
LCORE-792: Lightspeed core needs to fully support VertexAI LLM provider#924

Merged
tisnik merged 1 commit into lightspeed-core:main from are-ces:support-vertexai
Dec 16, 2025

Conversation

@are-ces (Contributor) commented Dec 16, 2025

Description

  • Added e2e tests for VertexAI as provider
  • Added VertexAI example config
  • Updated docs
  • Updated fallback model to gpt-4o-mini (way cheaper than gpt-4-turbo)

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue: LCORE-792
  • Closes: LCORE-792

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added VertexAI as a supported AI provider with Google Cloud authentication integration.
    • Enhanced configuration structure with improved directory-based organization.
  • Documentation

    • Updated provider compatibility tables and prerequisites documentation to include VertexAI.
  • Tests

    • Added end-to-end testing support for VertexAI environments.
    • Updated default model selection for test runs.


@coderabbitai coderabbitai bot commented Dec 16, 2025

Walkthrough

This PR adds VertexAI as a supported LLM provider to the e2e testing framework and infrastructure. Changes include adding VertexAI to the CI workflow matrix, restructuring configuration file paths to use a mode-based directory structure, integrating Google Cloud credentials handling, updating Docker compose services with GCP environment variables and volume mounts, and creating new VertexAI configuration examples and test setups.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **CI/CD Workflow Configuration**<br>`.github/workflows/e2e_tests.yaml` | Adds VertexAI to the E2E test environment matrix; reworks config file path loading to a mode-based structure (`${MODE}-mode/lightspeed-stack.yaml`); introduces GCP service account key validation and credential steps; propagates VertexAI environment variables (`VERTEX_AI_PROJECT`, `VERTEX_AI_LOCATION`) and `GOOGLE_APPLICATION_CREDENTIALS` through server and library jobs. |
| **Docker Compose Infrastructure**<br>`docker-compose.yaml`, `docker-compose-library.yaml` | Adds a GCP key volume mount (`${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}` to `/opt/app-root/.gcp-keys:ro`); introduces new environment variables for Vertex AI (`GOOGLE_APPLICATION_CREDENTIALS`, `VERTEX_AI_PROJECT`, `VERTEX_AI_LOCATION`) and search APIs (`BRAVE_SEARCH_API_KEY`, `TAVILY_SEARCH_API_KEY`); see the sketch after this table. |
| **Example and E2E Configurations**<br>`examples/vertexai-run.yaml`, `tests/e2e/configs/run-vertexai.yaml` | Major restructuring of the VertexAI inference configurations; version bumped to 2; introduces modular top-level sections (`storage`, `inference_store`, `metadata_store`, `conversations_store`); reworks provider topology with multi-provider support (Vertex AI, sentence-transformers, llama-guard, faiss); defines SQLite-backed storage backends and resource mappings for e2e testing. |
| **Mode-based Configuration Files**<br>`tests/e2e/configuration/library-mode/lightspeed-stack.yaml`, `tests/e2e/configuration/server-mode/lightspeed-stack.yaml` | Minor formatting adjustments (whitespace normalization in authentication module declarations). |
| **Documentation**<br>`README.md`, `docs/providers.md` | Adds VertexAI to the provider prerequisites and LLM compatibility tables; updates VertexAI provider dependencies from `litellm`, `google-cloud-aiplatform` (unsupported) to `google-auth` (supported). |
| **Test Infrastructure**<br>`tests/e2e/features/environment.py` | Updates the default model fallback from `gpt-4-turbo` to `gpt-4o-mini`; modifies the fallback log message to dynamically include provider and model information. |
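To make the compose changes concrete, here is a minimal sketch of the affected service section (the service name and default values are assumptions; the volume mount and variable names are taken verbatim from the summary above):

```yaml
# Minimal sketch of the docker-compose additions; the service name and
# default values are assumptions, the mount and variable names are from this PR.
services:
  lightspeed-stack:
    volumes:
      # Falls back to a dummy directory when GCP_KEYS_PATH is unset
      - ${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}:/opt/app-root/.gcp-keys:ro
    environment:
      - GOOGLE_APPLICATION_CREDENTIALS=/opt/app-root/.gcp-keys/gcp-key.json
      - VERTEX_AI_PROJECT=${VERTEX_AI_PROJECT:-}
      - VERTEX_AI_LOCATION=${VERTEX_AI_LOCATION:-}
      - BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
      - TAVILY_SEARCH_API_KEY=${TAVILY_SEARCH_API_KEY:-}
```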

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–25 minutes

  • The e2e_tests.yaml workflow introduces conditional GCP credential handling and mode-based configuration path logic that requires careful verification of environment variable propagation across jobs
  • vertexai-run.yaml and run-vertexai.yaml contain significant structural reorganization and new nested configuration schemas that need validation against the expected runtime behavior
  • Multiple coordinated environment variable additions across Docker compose and CI workflow should be cross-referenced for consistency
  • Default model fallback change in environment.py should be verified against test expectations

Possibly related PRs

  • PR #906: Modifies e2e_tests workflow to replace mode-based CONFIG_MODE with CONFIG_ENVIRONMENT, directly intersecting with this PR's mode-based configuration path changes
  • PR #654: Adds analogous cloud provider integration (Azure) with similar changes to CI workflow, Docker compose, documentation, and example configurations
  • PR #864: Reworks e2e_tests to support mode-specific configuration loading and environment variable propagation, overlapping with config path restructuring in this PR

Suggested reviewers

  • radofuchs
  • tisnik

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly summarizes the main objective: adding full VertexAI LLM provider support to Lightspeed core, which is directly reflected in the changeset across workflow configs, documentation, examples, and e2e tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%. |

@are-ces are-ces (Contributor, Author) commented Dec 16, 2025

The e2e tests fail here because of a missing file; PTAL at the test results in my fork.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
README.md (1)

179-181: Consider using descriptive link text in the footnote.

The link text "here" is not descriptive, which can reduce accessibility and clarity. Consider using more meaningful text like "the llama-stack source code".

```diff
-[^1]: List of models is limited by design in llama-stack, future versions will probably allow to use more models (see [here](https://github.com/llamastack/llama-stack/blob/release-0.3.x/llama_stack/providers/remote/inference/vertexai/vertexai.py#L54))
+[^1]: List of models is limited by design in llama-stack, future versions will probably allow to use more models (see [llama-stack vertexai provider](https://github.com/llamastack/llama-stack/blob/release-0.3.x/llama_stack/providers/remote/inference/vertexai/vertexai.py#L54))
```
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db34bd5 and b825413.

📒 Files selected for processing (10)
  • .github/workflows/e2e_tests.yaml (5 hunks)
  • README.md (2 hunks)
  • docker-compose-library.yaml (1 hunks)
  • docker-compose.yaml (1 hunks)
  • docs/providers.md (1 hunks)
  • examples/vertexai-run.yaml (1 hunks)
  • tests/e2e/configs/run-vertexai.yaml (1 hunks)
  • tests/e2e/configuration/library-mode/lightspeed-stack.yaml (1 hunks)
  • tests/e2e/configuration/server-mode/lightspeed-stack.yaml (1 hunks)
  • tests/e2e/features/environment.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest-mock with AsyncMock objects for mocking in tests

Files:

  • tests/e2e/features/environment.py
🧠 Learnings (3)
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

  • tests/e2e/configuration/server-mode/lightspeed-stack.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack.yaml
📚 Learning: 2025-09-02T11:15:02.411Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/test_list.txt:2-3
Timestamp: 2025-09-02T11:15:02.411Z
Learning: In the lightspeed-stack e2e tests, the Authorized tag is intentionally omitted from noop authentication tests because they are designed to test against the default lightspeed-stack.yaml configuration rather than the specialized noop-with-token configuration.

Applied to files:

  • tests/e2e/configuration/server-mode/lightspeed-stack.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack.yaml
📚 Learning: 2025-08-19T08:57:27.714Z
Learnt from: onmete
Repo: lightspeed-core/lightspeed-stack PR: 417
File: src/lightspeed_stack.py:60-63
Timestamp: 2025-08-19T08:57:27.714Z
Learning: In the lightspeed-stack project, file permission hardening (chmod 0o600) for stored configuration JSON files is not required as it's not considered a security concern in their deployment environment.

Applied to files:

  • tests/e2e/configuration/library-mode/lightspeed-stack.yaml
🪛 markdownlint-cli2 (0.18.1)
README.md

124-124: Bare URL used

(MD034, no-bare-urls)


181-181: Link text should be descriptive

(MD059, descriptive-link-text)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (18)
docs/providers.md (1)

57-57: LGTM!

The VertexAI provider entry is correctly added to the Inference Providers table with the appropriate dependency (google-auth) and support status (✅). This aligns with the broader PR changes introducing VertexAI support.

README.md (1)

124-124: LGTM!

The VertexAI provider entry is correctly added to the prerequisites table, consistent with existing entries.

docker-compose.yaml (2)

13-13: LGTM!

The GCP keys volume mount is appropriately configured as read-only with a sensible default to a dummy directory for non-VertexAI environments.


31-34: The GOOGLE_APPLICATION_CREDENTIALS environment variable is already correctly configured to point to the container path /opt/app-root/.gcp-keys/gcp-key.json. The docker-compose.yaml volume mount (${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}:/opt/app-root/.gcp-keys:ro) properly maps the host credentials directory to the container path, and the e2e workflow (line 130) explicitly sets the variable to the container path before passing it to docker-compose. No changes needed.

Likely an incorrect or invalid review comment.

docker-compose-library.yaml (2)

17-36: LGTM!

The environment variables are consistently structured with appropriate defaults. The VertexAI variables follow the same pattern as other providers.


15-15: Verify that GOOGLE_APPLICATION_CREDENTIALS is set to /opt/app-root/.gcp-keys in the environment configuration. While the volume mounts to a different path than the container's WORKDIR, this is not inherently problematic—what matters is that the environment variable explicitly points to the mounted credentials file path.

examples/vertexai-run.yaml (4)

1-1: LGTM!

The version 2 configuration schema is appropriate for the current llama-stack integration.


76-81: LGTM!

The VertexAI inference provider is correctly configured with environment variable templating for project and location, which aligns with the docker-compose environment variable definitions.
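Based on the described templating, the provider block presumably looks roughly like this sketch (the provider_id and exact config field names are assumptions following llama-stack's remote-provider conventions):

```yaml
# Rough sketch of the VertexAI inference provider entry; provider_id and
# exact field names are assumptions based on llama-stack conventions.
providers:
  inference:
    - provider_id: vertexai
      provider_type: remote::vertexai
      config:
        project: ${env.VERTEX_AI_PROJECT}
        location: ${env.VERTEX_AI_LOCATION}
```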


111-133: LGTM!

The storage configuration is well-structured with appropriate sqlite backends for both key-value and SQL stores. The store mappings for metadata, inference, conversations, and prompts are properly defined.
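For orientation, a hedged sketch of what such a storage section can look like in a version-2 config (backend types, namespaces, table names, and paths here are assumptions, not the file's actual values):

```yaml
# Hypothetical storage section with sqlite-backed KV and SQL stores;
# all names and paths below are illustrative assumptions.
storage:
  backends:
    kv_default:
      type: kv_sqlite
      db_path: /tmp/lightspeed/kvstore.db
    sql_default:
      type: sql_sqlite
      db_path: /tmp/lightspeed/sqlstore.db
  stores:
    metadata:
      backend: kv_default
      namespace: registry
    inference:
      backend: sql_default
      table_name: inference_store
    conversations:
      backend: sql_default
      table_name: openai_conversations
    prompts:
      backend: kv_default
      namespace: prompts
```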


134-140: Remove: The empty models array is intentional and consistent with repository patterns.

After examining the codebase, no other example configurations in the repository contain a registered_resources section at all. The pattern in vertexai-run.yaml—with all resource arrays (models, shields, vector_dbs, datasets, scoring_fns, benchmarks) empty except for tool_groups—represents an intentional default configuration template. The assumption that "other example configurations typically pre-register models" does not hold in this repository. No change or comment is required.

Likely an incorrect or invalid review comment.

tests/e2e/features/environment.py (2)

58-58: Good addition - comment improves clarity.

The comment effectively explains the purpose of the model detection logic that follows.


70-75: Dynamic messaging improvement approved; model change verified.

The dynamic fallback message is more informative. GPT-4o Mini is roughly 53.3x cheaper than GPT-4 Turbo for input and output tokens, making this a sound cost optimization.

.github/workflows/e2e_tests.yaml (4)

13-13: LGTM - VertexAI added to test matrix.

The addition of vertexai to the environment matrix enables comprehensive testing across all supported providers.


52-65: LGTM - improved config path structure.

The restructuring to ${MODE}-mode/lightspeed-stack.yaml provides better organization through directory-based mode separation.
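A minimal sketch of how this mode-based selection can be wired in the workflow (the step and variable names are assumptions; the `${MODE}-mode` directory layout matches the files listed in the walkthrough):

```yaml
# Hypothetical config-selection step; names are assumptions, the
# ${MODE}-mode directory layout is from this PR.
- name: Select lightspeed-stack configuration
  run: |
    CONFIG_FILE="tests/e2e/configuration/${MODE}-mode/lightspeed-stack.yaml"
    echo "Using config: ${CONFIG_FILE}"
    echo "CONFIG_FILE=${CONFIG_FILE}" >> "$GITHUB_ENV"
```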


93-131: Well-implemented VertexAI credentials handling.

The service account key setup includes good practices:

  • Base64 encoding prevents GitHub from modifying the key
  • JSON validation catches malformed keys early
  • Clear error messages for debugging

The implementation correctly sets both GOOGLE_APPLICATION_CREDENTIALS and GCP_KEYS_PATH for downstream use.
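As an illustration, a credential step with these properties could look like the following sketch (the secret name and step layout are assumptions; the paths and exported variables are the ones referenced elsewhere in this review):

```yaml
# Hypothetical GCP credential setup; the secret name is an assumption,
# the paths and exported variables appear elsewhere in this PR.
- name: Set up GCP service account key
  if: matrix.environment == 'vertexai'
  run: |
    mkdir -p ./tmp/.gcp-keys
    # Base64-decode the secret so GitHub's secret handling cannot alter the JSON
    echo "${{ secrets.GCP_SERVICE_ACCOUNT_KEY_B64 }}" | base64 -d > ./tmp/.gcp-keys/gcp-key.json
    # Validate the JSON early and fail with a clear message
    jq empty ./tmp/.gcp-keys/gcp-key.json || { echo "❌ Invalid GCP key JSON"; exit 1; }
    echo "GCP_KEYS_PATH=./tmp/.gcp-keys" >> "$GITHUB_ENV"
    echo "GOOGLE_APPLICATION_CREDENTIALS=/opt/app-root/.gcp-keys/gcp-key.json" >> "$GITHUB_ENV"
```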


194-240: LGTM - VertexAI environment variables properly propagated.

The VertexAI-related environment variables (VERTEX_AI_LOCATION, VERTEX_AI_PROJECT, GOOGLE_APPLICATION_CREDENTIALS) are correctly propagated to both server and library mode runs, ensuring consistent configuration across deployment modes.

Note: The GCP_KEYS_PATH issue for non-VertexAI environments is addressed in the previous comment.

tests/e2e/configs/run-vertexai.yaml (2)

1-143: Well-structured VertexAI configuration.

The configuration file is comprehensive and properly structured, including:

  • All required API modules and providers
  • Appropriate storage backend configurations
  • Proper namespace and path definitions
  • Complete registered resources sections

The file provides a solid foundation for VertexAI end-to-end testing.


76-81: Configuration syntax is correct.

The ${env.VERTEX_AI_PROJECT} and ${env.VERTEX_AI_LOCATION} syntax aligns with llama-stack's documented environment variable substitution format.

Comment on lines +187 to +192
```yaml
- name: Create dummy GCP keys directory
  if: matrix.environment != 'vertexai'
  run: |
    echo "Creating dummy GCP keys directory for non-VertexAI environment..."
    mkdir -p ./tmp/.gcp-keys-dummy
    echo "✅ Dummy directory created."
```
@coderabbitai coderabbitai bot commented Dec 16, 2025

⚠️ Potential issue | 🔴 Critical

Critical: GCP_KEYS_PATH not set for non-VertexAI environments.

The dummy directory is created, but the GCP_KEYS_PATH environment variable is never set for non-VertexAI environments (ci, azure). This causes GCP_KEYS_PATH to be undefined when passed to docker-compose at lines 202 and 228, which may lead to volume mount failures.

Apply this fix:

```diff
 - name: Create dummy GCP keys directory
   if: matrix.environment != 'vertexai'
   run: |
     echo "Creating dummy GCP keys directory for non-VertexAI environment..."
     mkdir -p ./tmp/.gcp-keys-dummy
+    echo "GCP_KEYS_PATH=./tmp/.gcp-keys-dummy" >> $GITHUB_ENV
     echo "✅ Dummy directory created."
```
🤖 Prompt for AI Agents
.github/workflows/e2e_tests.yaml around lines 187 to 192: the dummy GCP keys
directory is created for non-VertexAI environments but GCP_KEYS_PATH is not set,
so downstream steps (docker-compose) receive an undefined path; update the
workflow to export GCP_KEYS_PATH pointing to the created dummy directory so
subsequent steps see it (for example, append
"GCP_KEYS_PATH=./tmp/.gcp-keys-dummy" to $GITHUB_ENV or set it in the step's
env) ensuring the variable is available to later steps that use it for
docker-compose volume mounts.

are-ces (Contributor, Author) replied:
This is addressed by setting a fallback GCP_KEYS_PATH in the docker-compose.yaml

@coderabbitai coderabbitai bot replied:
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Verify the GCP_KEYS_PATH fallback in docker-compose files
echo "=== Checking docker-compose.yaml ==="
grep -n "GCP_KEYS_PATH" docker-compose.yaml

echo -e "\n=== Checking docker-compose-library.yaml ==="
grep -n "GCP_KEYS_PATH" docker-compose-library.yaml
```

Length of output: 238


@are-ces You're absolutely right! I can see both docker-compose.yaml and docker-compose-library.yaml use the syntax ${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}, which provides a fallback value when the environment variable is unset. This means the volume mount will work correctly for non-VertexAI environments even without explicitly setting GCP_KEYS_PATH in the workflow.

My original concern was incorrect—the code already handles this case properly. Thank you for the clarification! 👍

@tisnik tisnik left a comment

LGTM

@radofuchs radofuchs left a comment

LGTM
