@nv-tusharma nv-tusharma commented Sep 3, 2025

Overview:

Recent changes to Dockerfile.vllm deleted the dev stage in favor of a local-dev stage, to align with the .devcontainer setup. However, to keep parity with the other backends, we decided to keep the dev stage rather than local-dev. This PR updates the Dockerfile to use dev as the default stage instead of local-dev.

e432ae4

Details:

  • Use dev as default stage instead of local-dev.
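
For reviewers who want to see what "dev as default stage" means in practice, here is a minimal sketch of a helper that assembles a `docker build` command selecting one stage. Only the stage name `dev` and the path `container/Dockerfile.vllm` come from this PR; the function name, the image tag, and the idea of calling Docker directly (rather than going through `container/build.sh`) are assumptions made for illustration.

```python
import shlex

# "dev" is the default stage named in this PR; the rest is illustrative.
DEFAULT_TARGET = "dev"


def docker_build_argv(target=DEFAULT_TARGET,
                      dockerfile="container/Dockerfile.vllm",
                      tag="dynamo:example"):  # hypothetical image tag
    """Return an argv list for `docker build` that selects a single stage."""
    if not target:
        raise ValueError("a build stage name is required")
    return [
        "docker", "build",
        "--file", dockerfile,
        "--target", target,  # picks the multi-stage build stage to stop at
        "--tag", tag,
        ".",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(docker_build_argv()))
```

Passing `target="local-dev"` would select the other stage, assuming it is still defined in the Dockerfile.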

Where should the reviewer start?

  • .devcontainer/README.md
  • .devcontainer/devcontainer.json
  • container/Dockerfile.vllm
  • container/build.sh

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Integrated prefill path for vLLM generation to enable faster first-token response and KV transfer for subsequent decoding.
  • Documentation

    • Introduced a comprehensive installation guide and updated links across guides.
    • Added API reference generation notes and prerequisites for profiling/SLA planner.
    • Streamlined Helm docs, replacing legacy deployment instructions with chart-focused guidance.
  • Chores

    • Bumped Helm chart versions to 0.5.0.
    • Updated devcontainer image tag and build target naming.
    • Adjusted platform defaults to support insecure etcd images and legacy repository.
    • Removed deprecated Helm deploy scripts and sample values.

coderabbitai bot commented Sep 3, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8d54eb7 and 03f9a5d.

📒 Files selected for processing (33)
  • .devcontainer/README.md (1 hunks)
  • .devcontainer/devcontainer.json (1 hunks)
  • benchmarks/profiler/README.md (1 hunks)
  • components/backends/vllm/src/dynamo/vllm/handlers.py (2 hunks)
  • container/Dockerfile.vllm (1 hunks)
  • container/build.sh (1 hunks)
  • deploy/README.md (0 hunks)
  • deploy/README.md (1 hunks)
  • deploy/cloud/helm/README.md (1 hunks)
  • deploy/cloud/helm/crds/Chart.yaml (1 hunks)
  • deploy/cloud/helm/crds/README.md (1 hunks)
  • deploy/cloud/helm/deploy.sh (0 hunks)
  • deploy/cloud/helm/dynamo-platform-values.yaml (0 hunks)
  • deploy/cloud/helm/network-config-wizard.sh (0 hunks)
  • deploy/cloud/helm/platform/README.md (1 hunks)
  • deploy/cloud/helm/platform/README.md.gotmpl (1 hunks)
  • deploy/cloud/helm/platform/values.yaml (1 hunks)
  • deploy/cloud/operator/Makefile (1 hunks)
  • deploy/cloud/operator/README.md (1 hunks)
  • deploy/cloud/operator/docs/header.md (1 hunks)
  • deploy/helm/chart/Chart.yaml (1 hunks)
  • docs/architecture/sla_planner.md (1 hunks)
  • docs/benchmarks/pre_deployment_profiling.md (1 hunks)
  • docs/guides/dynamo_deploy/README.md (3 hunks)
  • docs/guides/dynamo_deploy/api_reference.md (1 hunks)
  • docs/guides/dynamo_deploy/dynamo_operator.md (2 hunks)
  • docs/guides/dynamo_deploy/gke_setup.md (1 hunks)
  • docs/guides/dynamo_deploy/grove.md (1 hunks)
  • docs/guides/dynamo_deploy/installation_guide.md (4 hunks)
  • docs/guides/dynamo_deploy/metrics.md (1 hunks)
  • docs/guides/dynamo_deploy/minikube.md (1 hunks)
  • docs/guides/dynamo_deploy/sla_planner_deployment.md (1 hunks)
  • docs/index.rst (1 hunks)
💤 Files with no reviewable changes (3)
  • deploy/cloud/helm/dynamo-platform-values.yaml
  • deploy/cloud/helm/deploy.sh
  • deploy/cloud/helm/network-config-wizard.sh
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
📚 Learning: 2025-07-18T16:04:31.771Z
Learnt from: julienmancuso
PR: ai-dynamo/dynamo#2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:92-98
Timestamp: 2025-07-18T16:04:31.771Z
Learning: CRD schemas in files like deploy/cloud/helm/crds/templates/*.yaml are auto-generated from Kubernetes library upgrades and should not be manually modified as changes would be overwritten during regeneration.

Applied to files:

  • deploy/cloud/helm/crds/README.md
  • deploy/cloud/helm/README.md
  • deploy/cloud/helm/crds/Chart.yaml
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.

Applied to files:

  • .devcontainer/devcontainer.json
  • .devcontainer/README.md
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.

Applied to files:

  • .devcontainer/devcontainer.json
  • .devcontainer/README.md
  • container/Dockerfile.vllm
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project's devcontainer setup, hard-coded container names in devcontainer.json files serve as templates that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.

Applied to files:

  • .devcontainer/devcontainer.json
  • .devcontainer/README.md
📚 Learning: 2025-09-03T01:10:12.599Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2822
File: container/Dockerfile.vllm:343-352
Timestamp: 2025-09-03T01:10:12.599Z
Learning: In the dynamo project's local-dev Docker targets, USER_UID and USER_GID build args are intentionally left without default values to force explicit UID/GID mapping during build time, preventing file permission issues in local development environments where container users need to match host user permissions for mounted volumes.

Applied to files:

  • .devcontainer/devcontainer.json
  • .devcontainer/README.md
  • container/build.sh
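
The last learning above (no defaults for USER_UID and USER_GID, so the mapping must be explicit) can be illustrated with a small sketch. This is not the logic in `container/build.sh`; the function name is hypothetical, and the only parts taken from the review are the two build-arg names and the intent of refusing to fall back to a default.

```python
import os


def uid_gid_build_args(uid=None, gid=None):
    """Return `--build-arg` flags for USER_UID/USER_GID, refusing to guess.

    Mirrors the intent described in the learning: there is no default value,
    so the caller has to supply both IDs explicitly.
    """
    if uid is None or gid is None:
        raise ValueError("USER_UID and USER_GID must be passed explicitly")
    return [
        "--build-arg", f"USER_UID={uid}",
        "--build-arg", f"USER_GID={gid}",
    ]


if __name__ == "__main__":
    # os.getuid/os.getgid are POSIX-only; they give the host user's IDs so
    # files written to mounted volumes keep matching ownership.
    print(uid_gid_build_args(os.getuid(), os.getgid()))
```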
🪛 LanguageTool
deploy/cloud/helm/platform/README.md

[grammar] ~104-~104: There might be a mistake here.
Context: ...namo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md) - [NATS Documentation](https://docs.nats.io...

(QB_NEW_EN)

docs/guides/dynamo_deploy/dynamo_operator.md

[grammar] ~41-~41: There might be a mistake here.
Context: ... Cloud](installation_guide.md) installed - [FluxCD](https://fluxcd.io/flux/installat...

(QB_NEW_EN)

docs/guides/dynamo_deploy/README.md

[grammar] ~45-~45: There might be a mistake here.
Context: ... | Backend | Available Configurations | |---------|--------------------------| |...

(QB_NEW_EN)


[grammar] ~46-~46: There might be a mistake here.
Context: ...| |---------|--------------------------| | **[vLLM](/components/backends/vllm/dep...

(QB_NEW_EN)


[grammar] ~47-~47: There might be a mistake here.
Context: ...ated + Router, Disaggregated + Planner | | **[SGLang](/components/backends/sglang...

(QB_NEW_EN)


[grammar] ~48-~48: There might be a mistake here.
Context: ...ed + Planner, Disaggregated Multi-node | | **[TensorRT-LLM](/components/backends/...

(QB_NEW_EN)


[grammar] ~82-~82: There might be a mistake here.
Context: ...cations for DynamoGraphDeployment and DynamoComponentDeployment - **[Operator Guide](/docs/guides/dynamo_depl...

(QB_NEW_EN)


[grammar] ~83-~83: There might be a mistake here.
Context: ...mo operator configuration and management - **[Create Deployment](/docs/guides/dynamo_d...

(QB_NEW_EN)

docs/guides/dynamo_deploy/installation_guide.md

[grammar] ~24-~24: There might be a mistake here.
Context: ...uick Start Paths Platform is installed using Dynamo Kubernetes Platform [helm chart]...

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and Test - vllm
  • GitHub Check: Build and Test - dynamo

Walkthrough

Renamed local development image/target from local-dev to dev across devcontainer and build scripts. Updated Helm charts/docs, removed legacy deployment scripts/values, and revised installation guides/links. Bumped Helm versions. Added vLLM prefill integration guarded by can_prefill and improved error handling. Minor etcd image/config updates in platform values.

Changes

Cohort / File(s) and Summary of Changes

  • Devcontainer and build targets
    Files: .devcontainer/README.md, .devcontainer/devcontainer.json, container/Dockerfile.vllm, container/build.sh
    Summary: Switched image/tag and build stage from local-dev to dev; updated docs and USER_UID/GID injection condition.
  • vLLM prefill integration
    Files: components/backends/vllm/src/dynamo/vllm/handlers.py
    Summary: Added can_prefill=0 fallback, broader exception logging, and a prefill path: single-token prefill, kv transfer params propagation, then normal generate flow.
  • Helm cloud deployment restructuring
    Files: deploy/cloud/helm/README.md, deploy/cloud/helm/deploy.sh (removed), deploy/cloud/helm/dynamo-platform-values.yaml (removed), deploy/cloud/helm/network-config-wizard.sh (removed), deploy/cloud/helm/crds/Chart.yaml, deploy/cloud/helm/crds/README.md, deploy/cloud/helm/platform/README.md, deploy/cloud/helm/platform/README.md.gotmpl, deploy/cloud/helm/platform/values.yaml
    Summary: Consolidated docs to platform/crds charts; removed legacy deploy scripts/values/wizard; bumped CRDs chart to 0.5.0; updated links; added etcd insecure images setting and legacy repo.
  • Top-level Helm chart metadata
    Files: deploy/helm/chart/Chart.yaml
    Summary: Bumped version and appVersion from 0.4.1 to 0.5.0.
  • Operator docs build and headers
    Files: deploy/cloud/operator/Makefile, deploy/cloud/operator/README.md, deploy/cloud/operator/docs/header.md
    Summary: Adjusted API docs output path and post-process to docs/guides; added Install link; replaced header with auto-generated warnings.
  • Docs: installation flow and links
    Files: deploy/README.md, benchmarks/profiler/README.md, docs/architecture/sla_planner.md, docs/benchmarks/pre_deployment_profiling.md, docs/guides/dynamo_deploy/README.md, docs/guides/dynamo_deploy/api_reference.md, docs/guides/dynamo_deploy/dynamo_operator.md, docs/guides/dynamo_deploy/gke_setup.md, docs/guides/dynamo_deploy/grove.md, docs/guides/dynamo_deploy/installation_guide.md, docs/guides/dynamo_deploy/metrics.md, docs/guides/dynamo_deploy/minikube.md, docs/guides/dynamo_deploy/sla_planner_deployment.md, docs/index.rst
    Summary: Updated installation to Helm-based steps with RELEASE_VERSION; converted links to new locations/absolute paths; added profiling notes; adjusted wording and references; added SPDX/warnings.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Client
    participant Handler as vLLM Handler
    participant Prefill as Prefill Worker
    participant Decoder as Decoder/Generator

    Client->>Handler: generate(request, sampling_params)
    alt can_prefill == 1
        Note over Handler: Build single-token prefill params<br/>extra_args.kv_transfer_params.do_remote_decode=true
        Handler->>Prefill: round_robin(prefill_request)
        Prefill-->>Handler: prefill_response (kv_transfer_params)
        Note over Handler: Merge kv_transfer_params into sampling_params.extra_args
    else can_prefill == 0
        Note over Handler: Skip prefill
    end
    Handler->>Decoder: generate(request, sampling_params)
    Decoder-->>Handler: tokens/outputs
    Handler-->>Client: MyRequestOutput
    opt Errors
        Note over Handler: Logs exceptions (including CancelledError)
    end
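
The flow in the diagram above can be sketched in Python roughly as follows. This is not the code in `handlers.py`: plain dicts stand in for vLLM sampling parameters, and `StubPrefillWorker`, `StubDecoder`, and the `remote_block_ids` payload are hypothetical stand-ins. Only the `can_prefill` guard, the single-token prefill, the `do_remote_decode` flag, the `round_robin` call, and the merge of `kv_transfer_params` into `extra_args` are taken from the walkthrough.

```python
import asyncio
import copy


class StubPrefillWorker:
    """Stand-in for a remote prefill worker; returns made-up KV transfer params."""

    async def round_robin(self, prefill_request):
        # The prefill request is expected to ask for exactly one token.
        assert prefill_request["sampling_params"]["max_tokens"] == 1
        return {"kv_transfer_params": {"remote_block_ids": [0, 1]}}


class StubDecoder:
    """Stand-in for the decode/generate engine."""

    async def generate(self, request, sampling_params):
        # Echo extra_args back so the caller can see what reached the decoder.
        return {"tokens": [1, 2, 3],
                "extra_args": sampling_params.get("extra_args", {})}


async def generate(request, sampling_params, prefill, decoder, can_prefill):
    if can_prefill:
        # Build single-token prefill params flagged for remote decode.
        prefill_params = copy.deepcopy(sampling_params)
        prefill_params["max_tokens"] = 1
        prefill_params.setdefault("extra_args", {})["kv_transfer_params"] = {
            "do_remote_decode": True
        }
        response = await prefill.round_robin(
            {"request": request, "sampling_params": prefill_params}
        )
        # Carry the returned KV transfer params into the decode request.
        sampling_params = copy.deepcopy(sampling_params)
        sampling_params.setdefault("extra_args", {})["kv_transfer_params"] = (
            response["kv_transfer_params"]
        )
    # With can_prefill == 0 the prefill step is skipped entirely.
    return await decoder.generate(request, sampling_params)


if __name__ == "__main__":
    result = asyncio.run(
        generate("hello", {"max_tokens": 16}, StubPrefillWorker(), StubDecoder(), 1)
    )
    print(result)
```

Copying the sampling parameters before mutating them keeps the caller's object unchanged, which is a design choice of this sketch rather than something the PR states.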

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A rabbit taps helm charts with care,
Snips old scripts, fresh guides to share.
Prefill hops first, one token to see,
Then bounds to decode—swift as can be.
Etcd burrows to legacy nook,
Version bumps penned in the book.
Ship it—thump-thump—on we look! 🐇🚀



Signed-off-by: Tushar Sharma <tusharma@nvidia.com>