fix: Address QA issues + product feedback #2202

athreesh · 2025-07-30T23:03:00Z

This PR addresses QA issues and triggers PR refresh.

Changes:

Changes to address QA issues
Trigger PR refresh

Ready for review and merge into main.

Summary by CodeRabbit

New Features
- Added advanced port allocation and reservation utilities for distributed services, supporting configurable port ranges and block allocation for parallelism.
Bug Fixes
- Fixed and improved relative documentation links in feature support matrices for various framework backends.
Documentation
- Reorganized and clarified README files, moving support matrices and framework links for better visibility.
- Updated SGLang installation instructions to include required pre-release dependencies.
- Removed duplicate sections and improved formatting in example documentation.
Chores
- Updated Dockerfiles to refine Python package installation steps and environment variable settings.

Co-authored-by: Anant Sharma <anants@nvidia.com> Co-authored-by: Ishan Dhanani <idhanani@nvidia.com>

copy-pr-bot · 2025-07-30T23:03:04Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Resolved conflict in container/Dockerfile.sglang by keeping the specific flashinfer-python==0.2.9rc2 installation approach from the feature branch.

coderabbitai · 2025-07-30T23:07:13Z

Caution

Review failed

Failed to post review comments.

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57482dc and aa42f9d.

📒 Files selected for processing (9)

README.md (2 hunks)
components/backends/sglang/README.md (1 hunks)
components/backends/trtllm/README.md (1 hunks)
components/backends/vllm/README.md (1 hunks)
components/backends/vllm/src/dynamo/vllm/args.py (5 hunks)
components/backends/vllm/src/dynamo/vllm/ports.py (1 hunks)
container/Dockerfile.sglang (1 hunks)
container/Dockerfile.sglang-wideep (1 hunks)
examples/README.md (2 hunks)

🧰 Additional context used

🧠 Learnings (7)

📓 Common learnings

Learnt from: biswapanda
PR: ai-dynamo/dynamo#1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.

Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.

components/backends/vllm/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

container/Dockerfile.sglang (1)

Learnt from: grahamking
PR: #1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.

components/backends/trtllm/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

components/backends/sglang/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

examples/README.md (1)

Learnt from: PeaBrane
PR: #1409
File: examples/router_standalone/worker.py:171-186
Timestamp: 2025-06-08T08:30:45.126Z
Learning: Example code in the examples/ directory may intentionally use hard-coded values or simplified implementations that wouldn't be appropriate for production code, but are acceptable for demonstration and testing purposes.

README.md (2)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

Learnt from: biswapanda
PR: #1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.

🧬 Code Graph Analysis (1)

components/backends/vllm/src/dynamo/vllm/args.py (1)

components/backends/vllm/src/dynamo/vllm/ports.py (7)

DynamoPortRange (26-40)

EtcdContext (44-53)

PortAllocationRequest (78-85)

PortMetadata (57-74)

allocate_and_reserve_port (229-260)

allocate_and_reserve_port_block (142-226)

get_host_ip (263-290)

🪛 markdownlint-cli2 (0.17.2)

README.md

171-171: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (16)

components/backends/trtllm/README.md (1)

52-57: Relative links now resolve correctly – nice catch
All updated paths point three levels up, which is the correct root distance from components/backends/trtllm/README.md.

components/backends/sglang/README.md (1)

37-42: Documentation links fixed
The extra ../ aligns the paths with the file’s depth; links render correctly.

components/backends/vllm/README.md (1)

38-43: LGTM – path correction matches other back-end READMEs
Links now resolve without 404s.

examples/README.md (1)

79-79: LGTM – duplication removed cleanly

Moving the “Kubernetes cluster” prerequisite into the main bullet list eliminates the earlier duplication without altering meaning. No further action needed.

README.md (1)

24-24: Nice touch – prominent Support Matrix link

Surfacing the Support-Matrix link in the top nav improves discoverability.

components/backends/vllm/src/dynamo/vllm/ports.py (7)

25-41: LGTM!

The DynamoPortRange class properly validates port ranges with appropriate constraints for registered ports.

43-54: LGTM!

The EtcdContext class provides a clean abstraction for ETCD operations with properly formatted keys.

56-75: LGTM!

The PortMetadata class provides comprehensive metadata tracking for port reservations with useful debugging information.

88-114: LGTM!

The hold_ports context manager correctly manages socket bindings with proper cleanup.

126-140: LGTM!

The reserve_port_in_etcd function correctly reserves ports with atomic create operations and proper lease management.

229-261: LGTM!

Clean wrapper implementation that effectively reuses the block allocation logic for single ports.

263-291: LGTM!

Robust host IP detection with comprehensive error handling and fallback strategies.

components/backends/vllm/src/dynamo/vllm/args.py (4)

15-26: LGTM!

Appropriate imports from the new ports module for the refactored port management functionality.

44-44: LGTM!

Adding port_range to the Config class provides better type safety and validation for port configuration.

75-86: LGTM!

Well-structured CLI arguments for port range configuration with clear help text and sensible defaults.

Also applies to: 133-136

249-257: LGTM!

The function correctly uses the imported get_host_ip utility and properly configures NIXL environment variables.

Walkthrough

This update reorganizes and clarifies documentation for NVIDIA Dynamo and its backends, adjusts SGLang installation instructions, and introduces a new modular port allocation utility for vLLM. The vLLM backend now delegates port management to a shared module, supporting block allocation and explicit port ranges. Dockerfiles and example documentation are also refined.

Changes

Cohort / File(s)	Change Summary
Main Documentation Reorganization `README.md`, `examples/README.md`	Reorganizes framework support matrix and related links, consolidates duplicated sections, and improves visibility of framework-specific deployment information.
SGLang Installation and Dockerfile Updates `README.md`, `container/Dockerfile.sglang`	Updates SGLang install instructions to require a pre-release Python package, changes a system dependency name, and adjusts Dockerfile pip install steps.
SGLang Dockerfile Editable Install Removal `container/Dockerfile.sglang-wideep`	Changes package installation from editable to standard mode.
Backend Feature Matrix Link Fixes `components/backends/sglang/README.md`, `components/backends/trtllm/README.md`, `components/backends/vllm/README.md`	Fixes relative documentation links in feature support matrices for all backends.
vLLM Port Allocation Refactor `components/backends/vllm/src/dynamo/vllm/args.py`	Refactors port allocation to use a new shared utility, adds port range configuration, supports block allocation, and removes internal allocation logic.
New Port Allocation Utility Module `components/backends/vllm/src/dynamo/vllm/ports.py`	Introduces a module for robust port allocation and reservation with ETCD coordination, supporting block allocation, port range enforcement, and host IP resolution.

Sequence Diagram(s)

sequenceDiagram
    participant CLI/User
    participant vLLM Args Parser
    participant Ports Module
    participant ETCD

    CLI/User->>vLLM Args Parser: Launch vLLM with port range args
    vLLM Args Parser->>Ports Module: Request port/block allocation (with metadata)
    Ports Module->>Ports Module: Bind/check local ports
    Ports Module->>ETCD: Reserve port(s) with metadata
    ETCD-->>Ports Module: Confirm reservation
    Ports Module-->>vLLM Args Parser: Return allocated port(s)
    vLLM Args Parser-->>CLI/User: Update config with allocated ports

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related PRs

fix: add better port logic (#2175) #2192: Refactors vLLM port allocation logic and introduces the shared ports module, directly overlapping with this PR’s backend changes.
fix(container): update sgl dockerfile install commands #2194: Updates SGLang Dockerfile installation commands, matching the Dockerfile and install logic changes here.
docs: update readme install instructions #2170: Modifies SGLang installation instructions in the documentation, closely related to README changes in this PR.

Poem

In the warren of code, we tidy and mend,
Ports now reserved, with ETCD as friend.
Docs leap to the top, support clear and bright,
Dockerfiles pruned for a sleeker byte.
Rabbits rejoice, as clarity grows—
Hopping through changes, where good logic flows! 🐇✨

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

Remove the specific flashinfer-python installation and revert to the standard ai-dynamo[sglang] --pre installation from main branch.

nealvaidya · 2025-07-31T00:45:46Z

container/Dockerfile.sglang


 ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
-CMD []
+CMD []


Still some EOF issue

… at top

alec-flowers and others added 4 commits July 30, 2025 22:07

fix: add better port logic (#2175) (#2192)

992adfb

chore: fix install (#2191)

9a93f11

Co-authored-by: Anant Sharma <anants@nvidia.com> Co-authored-by: Ishan Dhanani <idhanani@nvidia.com>

changes to address QA issues

5d4b3ca

trigger PR refresh

aa42f9d

athreesh requested review from a team, GuanLuo, alec-flowers, biswapanda, grahamking, ishandhanani, nealvaidya, nnshah1, paulhendricks, piotrm-nvidia, ptarasiewiczNV, richardhuo-nv, rmccorm4, ryanolson, tanmayv25, tmonty12 and whoisj as code owners July 30, 2025 23:03

athreesh requested review from PeaBrane, jthomson04, kkranen and tedzhouhk as code owners July 30, 2025 23:03

pull-request-size bot added the size/XL label Jul 30, 2025

Merge main into anish/fix

760d406

Resolved conflict in container/Dockerfile.sglang by keeping the specific flashinfer-python==0.2.9rc2 installation approach from the feature branch.

pull-request-size bot removed the size/XL label Jul 30, 2025

pull-request-size bot added the size/M label Jul 30, 2025

athreesh changed the title ~~Fix: Address QA issues~~ fix: Address QA issues + product feedback Jul 30, 2025

github-actions bot added the fix label Jul 30, 2025

Revert Dockerfile.sglang changes to match main

dbb1eaf

Remove the specific flashinfer-python installation and revert to the standard ai-dynamo[sglang] --pre installation from main branch.

athreesh force-pushed the anish/fix branch from daf4e25 to dbb1eaf Compare July 31, 2025 00:41

same dockerfile as one in main

c8bd069

nealvaidya reviewed Jul 31, 2025

View reviewed changes

container/Dockerfile.sglang

ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]

CMD []

CMD []

Copy link

Contributor

nealvaidya Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some EOF issue

athreesh added 4 commits August 4, 2025 20:58

added containers & helm charts

d292048

fix merge conflict + edit exampels readme

fcc114d

Resolve merge conflict in examples/README.md - keep Framework Support…

52774a0

… at top

adding a readme for benchmarks folder

3e7186d

pull-request-size bot added size/L and removed size/M labels Aug 5, 2025

athreesh closed this Aug 27, 2025

athreesh deleted the anish/fix branch August 27, 2025 04:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Address QA issues + product feedback #2202

fix: Address QA issues + product feedback #2202

Uh oh!

athreesh commented Jul 30, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

copy-pr-bot bot commented Jul 30, 2025

Uh oh!

coderabbitai bot commented Jul 30, 2025 •

edited

Loading

Review failed

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

Uh oh!

nealvaidya Jul 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

fix: Address QA issues + product feedback #2202

fix: Address QA issues + product feedback #2202

Uh oh!

Conversation

athreesh commented Jul 30, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

copy-pr-bot bot commented Jul 30, 2025

Uh oh!

coderabbitai bot commented Jul 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

Uh oh!

nealvaidya Jul 31, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

athreesh commented Jul 30, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jul 30, 2025 •

edited

Loading