Skip to content

Conversation

@athreesh
Copy link
Contributor

@athreesh athreesh commented Jul 30, 2025

This PR addresses QA issues and triggers PR refresh.

Changes:

  • Changes to address QA issues
  • Trigger PR refresh

Ready for review and merge into main.

Summary by CodeRabbit

  • New Features

    • Added advanced port allocation and reservation utilities for distributed services, supporting configurable port ranges and block allocation for parallelism.
  • Bug Fixes

    • Fixed and improved relative documentation links in feature support matrices for various framework backends.
  • Documentation

    • Reorganized and clarified README files, moving support matrices and framework links for better visibility.
    • Updated SGLang installation instructions to include required pre-release dependencies.
    • Removed duplicate sections and improved formatting in example documentation.
  • Chores

    • Updated Dockerfiles to refine Python package installation steps and environment variable settings.

alec-flowers and others added 4 commits July 30, 2025 22:07
Co-authored-by: Anant Sharma <anants@nvidia.com>
Co-authored-by: Ishan Dhanani <idhanani@nvidia.com>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Jul 30, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Resolved conflict in container/Dockerfile.sglang by keeping the specific
flashinfer-python==0.2.9rc2 installation approach from the feature branch.
@athreesh athreesh changed the title Fix: Address QA issues fix: Address QA issues + product feedback Jul 30, 2025
@github-actions github-actions bot added the fix label Jul 30, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 30, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57482dc and aa42f9d.

📒 Files selected for processing (9)
  • README.md (2 hunks)
  • components/backends/sglang/README.md (1 hunks)
  • components/backends/trtllm/README.md (1 hunks)
  • components/backends/vllm/README.md (1 hunks)
  • components/backends/vllm/src/dynamo/vllm/args.py (5 hunks)
  • components/backends/vllm/src/dynamo/vllm/ports.py (1 hunks)
  • container/Dockerfile.sglang (1 hunks)
  • container/Dockerfile.sglang-wideep (1 hunks)
  • examples/README.md (2 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: biswapanda
PR: ai-dynamo/dynamo#1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.
Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
components/backends/vllm/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

container/Dockerfile.sglang (1)

Learnt from: grahamking
PR: #1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.

components/backends/trtllm/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

components/backends/sglang/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

examples/README.md (1)

Learnt from: PeaBrane
PR: #1409
File: examples/router_standalone/worker.py:171-186
Timestamp: 2025-06-08T08:30:45.126Z
Learning: Example code in the examples/ directory may intentionally use hard-coded values or simplified implementations that wouldn't be appropriate for production code, but are acceptable for demonstration and testing purposes.

README.md (2)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

Learnt from: biswapanda
PR: #1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.

🧬 Code Graph Analysis (1)
components/backends/vllm/src/dynamo/vllm/args.py (1)
components/backends/vllm/src/dynamo/vllm/ports.py (7)
  • DynamoPortRange (26-40)
  • EtcdContext (44-53)
  • PortAllocationRequest (78-85)
  • PortMetadata (57-74)
  • allocate_and_reserve_port (229-260)
  • allocate_and_reserve_port_block (142-226)
  • get_host_ip (263-290)
🪛 markdownlint-cli2 (0.17.2)
README.md

171-171: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (16)
components/backends/trtllm/README.md (1)

52-57: Relative links now resolve correctly – nice catch
All updated paths point three levels up, which is the correct root distance from components/backends/trtllm/README.md.

components/backends/sglang/README.md (1)

37-42: Documentation links fixed
The extra ../ aligns the paths with the file’s depth; links render correctly.

components/backends/vllm/README.md (1)

38-43: LGTM – path correction matches other back-end READMEs
Links now resolve without 404s.

examples/README.md (1)

79-79: LGTM – duplication removed cleanly

Moving the “Kubernetes cluster” prerequisite into the main bullet list eliminates the earlier duplication without altering meaning. No further action needed.

README.md (1)

24-24: Nice touch – prominent Support Matrix link

Surfacing the Support-Matrix link in the top nav improves discoverability.

components/backends/vllm/src/dynamo/vllm/ports.py (7)

25-41: LGTM!

The DynamoPortRange class properly validates port ranges with appropriate constraints for registered ports.


43-54: LGTM!

The EtcdContext class provides a clean abstraction for ETCD operations with properly formatted keys.


56-75: LGTM!

The PortMetadata class provides comprehensive metadata tracking for port reservations with useful debugging information.


88-114: LGTM!

The hold_ports context manager correctly manages socket bindings with proper cleanup.


126-140: LGTM!

The reserve_port_in_etcd function correctly reserves ports with atomic create operations and proper lease management.


229-261: LGTM!

Clean wrapper implementation that effectively reuses the block allocation logic for single ports.


263-291: LGTM!

Robust host IP detection with comprehensive error handling and fallback strategies.

components/backends/vllm/src/dynamo/vllm/args.py (4)

15-26: LGTM!

Appropriate imports from the new ports module for the refactored port management functionality.


44-44: LGTM!

Adding port_range to the Config class provides better type safety and validation for port configuration.


75-86: LGTM!

Well-structured CLI arguments for port range configuration with clear help text and sensible defaults.

Also applies to: 133-136


249-257: LGTM!

The function correctly uses the imported get_host_ip utility and properly configures NIXL environment variables.

Walkthrough

This update reorganizes and clarifies documentation for NVIDIA Dynamo and its backends, adjusts SGLang installation instructions, and introduces a new modular port allocation utility for vLLM. The vLLM backend now delegates port management to a shared module, supporting block allocation and explicit port ranges. Dockerfiles and example documentation are also refined.

Changes

Cohort / File(s) Change Summary
Main Documentation Reorganization
README.md, examples/README.md
Reorganizes framework support matrix and related links, consolidates duplicated sections, and improves visibility of framework-specific deployment information.
SGLang Installation and Dockerfile Updates
README.md, container/Dockerfile.sglang
Updates SGLang install instructions to require a pre-release Python package, changes a system dependency name, and adjusts Dockerfile pip install steps.
SGLang Dockerfile Editable Install Removal
container/Dockerfile.sglang-wideep
Changes package installation from editable to standard mode.
Backend Feature Matrix Link Fixes
components/backends/sglang/README.md, components/backends/trtllm/README.md, components/backends/vllm/README.md
Fixes relative documentation links in feature support matrices for all backends.
vLLM Port Allocation Refactor
components/backends/vllm/src/dynamo/vllm/args.py
Refactors port allocation to use a new shared utility, adds port range configuration, supports block allocation, and removes internal allocation logic.
New Port Allocation Utility Module
components/backends/vllm/src/dynamo/vllm/ports.py
Introduces a module for robust port allocation and reservation with ETCD coordination, supporting block allocation, port range enforcement, and host IP resolution.

Sequence Diagram(s)

sequenceDiagram
    participant CLI/User
    participant vLLM Args Parser
    participant Ports Module
    participant ETCD

    CLI/User->>vLLM Args Parser: Launch vLLM with port range args
    vLLM Args Parser->>Ports Module: Request port/block allocation (with metadata)
    Ports Module->>Ports Module: Bind/check local ports
    Ports Module->>ETCD: Reserve port(s) with metadata
    ETCD-->>Ports Module: Confirm reservation
    Ports Module-->>vLLM Args Parser: Return allocated port(s)
    vLLM Args Parser-->>CLI/User: Update config with allocated ports
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related PRs

Poem

In the warren of code, we tidy and mend,
Ports now reserved, with ETCD as friend.
Docs leap to the top, support clear and bright,
Dockerfiles pruned for a sleeker byte.
Rabbits rejoice, as clarity grows—
Hopping through changes, where good logic flows! 🐇✨

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Remove the specific flashinfer-python installation and revert to the
standard ai-dynamo[sglang] --pre installation from main branch.

ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
CMD []
CMD []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some EOF issue

@pull-request-size pull-request-size bot added size/L and removed size/M labels Aug 5, 2025
@athreesh athreesh closed this Aug 27, 2025
@athreesh athreesh deleted the anish/fix branch August 27, 2025 04:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants