fix: Address QA issues + product feedback #2202
Co-authored-by: Anant Sharma <anants@nvidia.com>
Co-authored-by: Ishan Dhanani <idhanani@nvidia.com>
Resolved conflict in container/Dockerfile.sglang by keeping the specific flashinfer-python==0.2.9rc2 installation approach from the feature branch.
markdownlint-cli2 (0.17.2)
README.md, line 171: Fenced code blocks should have a language specified (MD040, fenced-code-language)
Walkthrough
This update reorganizes and clarifies documentation for NVIDIA Dynamo and its backends, adjusts SGLang installation instructions, and introduces a new modular port allocation utility for vLLM. The vLLM backend now delegates port management to a shared module, supporting block allocation and explicit port ranges. Dockerfiles and example documentation are also refined.
Sequence Diagram(s)
sequenceDiagram
participant CLI/User
participant vLLM Args Parser
participant Ports Module
participant ETCD
CLI/User->>vLLM Args Parser: Launch vLLM with port range args
vLLM Args Parser->>Ports Module: Request port/block allocation (with metadata)
Ports Module->>Ports Module: Bind/check local ports
Ports Module->>ETCD: Reserve port(s) with metadata
ETCD-->>Ports Module: Confirm reservation
Ports Module-->>vLLM Args Parser: Return allocated port(s)
vLLM Args Parser-->>CLI/User: Update config with allocated ports
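The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `InMemoryReservations` is a hypothetical stand-in for the ETCD reservation step, and `port_is_free`, `allocate_port_block`, and all parameter names are assumptions made for the example.

```python
import socket
from dataclasses import dataclass, field


@dataclass
class InMemoryReservations:
    """Hypothetical stand-in for the ETCD reservation step in the diagram."""

    reserved: dict = field(default_factory=dict)

    def reserve(self, port, metadata):
        # Refuse a port that is already reserved; otherwise record its metadata.
        if port in self.reserved:
            return False
        self.reserved[port] = metadata
        return True

    def release(self, port):
        self.reserved.pop(port, None)


def port_is_free(port, host="127.0.0.1"):
    """Check a local port by briefly binding a TCP socket to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def allocate_port_block(store, port_min, port_max, block_size, metadata,
                        is_free=port_is_free):
    """Return the first contiguous block of free, unreserved ports
    within the inclusive range [port_min, port_max]."""
    for start in range(port_min, port_max - block_size + 2):
        block = list(range(start, start + block_size))
        # Step 1: bind/check every port in the candidate block locally.
        if not all(is_free(p) for p in block):
            continue
        # Step 2: reserve each port with metadata; roll back on any conflict.
        taken = []
        for p in block:
            if not store.reserve(p, metadata):
                break
            taken.append(p)
        if len(taken) == block_size:
            return block
        for p in taken:
            store.release(p)
    raise RuntimeError(f"no free block of {block_size} ports in "
                       f"[{port_min}, {port_max}]")
```

The `is_free` parameter is injectable so the local bind check can be replaced in tests; a real reservation backend would also need to make the reserve step atomic across processes, which this sketch does not attempt.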
Estimated code review effort: 4 (Complex), ~40 minutes
Remove the specific flashinfer-python installation and revert to the standard ai-dynamo[sglang] --pre installation from main branch.
ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
CMD []
CMD []
Still some EOF issue
This PR addresses QA issues and triggers PR refresh.
Changes:
Ready for review and merge into main.