feat: pre-built wheel for vllm image build #2487
Conversation
Walkthrough
The AMD64 path in install_vllm.sh now installs vLLM from a precompiled wheel tied to a commit, after installing PyTorch cu128. It sets and exports a wheel location variable, downloads the wheel to a temp dir, installs it with a precompiled flag, and cleans up. ARM64 remains unchanged.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant install_vllm.sh
    participant PyTorchRepo as PyTorch (cu128)
    participant WheelServer as vLLM Wheel Server
    participant Pip
    User->>install_vllm.sh: Run (AMD64)
    install_vllm.sh->>PyTorchRepo: Install torch/cu128
    install_vllm.sh->>install_vllm.sh: Build wheel URL from commit
    install_vllm.sh->>WheelServer: Download vLLM wheel
    install_vllm.sh->>Pip: pip install --precompiled-wheel <wheel>
    install_vllm.sh->>install_vllm.sh: Cleanup temp wheel
    install_vllm.sh-->>User: Done
```
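For reference, below is a minimal sketch of what the AMD64 branch described in the walkthrough might look like. The commit hash and wheel server URL are placeholders, and the pinned versions and wheel filename are taken from the discussion further down; this is an illustration, not the exact script in this PR.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholder values; the real script derives these from its own pinned vLLM ref.
VLLM_COMMIT="0123456789abcdef0123456789abcdef01234567"
WHEEL_NAME="vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
REMOTE_WHEEL_URL="https://example-wheel-server/${VLLM_COMMIT}/${WHEEL_NAME}"

# Install pinned PyTorch for cu128 first, as the walkthrough describes.
uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 \
    --index-url https://download.pytorch.org/whl/cu128

# Download the commit-pinned wheel into a temp dir and point vLLM's build at it.
VLLM_TEMP_DIR="$(mktemp -d)"
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/${WHEEL_NAME}"
curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"

# Editable install from the vLLM source checkout, reusing the precompiled
# binaries instead of compiling the extensions from source.
VLLM_USE_PRECOMPILED=1 uv pip install -e .

# Clean up the downloaded wheel and temp dir.
rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"
rmdir "$VLLM_TEMP_DIR" 2>/dev/null || true
```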
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
container/deps/vllm/install_vllm.sh (2)
156-160: Editable install with precompiled wheel: confirm support or install the wheel directly.

If vLLM's build tooling supports `VLLM_USE_PRECOMPILED=1` with `-e .`, this is fine. If not, install the wheel directly to avoid an inadvertent source build. Also, per prior learning, `--torch-backend=auto` works with `uv pip` and vLLM; consider using it to auto-match the container CUDA version.

Option A (keep editable, switch to auto backend):

```diff
- VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
+ VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=auto
```

Option B (install the wheel directly):

```diff
- VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
+ VLLM_USE_PRECOMPILED=1 uv pip install "$VLLM_PRECOMPILED_WHEEL_LOCATION" --torch-backend=auto
```

161-161: Safer cleanup and avoid leftover directory.

Use `rm -f --` for the file, and optionally remove the temp directory to avoid clutter.

```diff
- rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION
+ rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"
+ rmdir --ignore-fail-on-non-empty "$VLLM_TEMP_DIR" 2>/dev/null || true
```

Optional: add a trap right after creating the temp file to guarantee cleanup on early exits:

```bash
trap 'rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"' EXIT
```
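As a sketch of how that trap-based cleanup could sit in the script: the variable names below follow the existing script where possible (`WHEEL_NAME` is illustrative), and the exact placement is an assumption rather than the PR's actual layout.

```bash
# Create the temp location and register cleanup immediately, so an early
# failure (e.g. a failed curl) still removes the partial download.
VLLM_TEMP_DIR="$(mktemp -d)"
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/${WHEEL_NAME}"
trap 'rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"; rmdir "$VLLM_TEMP_DIR" 2>/dev/null || true' EXIT

curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"
VLLM_USE_PRECOMPILED=1 uv pip install -e .
# No explicit rm at the end; the EXIT trap covers both success and failure paths.
```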
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
container/deps/vllm/install_vllm.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.
Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
📚 Learning: 2025-07-21T00:10:56.947Z
Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.
Applied to files:
container/deps/vllm/install_vllm.sh
📚 Learning: 2025-07-22T10:22:28.972Z
Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
Applied to files:
container/deps/vllm/install_vllm.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
container/deps/vllm/install_vllm.sh (2)
153-155: Wheel filename/tag assumptions: verify Python/GLIBC compatibility and bucket contents.

The hard-coded wheel name `vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl` assumes:
- the Python ABI is cp38-abi3 (works for 3.8+, but only if the wheel was built as abi3), and
- the platform tag is manylinux1 (many projects now publish manylinux2014).

Confirm that:
- the container's Python version is compatible with abi3, and
- the S3 path hosts exactly this name (including manylinux1, not manylinux2014).

If not, adjust the tags or parameterize them.
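One way to act on this, sketched below: parameterize the tags and fail fast if the expected filename is not actually hosted. This assumes the wheel server answers HTTP HEAD requests; `WHEEL_BASE_URL`, `VLLM_COMMIT`, and the tag variables are illustrative names, not the ones in the PR.

```bash
# Hypothetical parameterization of the wheel tags; adjust to match what the bucket actually hosts.
WHEEL_BASE_URL="https://example-wheel-server"          # placeholder
VLLM_COMMIT="0123456789abcdef0123456789abcdef01234567" # placeholder
PY_TAG="cp38-abi3"
PLATFORM_TAG="manylinux1_x86_64"
WHEEL_NAME="vllm-1.0.0.dev-${PY_TAG}-${PLATFORM_TAG}.whl"
REMOTE_WHEEL_URL="${WHEEL_BASE_URL}/${VLLM_COMMIT}/${WHEEL_NAME}"

# Fail fast with a clear message if the expected wheel name is not present at the URL.
if ! curl -fsSLI --retry 3 "$REMOTE_WHEEL_URL" > /dev/null; then
    echo "ERROR: precompiled wheel not found at $REMOTE_WHEEL_URL" >&2
    echo "Check the Python/platform tags ($PY_TAG, $PLATFORM_TAG) and the commit." >&2
    exit 1
fi
```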
149-160: Overall direction LGTM: commit-pinned precompiled wheel on AMD64 reduces build time/flake.Using a commit-specific wheel alongside the existing flow is a good move for reproducibility and speed.
container/deps/vllm/install_vllm.sh (Outdated)
```bash
fi
else
    echo "Installing vllm for AMD64 architecture"
    uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128
```
Are we confident in these pins, or can we remove them for now? The issue with the build was only due to the precompiled wheel.
Agreed. We are using the same pins in the ARM path, so I thought of keeping them for the AMD64 path as well.
Since we already had issues with torch, I thought explicit is better than implicit.
I don't have a strong preference and I'll remove the torch pinning; please let me know your guidance.
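To make the trade-off concrete, a sketch of the two alternatives being discussed; both use `uv pip`, the version numbers simply mirror the ones already in the diff, and the `--torch-backend=auto` behavior is the one noted in the learnings above, not a guarantee.

```bash
# Alternative 1: explicit torch pinning (what the current diff does).
uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 \
    --index-url https://download.pytorch.org/whl/cu128

# Alternative 2: drop the explicit pin and let the vLLM install resolve torch
# against the container CUDA version via backend auto-matching.
VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=auto
```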
container/deps/vllm/install_vllm.sh (Outdated)
```bash
rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION || true
curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"
if [ "$EDITABLE" = "true" ]; then
    VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
```
We don't want to use VLLM_USE_PRECOMPILED=1; only VLLM_PRECOMPILED_WHEEL_LOCATION is needed, and it takes precedence.
I didn't find the doc, but this is the source code reference:
https://github.com/vllm-project/vllm/blob/ba81acbdc1eec643ba815a76628ae3e4b2263b76/setup.py#L639
It seems like a nested check.
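If the reviewer's reading is right and VLLM_PRECOMPILED_WHEEL_LOCATION alone is sufficient (taking precedence over the default wheel that VLLM_USE_PRECOMPILED=1 would fetch), the install step could be simplified as sketched below. This is an illustration of that reading, not confirmed setup.py behavior; `VLLM_TEMP_DIR` and `WHEEL_NAME` are the illustrative names from the earlier sketches.

```bash
# Point vLLM's build at the already-downloaded wheel; per the discussion above,
# this variable alone should be enough to skip compiling the extensions from source.
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/${WHEEL_NAME}"

if [ "$EDITABLE" = "true" ]; then
    uv pip install -e . --torch-backend=$TORCH_BACKEND
else
    uv pip install . --torch-backend=$TORCH_BACKEND
fi
```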
/ok to test b778253
alec-flowers left a comment
Going with #2489 as it is slightly simpler and pins the OpenAI version, which addresses a new issue that popped up.
Overview:
- Use a commit-specific VLLM_PRECOMPILED_WHEEL_LOCATION along with VLLM_USE_PRECOMPILED.
- Successful pipeline: 33422439