feat: pre-built wheel for vllm image build #2487
Conversation
Walkthrough
The AMD64 path in install_vllm.sh now installs vLLM from a precompiled wheel tied to a commit, after installing PyTorch cu128. It sets and exports a wheel location variable, downloads the wheel to a temp dir, installs it with a precompiled flag, and cleans up. ARM64 remains unchanged.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant install_vllm.sh
    participant PyTorchRepo as PyTorch (cu128)
    participant WheelServer as vLLM Wheel Server
    participant Pip
    User->>install_vllm.sh: Run (AMD64)
    install_vllm.sh->>PyTorchRepo: Install torch/cu128
    install_vllm.sh->>install_vllm.sh: Build wheel URL from commit
    install_vllm.sh->>WheelServer: Download vLLM wheel
    install_vllm.sh->>Pip: pip install --precompiled-wheel <wheel>
    install_vllm.sh->>install_vllm.sh: Cleanup temp wheel
    install_vllm.sh-->>User: Done
```
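For reference, below is a minimal sketch of what the AMD64 branch described in the walkthrough might look like. The commit hash and wheel server URL are placeholders, and the pinned versions and wheel filename are taken from the discussion further down; this is an illustration, not the exact script in this PR.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholder values; the real script derives these from its own pinned vLLM ref.
VLLM_COMMIT="0123456789abcdef0123456789abcdef01234567"
WHEEL_NAME="vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
REMOTE_WHEEL_URL="https://example-wheel-server/${VLLM_COMMIT}/${WHEEL_NAME}"

# Install pinned PyTorch for cu128 first, as the walkthrough describes.
uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 \
    --index-url https://download.pytorch.org/whl/cu128

# Download the commit-pinned wheel into a temp dir and point vLLM's build at it.
VLLM_TEMP_DIR="$(mktemp -d)"
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/${WHEEL_NAME}"
curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"

# Editable install from the vLLM source checkout, reusing the precompiled
# binaries instead of compiling the extensions from source.
VLLM_USE_PRECOMPILED=1 uv pip install -e .

# Clean up the downloaded wheel and temp dir.
rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"
rmdir "$VLLM_TEMP_DIR" 2>/dev/null || true
```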
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
container/deps/vllm/install_vllm.sh (2)
156-160: Editable install with precompiled wheel: confirm support or install the wheel directly.

If vLLM's build tooling supports `VLLM_USE_PRECOMPILED=1` with `-e .`, this is fine. If not, install the wheel directly to avoid an inadvertent source build. Also, per prior learning, `--torch-backend=auto` works with `uv pip` and vLLM; consider using it to auto-match the container CUDA version.

Option A (keep editable, switch to auto backend):

```diff
- VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
+ VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=auto
```

Option B (install the wheel directly):

```diff
- VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
+ VLLM_USE_PRECOMPILED=1 uv pip install "$VLLM_PRECOMPILED_WHEEL_LOCATION" --torch-backend=auto
```

161-161: Safer cleanup and avoid leftover directory.

Use `rm -f --` for the file, and optionally remove the temp directory to avoid clutter.

```diff
- rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION
+ rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"
+ rmdir --ignore-fail-on-non-empty "$VLLM_TEMP_DIR" 2>/dev/null || true
```

Optional: add a trap right after creating the temp file to guarantee cleanup on early exits:

```bash
trap 'rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"' EXIT
```
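As a sketch of how that trap-based cleanup could sit in the script: the variable names below follow the existing script where possible (`WHEEL_NAME` is illustrative), and the exact placement is an assumption rather than the PR's actual layout.

```bash
# Create the temp location and register cleanup immediately, so an early
# failure (e.g. a failed curl) still removes the partial download.
VLLM_TEMP_DIR="$(mktemp -d)"
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/${WHEEL_NAME}"
trap 'rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"; rmdir "$VLLM_TEMP_DIR" 2>/dev/null || true' EXIT

curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"
VLLM_USE_PRECOMPILED=1 uv pip install -e .
# No explicit rm at the end; the EXIT trap covers both success and failure paths.
```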
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
container/deps/vllm/install_vllm.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.
Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
📚 Learning: 2025-07-21T00:10:56.947Z
Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.
Applied to files:
container/deps/vllm/install_vllm.sh
📚 Learning: 2025-07-22T10:22:28.972Z
Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
Applied to files:
container/deps/vllm/install_vllm.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
container/deps/vllm/install_vllm.sh (2)
153-155: Wheel filename/tag assumptions: verify Python/GLIBC compatibility and bucket contents.

The hard-coded wheel name `vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl` assumes:
- the Python ABI is cp38-abi3 (works for 3.8+, but only if the wheel was built as abi3), and
- the platform tag is manylinux1 (many projects now publish manylinux2014).

Confirm that:
- the container's Python version is compatible with abi3, and
- the S3 path hosts exactly this name (including manylinux1, not manylinux2014).

If not, adjust the tags or parameterize them.
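One way to act on this, sketched below: parameterize the tags and fail fast if the expected filename is not actually hosted. This assumes the wheel server answers HTTP HEAD requests; `WHEEL_BASE_URL`, `VLLM_COMMIT`, and the tag variables are illustrative names, not the ones in the PR.

```bash
# Hypothetical parameterization of the wheel tags; adjust to match what the bucket actually hosts.
WHEEL_BASE_URL="https://example-wheel-server"          # placeholder
VLLM_COMMIT="0123456789abcdef0123456789abcdef01234567" # placeholder
PY_TAG="cp38-abi3"
PLATFORM_TAG="manylinux1_x86_64"
WHEEL_NAME="vllm-1.0.0.dev-${PY_TAG}-${PLATFORM_TAG}.whl"
REMOTE_WHEEL_URL="${WHEEL_BASE_URL}/${VLLM_COMMIT}/${WHEEL_NAME}"

# Fail fast with a clear message if the expected wheel name is not present at the URL.
if ! curl -fsSLI --retry 3 "$REMOTE_WHEEL_URL" > /dev/null; then
    echo "ERROR: precompiled wheel not found at $REMOTE_WHEEL_URL" >&2
    echo "Check the Python/platform tags ($PY_TAG, $PLATFORM_TAG) and the commit." >&2
    exit 1
fi
```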
149-160: Overall direction LGTM: commit-pinned precompiled wheel on AMD64 reduces build time/flake.Using a commit-specific wheel alongside the existing flow is a good move for reproducibility and speed.
container/deps/vllm/install_vllm.sh (Outdated)
```bash
fi
else
    echo "Installing vllm for AMD64 architecture"
    uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128
```
Are we confident in these pins, or can we remove them for now? The issue with the build was only due to the precompiled wheel.
Agreed. We are using the same pins in the ARM path, so I thought of keeping them for the AMD64 path as well.
Since we already had issues with torch, I thought explicit is better than implicit.
I don't have a strong preference and I'll remove the torch pinning; please let me know your guidance.
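To make the trade-off concrete, a sketch of the two alternatives being discussed; both use `uv pip`, the version numbers simply mirror the ones already in the diff, and the `--torch-backend=auto` behavior is the one noted in the learnings above, not a guarantee.

```bash
# Alternative 1: explicit torch pinning (what the current diff does).
uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 \
    --index-url https://download.pytorch.org/whl/cu128

# Alternative 2: drop the explicit pin and let the vLLM install resolve torch
# against the container CUDA version via backend auto-matching.
VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=auto
```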
container/deps/vllm/install_vllm.sh (Outdated)
```bash
rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION || true
curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"
if [ "$EDITABLE" = "true" ]; then
    VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
```
We don't want to use VLLM_USE_PRECOMPILED=1; only VLLM_PRECOMPILED_WHEEL_LOCATION is needed, and it takes precedence.
I didn't find the doc, but this is the source code reference:
https://github.com/vllm-project/vllm/blob/ba81acbdc1eec643ba815a76628ae3e4b2263b76/setup.py#L639
It seems like a nested check.
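If the reviewer's reading is right and VLLM_PRECOMPILED_WHEEL_LOCATION alone is sufficient (taking precedence over the default wheel that VLLM_USE_PRECOMPILED=1 would fetch), the install step could be simplified as sketched below. This is an illustration of that reading, not confirmed setup.py behavior; `VLLM_TEMP_DIR` and `WHEEL_NAME` are the illustrative names from the earlier sketches.

```bash
# Point vLLM's build at the already-downloaded wheel; per the discussion above,
# this variable alone should be enough to skip compiling the extensions from source.
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/${WHEEL_NAME}"

if [ "$EDITABLE" = "true" ]; then
    uv pip install -e . --torch-backend=$TORCH_BACKEND
else
    uv pip install . --torch-backend=$TORCH_BACKEND
fi
```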
/ok to test b778253
alec-flowers left a comment
Going with #2489 as it is slightly simpler and pins the OpenAI version, which addresses a new issue that popped up.
Overview:
- Use a commit-specific VLLM_PRECOMPILED_WHEEL_LOCATION along with VLLM_USE_PRECOMPILED.
- Successful pipeline: 33422439