
Conversation

@biswapanda
Contributor

@biswapanda biswapanda commented Aug 18, 2025

Overview:

Use a commit-specific VLLM_PRECOMPILED_WHEEL_LOCATION together with VLLM_USE_PRECOMPILED so the AMD64 image build installs a prebuilt vLLM wheel instead of compiling from source.

Successful pipeline: 33422439

[Screenshot: successful pipeline run, 2025-08-17 11:00 PM]

Summary by CodeRabbit

  • New Features
    • Use a precompiled vLLM wheel for AMD64 (x86_64) installs for faster setup.
    • Automatically installs compatible PyTorch (CUDA 12.8) on AMD64.
    • Introduces an environment variable to point to a custom precompiled wheel.
  • Chores
    • Reduces build time and lowers dependency requirements by avoiding source builds.
    • Cleans up the downloaded wheel after installation.
    • ARM64 installation path remains unchanged.

@biswapanda biswapanda self-assigned this Aug 18, 2025
@biswapanda biswapanda changed the title from "use vllm image pre-built image" to "fix: use vllm image pre-built image" on Aug 18, 2025
@github-actions github-actions bot added the fix label Aug 18, 2025
@coderabbitai
Contributor

coderabbitai bot commented Aug 18, 2025

Walkthrough

The AMD64 path in install_vllm.sh now installs vLLM from a precompiled wheel tied to a commit, after installing PyTorch cu128. It sets and exports a wheel location variable, downloads the wheel to a temp dir, installs it with a precompiled flag, and cleans up. ARM64 remains unchanged.

Changes

Cohort / File(s): AMD64 precompiled wheel workflow (container/deps/vllm/install_vllm.sh)
Summary: Adds AMD64 logic to install PyTorch cu128, construct the remote wheel URL from a pinned commit, export VLLM_PRECOMPILED_WHEEL_LOCATION, download the wheel to a temp dir, run the install with VLLM_USE_PRECOMPILED set, then remove the downloaded wheel. ARM64 path unchanged.
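
For context, a condensed sketch of that AMD64 branch, assembled from the diff fragments quoted later in this review; the wheel bucket URL and commit ref below are placeholders, and the real script may name or order things differently:

echo "Installing vllm for AMD64 architecture"
uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128

# Placeholders: the actual bucket host and pinned commit are defined elsewhere in the script.
WHEEL_BUCKET_URL="https://example-wheel-bucket"
VLLM_REF="0000000000000000000000000000000000000000"
REMOTE_WHEEL_URL="${WHEEL_BUCKET_URL}/${VLLM_REF}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"

# Download the commit-pinned wheel into a temp dir and tell vLLM's build where it is.
VLLM_TEMP_DIR="$(mktemp -d)"
export VLLM_PRECOMPILED_WHEEL_LOCATION="${VLLM_TEMP_DIR}/$(basename "$REMOTE_WHEEL_URL")"
curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"

# Editable install that reuses the prebuilt binaries instead of compiling, then clean up.
VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"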

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant install_vllm.sh
  participant PyTorchRepo as PyTorch (cu128)
  participant WheelServer as vLLM Wheel Server
  participant Pip

  User->>install_vllm.sh: Run (AMD64)
  install_vllm.sh->>PyTorchRepo: Install torch/cu128
  install_vllm.sh->>install_vllm.sh: Build wheel URL from commit
  install_vllm.sh->>WheelServer: Download vLLM wheel
  install_vllm.sh->>Pip: uv pip install -e . (VLLM_USE_PRECOMPILED=1, wheel location exported)
  install_vllm.sh->>install_vllm.sh: Cleanup temp wheel
  install_vllm.sh-->>User: Done

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Poem

I fetched a wheel across the wire,
No forge, no flame, just swift desire.
On x86 I spin and squeal,
A prebuilt hop—install made real.
Torch lit bright, temp trails fade,
A happy hare with faster trade. 🐇✨


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
container/deps/vllm/install_vllm.sh (2)

156-160: Editable install with precompiled wheel: confirm support or install the wheel directly.

If vLLM’s build tooling supports VLLM_USE_PRECOMPILED=1 with -e ., this is fine. If not, install the wheel directly to avoid an inadvertent source build.

Also, per prior learning, --torch-backend=auto works with uv pip and vLLM; consider using it to auto-match container CUDA.

Option A (keep editable, switch to auto backend):

-        VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
+        VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=auto

Option B (install wheel directly):

-        VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
+        VLLM_USE_PRECOMPILED=1 uv pip install "$VLLM_PRECOMPILED_WHEEL_LOCATION" --torch-backend=auto

161-161: Safer cleanup and avoid leftover directory.

Use rm -f -- for the file, and optionally remove the temp directory to avoid clutter.

-    rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION
+    rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"
+    rmdir --ignore-fail-on-non-empty "$VLLM_TEMP_DIR" 2>/dev/null || true

Optional: add a trap right after creating the temp file to guarantee cleanup on early exits:

trap 'rm -f -- "$VLLM_PRECOMPILED_WHEEL_LOCATION"' EXIT
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d177cdf and a8ca891.

📒 Files selected for processing (1)
  • container/deps/vllm/install_vllm.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Applied to files: container/deps/vllm/install_vllm.sh

Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.

Learnt from: ptarasiewiczNV
PR: ai-dynamo/dynamo#2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
container/deps/vllm/install_vllm.sh (2)

153-155: Wheel filename/tag assumptions: verify Python/GLIBC compatibility and bucket contents.

The hard-coded wheel name vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl assumes:

  • Python ABI is cp38-abi3 (works for 3.8+, only if built as abi3).
  • Platform tag is manylinux1 (many projects now publish manylinux2014).

Confirm that:

  • The container’s Python version is compatible with abi3, and
  • The S3 path hosts exactly this name (including manylinux1; not manylinux2014).

If not, adjust tags or parameterize them.
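
A hedged way to spot-check these assumptions before relying on the hard-coded name (REMOTE_WHEEL_URL is the script's variable; pip debug output format varies by pip version):

# HEAD request: confirm the wheel actually exists at the constructed URL.
curl -fsSI "$REMOTE_WHEEL_URL" | head -n 1

# List the ABI/platform tags the container's Python accepts and look for
# cp38-abi3 and manylinux1_x86_64 (or a compatible manylinux alias).
python3 -m pip debug --verbose | grep -Ei 'abi3|manylinux' | head -n 20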


149-160: Overall direction LGTM: commit-pinned precompiled wheel on AMD64 reduces build time/flake.

Using a commit-specific wheel alongside the existing flow is a good move for reproducibility and speed.

@biswapanda biswapanda changed the title from "fix: use vllm image pre-built image" to "fix: use vllm image pre-built wheel for vllm image build" on Aug 18, 2025
@biswapanda biswapanda changed the title from "fix: use vllm image pre-built wheel for vllm image build" to "feat: pre-built wheel for vllm image build" on Aug 18, 2025
@github-actions github-actions bot added feat and removed fix labels Aug 18, 2025
fi
else
echo "Installing vllm for AMD64 architecture"
uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128
Contributor

Are we confident in these pins, or can we remove them for now? The issue with the build was only due to the precompiled wheel.

Contributor Author

Agreed. We use the same pins in the ARM path, so I thought of keeping them for the AMD64 path as well. Since we already had issues with torch, I thought explicit is better than implicit.

I don't have a strong preference and I'll remove the torch pinning if you prefer; please let me know your guidance.
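
For reference, the unpinned variant under discussion would drop the explicit torch/torchaudio/torchvision pins and lean on the auto backend selection noted in the review learnings; a sketch under that assumption, not a tested change:

# Assumption: with --torch-backend=auto, the vLLM install resolves a torch
# build matching the container's CUDA, so no explicit cu128 pin is needed.
VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=auto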

rm -rf $VLLM_PRECOMPILED_WHEEL_LOCATION || true
curl -fS --retry 3 -L "$REMOTE_WHEEL_URL" -o "$VLLM_PRECOMPILED_WHEEL_LOCATION"
if [ "$EDITABLE" = "true" ]; then
VLLM_USE_PRECOMPILED=1 uv pip install -e . --torch-backend=$TORCH_BACKEND
Contributor

We don't want to use VLLM_USE_PRECOMPILED=1; only VLLM_PRECOMPILED_WHEEL_LOCATION is needed, and it takes precedence.

Contributor Author

I didn't find it in the docs, but this is the source code reference:
https://github.com/vllm-project/vllm/blob/ba81acbdc1eec643ba815a76628ae3e4b2263b76/setup.py#L639

It seems like a nested check.
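
A quick, hedged way to confirm that nesting from a local checkout of the pinned vLLM commit (the line number in the link above may drift between commits):

# From the vLLM source tree: show where the two environment variables interact in setup.py.
grep -nE "VLLM_USE_PRECOMPILED|VLLM_PRECOMPILED_WHEEL_LOCATION" setup.py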

@biswapanda
Contributor Author

/ok to test b778253

@biswapanda biswapanda enabled auto-merge (squash) August 18, 2025 21:51
Contributor

@alec-flowers alec-flowers left a comment

Going with #2489, as it is slightly simpler and also pins the OpenAI version, which addresses a new issue that popped up.

@biswapanda biswapanda disabled auto-merge September 4, 2025 00:55
@biswapanda biswapanda closed this Sep 4, 2025