fix: add deepep and pplx pinning commit in vllm repo checkout #3168

nv-anants · 2025-09-22T19:00:19Z

Overview:

Failure - https://github.com/ai-dynamo/dynamo/actions/runs/17907241473/job/50910654042?pr=2969#step:8:280

seems like an issue in vLLM repo scripts dynamo uses this script to install ep-kernels - https://github.com/vllm-project/vllm/blob/v0.10.1.1/tools/ep_kernels/install_python_libraries.sh#L100
and they are not pinning pplx-kernels repo to any commit or tag, so likely this change in upstream repo is causing failures - perplexityai/pplx-kernels@2bd6e30

vLLM fixed pinning issue in later versions but not in the one dynamo uses - vllm-project/vllm@906e461

closes: OPS-1190

Signed-off-by: Anant Sharma <anants@nvidia.com>

coderabbitai · 2025-09-22T19:43:19Z

Walkthrough

Adds a nvshmem cherry-pick step to the VLLM install script. Introduces early VLLM_REF variable definition. Configures git committer identity, checks out the specified VLLM ref, cherry-picks commit 906e461ed6ddccd3cc7b68fa72048d2d3fcbd72c, and notes TODO to remove this when upgrading VLLM.

Changes

Cohort / File(s)	Summary
VLLM install script updates `container/deps/vllm/install_vllm.sh`	Define VLLM_REF earlier; add configured git committer; checkout VLLM ref; cherry-pick nvshmem-related commit `906e461...bd72c`; add comments/TODO regarding temporary fix and removal on next VLLM upgrade.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant S as Install Script
    participant G as Git
    participant R as VLLM Repo
    participant B as Build/Install

    S->>S: Define VLLM_REF
    S->>G: Configure committer name/email
    S->>G: Clone/Fetch VLLM
    S->>G: Checkout VLLM_REF
    Note over S,G: Temporary step for nvshmem fix
    S->>G: Cherry-pick commit 906e461…bd72c
    alt Cherry-pick succeeds
        S->>B: Proceed with build/install
        B-->>S: Installation complete
    else Cherry-pick fails
        S-->>S: Exit with error (manual resolution)
    end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

chore: updating to use pinned precompiled version #2489 — Also edits container/deps/vllm/install_vllm.sh; leverages VLLM_REF to construct a precompiled wheel URL, aligning with the ref handling introduced here.

Poem

A rabbit taps the shell with cheer,
“Cherry-pick this fix right here!”
VLLM set, the ref in line,
nvshmem patched—now all is fine.
I hop away, my job complete,
scripts compiled, installs neat. 🐇✨

Pre-merge checks

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title Check	⚠️ Warning	The PR title claims to add a deepep and pplx pinning commit in the vllm checkout, but the changeset (container/deps/vllm/install_vllm.sh) and provided summary show a cherry-pick of an nvshmem-related commit and a VLLM_REF variable change with no evidence of deepep or pplx changes, so the title is misleading and does not reflect the actual primary change.	Update the title to accurately and concisely reflect the main change (for example: "fix: cherry-pick nvshmem commit into vllm checkout (temporary)" and optionally include the commit hash), and remove references to deepep/pplx unless those packages are actually modified in this PR.
Description Check	⚠️ Warning	The PR includes a clear Overview describing the CI failure and links to upstream commits, but it does not follow the repository's required description template: the "Details", "Where should the reviewer start?" and "Related Issues" sections are missing or left as placeholders and the description does not explicitly document the specific code changes made in this PR (files, commits, or lines to review). Because these required template sections and explicit change/testing details are absent, the pull request description does not meet the repository's standards. The Overview is useful but insufficient on its own to guide a reviewer or record the change for future reference.	Please update the PR to use the repository template: add a "Details" section that lists the exact changes (for example, the nvshmem cherry-pick and VLLM_REF change in container/deps/vllm/install_vllm.sh and the cherry-picked commit SHA), a "Where should the reviewer start?" section calling out files and lines to inspect, and a properly formatted "Related Issues" entry (e.g., closes # or link to OPS-1190). Also include testing performed, CI results, and any notes about the temporary nature and removal timeline for the cherry-pick; once those sections are added the description can be re-checked.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

container/deps/vllm/install_vllm.sh (2)
23-23: Fix misleading TODO label (“nvshmem”)

This cherry-pick is about pinning pplx (and deepep per PR title), not nvshmem. Update the comment to avoid confusion later.

Apply:
-# REMOVE nvshmem cherry-pick when moving to next version of vllm
+# REMOVE pplx pin cherry-pick when moving to next version of vllm
132-136: Make the cherry-pick idempotent and safer

Guard the cherry-pick so repeated builds don't fail if the commit is already present; set committer via git -c and keep -x for traceability. Verified: commit 906e461ed6ddccd3cc7b68fa72048d2d3fcbd72c is absent from REF 1da94e673c257373280026f75ceb4effac80e892.

File: container/deps/vllm/install_vllm.sh (lines 132-136)
-# nvshmem fix - cherry-pick commit pinning pplx version
-# https://github.com/ai-dynamo/dynamo/actions/runs/17907241473/job/50910654042?pr=2969#step:8:280
-# remove when moving to next version of vllm
-# Configure git user for cherry-pick operation
-GIT_COMMITTER_NAME="Container Build" GIT_COMMITTER_EMAIL="container@buildkitsandbox.local" git cherry-pick 906e461ed6ddccd3cc7b68fa72048d2d3fcbd72c
+# pplx pin: cherry-pick commit that pins pplx; remove on next vLLM bump
+# https://github.com/ai-dynamo/dynamo/actions/runs/17907241473/job/50910654042?pr=2969#step:8:280
+PPLX_PIN_COMMIT="906e461ed6ddccd3cc7b68fa72048d2d3fcbd72c"
+# Skip if already included in the checked-out ref
+if ! git merge-base --is-ancestor "${PPLX_PIN_COMMIT}" HEAD; then
+  git -c user.name="Container Build" -c user.email="container@buildkitsandbox.local" \
+      cherry-pick -x "${PPLX_PIN_COMMIT}"
+else
+  echo "pplx pin already present in ${VLLM_REF}; skipping cherry-pick"
+fi

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31c78df and 7ea7927.

📒 Files selected for processing (1)

container/deps/vllm/install_vllm.sh (2 hunks)

🧰 Additional context used

🧠 Learnings (3)

📚 Learning: 2025-08-18T16:52:15.659Z

Learnt from: nnshah1
PR: ai-dynamo/dynamo#2489
File: container/deps/vllm/install_vllm.sh:151-152
Timestamp: 2025-08-18T16:52:15.659Z
Learning: The VLLM_PRECOMPILED_WHEEL_LOCATION environment variable, when exported, automatically triggers vLLM's build system to use the precompiled wheel instead of building from source, even when using standard `uv pip install .` commands in container/deps/vllm/install_vllm.sh.

Applied to files:

container/deps/vllm/install_vllm.sh

📚 Learning: 2025-08-18T16:52:15.659Z

Learnt from: nnshah1
PR: ai-dynamo/dynamo#2489
File: container/deps/vllm/install_vllm.sh:151-152
Timestamp: 2025-08-18T16:52:15.659Z
Learning: The VLLM_PRECOMPILED_WHEEL_LOCATION environment variable is an official vLLM environment variable that, when exported, automatically triggers vLLM's build system to use the specified precompiled wheel instead of building from source. This works even with standard `uv pip install .` commands without requiring explicit reference to the variable in the install command. The vLLM build system internally detects and uses this environment variable.

Applied to files:

container/deps/vllm/install_vllm.sh

📚 Learning: 2025-07-21T00:10:56.947Z

Learnt from: zaristei
PR: ai-dynamo/dynamo#2020
File: container/deps/vllm/install_vllm.sh:115-118
Timestamp: 2025-07-21T00:10:56.947Z
Learning: Graceful fallback for PyTorch wheel installation is broken on ARM architecture, so immediate exit on pinned version failure is preferred over fallback mechanisms in container/deps/vllm/install_vllm.sh for ARM64.

Applied to files:

container/deps/vllm/install_vllm.sh

dmitrytokarev

lgtm

Signed-off-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

Signed-off-by: Anant Sharma <anants@nvidia.com>

Signed-off-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

Signed-off-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Kyle H <kylhuang@nvidia.com>

fix: add deepep and pplx pinning commit in vllm repo checkout

7ea7927

Signed-off-by: Anant Sharma <anants@nvidia.com>

pull-request-size bot added the size/XS label Sep 22, 2025

github-actions bot added the fix label Sep 22, 2025

nv-anants marked this pull request as ready for review September 22, 2025 19:36

nv-anants requested review from a team as code owners September 22, 2025 19:36

coderabbitai bot reviewed Sep 22, 2025

View reviewed changes

dmitrytokarev approved these changes Sep 22, 2025

View reviewed changes

nv-anants enabled auto-merge (squash) September 22, 2025 19:49

dmitry-tokarev-nv approved these changes Sep 22, 2025

View reviewed changes

nv-anants merged commit 396755e into main Sep 22, 2025
14 of 15 checks passed

nv-anants deleted the anants/fix-vllm branch September 22, 2025 19:54

jasonqinzhou pushed a commit that referenced this pull request Sep 24, 2025

fix: add deepep and pplx pinning commit in vllm repo checkout (#3168)

b92567e

Signed-off-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

nv-anants added a commit that referenced this pull request Sep 24, 2025

fix: add deepep and pplx pinning commit in vllm repo checkout (#3168)

ce2487a

Signed-off-by: Anant Sharma <anants@nvidia.com>

jasonqinzhou pushed a commit that referenced this pull request Sep 24, 2025

fix: add deepep and pplx pinning commit in vllm repo checkout (#3168)

2198e36

Signed-off-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

kylehh pushed a commit that referenced this pull request Sep 25, 2025

fix: add deepep and pplx pinning commit in vllm repo checkout (#3168)

20f1c8e

Signed-off-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Kyle H <kylhuang@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: add deepep and pplx pinning commit in vllm repo checkout #3168

fix: add deepep and pplx pinning commit in vllm repo checkout #3168

Uh oh!

nv-anants commented Sep 22, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Sep 22, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

dmitrytokarev left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix: add deepep and pplx pinning commit in vllm repo checkout #3168

fix: add deepep and pplx pinning commit in vllm repo checkout #3168

Uh oh!

Conversation

nv-anants commented Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Uh oh!

coderabbitai bot commented Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

dmitrytokarev left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

nv-anants commented Sep 22, 2025 •

edited

Loading

coderabbitai bot commented Sep 22, 2025 •

edited

Loading