
Conversation

@Michaelgathara (Contributor) commented Aug 22, 2025

Overview:

This PR adds support for the HF_ENDPOINT environment variable in TRT-LLM deployments. It lets users specify a custom Hugging Face endpoint (such as a mirror or an enterprise HF server) when downloading models, which is especially useful in environments with restricted internet access or when a mirror offers better performance.

Details:

  • Added HF_ENDPOINT to the list of common environment variables in getCommonTRTLLMEnvVars()
  • The environment variable is then automatically included in the -x flags passed to mpirun (see the sketch after this list)
  • Updated all test cases to verify HF_ENDPOINT is properly forwarded in multinode mpirun commands
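
For intuition, here is a minimal, self-contained Go sketch of how an allowlisted variable such as HF_ENDPOINT becomes a sorted -x flag on the mpirun command line. The helper names and the trimmed-down env set are illustrative, not the operator's exact internals:

package main

import (
    "fmt"
    "sort"
    "strings"
)

// Mirrors the shape of getCommonTRTLLMEnvVars(): a set of env var names
// that should be forwarded to every MPI rank.
func commonEnvVars() map[string]bool {
    return map[string]bool{
        "CUDA_VISIBLE_DEVICES": true,
        "HF_TOKEN":             true,
        "HF_ENDPOINT":          true, // the entry added by this PR
    }
}

// Renders the set as deterministic, alphabetically sorted mpirun -x flags.
func formatEnvFlags(envVars map[string]bool) string {
    names := make([]string, 0, len(envVars))
    for name := range envVars {
        names = append(names, name)
    }
    sort.Strings(names)
    flags := make([]string, 0, len(names))
    for _, name := range names {
        flags = append(flags, "-x "+name)
    }
    return strings.Join(flags, " ")
}

func main() {
    // Prints: -x CUDA_VISIBLE_DEVICES -x HF_ENDPOINT -x HF_TOKEN
    fmt.Println(formatEnvFlags(commonEnvVars()))
}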

Where should the reviewer start?

  • deploy/cloud/operator/internal/dynamo/backend_trtllm.go - Main change in getCommonTRTLLMEnvVars() function
  • deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go - Updated test expectations showing HF_ENDPOINT in mpirun commands

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

  • New Features

    • Added support for configuring a custom Hugging Face Hub endpoint via HF_ENDPOINT.
    • Automatically maps HF_ENDPOINT to HUGGINGFACE_HUB_ENDPOINT when unset for seamless hub access.
    • Ensures HF_ENDPOINT is propagated to all distributed (MPI) processes for consistent behavior across nodes.
  • Tests

    • Updated test cases to validate HF_ENDPOINT forwarding in various multi-node and role scenarios.

@copy-pr-bot bot commented Aug 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions bot commented Aug 22, 2025

👋 Hi Michaelgathara! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: the NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving them.

🚀

@github-actions bot added the external-contribution label (Pull request is from an external contributor) Aug 22, 2025
@Michaelgathara changed the title from "[FEATURE]: HF_ENDPOINT addition" to "feat: HF_ENDPOINT addition" Aug 22, 2025
@github-actions bot added the feat label Aug 22, 2025
@coderabbitai bot (Contributor) commented Aug 22, 2025

Walkthrough

Adds HF_ENDPOINT to TRT-LLM environment propagation (operator and tests) and updates Rust hub logic to mirror HF_ENDPOINT into HUGGINGFACE_HUB_ENDPOINT when unset, leaving existing behavior otherwise unchanged.

Changes

Cohort / File(s) — Summary

  • TRT-LLM MPI env propagation
    deploy/cloud/operator/internal/dynamo/backend_trtllm.go, deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go
    Include HF_ENDPOINT in the common TRT-LLM env vars so mpirun forwards it (adds -x HF_ENDPOINT). Tests updated to expect the new flag across relevant scenarios.

  • HF endpoint bridging in hub
    lib/llm/src/hub.rs
    In from_hf, if HF_ENDPOINT is set and HUGGINGFACE_HUB_ENDPOINT is not, set HUGGINGFACE_HUB_ENDPOINT to HF_ENDPOINT before building the API client. No public API changes.
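
The hub-side bridging is a small conditional copy. As a language-neutral illustration (the real change is the Rust in lib/llm/src/hub.rs, quoted in the review below), the same pattern in Go looks like this; the mirror URL is hypothetical:

package main

import (
    "fmt"
    "os"
)

// bridgeHFEndpoint copies HF_ENDPOINT into HUGGINGFACE_HUB_ENDPOINT only
// when the latter is unset, so an explicitly configured canonical endpoint
// always wins over the convenience alias.
func bridgeHFEndpoint() {
    // LookupEnv distinguishes "unset" from "set to an empty string".
    if endpoint, ok := os.LookupEnv("HF_ENDPOINT"); ok {
        if _, alreadySet := os.LookupEnv("HUGGINGFACE_HUB_ENDPOINT"); !alreadySet {
            os.Setenv("HUGGINGFACE_HUB_ENDPOINT", endpoint)
        }
    }
}

func main() {
    os.Setenv("HF_ENDPOINT", "https://hf-mirror.example.com") // hypothetical mirror
    bridgeHFEndpoint()
    fmt.Println(os.Getenv("HUGGINGFACE_HUB_ENDPOINT")) // https://hf-mirror.example.com
}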

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Env as Process Env
  participant Hub as from_hf()
  participant API as ApiBuilder

  Note over Hub,API: HF endpoint propagation (new)

  Hub->>Env: Read HF_TOKEN (existing)
  Hub->>Env: Read HF_ENDPOINT (new)
  alt HUGGINGFACE_HUB_ENDPOINT is unset AND HF_ENDPOINT is set
    Hub->>Env: Set HUGGINGFACE_HUB_ENDPOINT = HF_ENDPOINT (new)
  else
    Note over Hub: No change to HUGGINGFACE_HUB_ENDPOINT
  end
  Hub->>API: ApiBuilder::new() with env-derived config
  API-->>Hub: Client instance

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A whisk of wind, a hop, a send—
We carry HF_ENDPOINT end-to-end.
mpirun sings with flags anew,
The hub now knows just what to do.
I twitch my nose, reviews are tight—
Small hops, clear paths, all set right. 🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.2.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
deploy/cloud/operator/internal/dynamo/backend_trtllm.go (1)

191-195: Also forward HUGGINGFACE_HUB_ENDPOINT to cover canonical var users

Many clients (including Rust hf-hub) look for HUGGINGFACE_HUB_ENDPOINT. Forwarding it too avoids surprises when users set only that var on the Pod. Alphabetical sorting will place it before HUGGING_FACE_HUB_TOKEN in the -x list.

Apply this diff:

 return map[string]bool{
-    "CUDA_VISIBLE_DEVICES": true, "MODEL_PATH": true, "HF_TOKEN": true, "HUGGING_FACE_HUB_TOKEN": true, "HF_ENDPOINT": true,
+    "CUDA_VISIBLE_DEVICES": true, "MODEL_PATH": true, "HF_TOKEN": true, "HUGGING_FACE_HUB_TOKEN": true, "HF_ENDPOINT": true,
+    "HUGGINGFACE_HUB_ENDPOINT": true,
     "TOKENIZERS_PARALLELISM": true, "NCCL_DEBUG": true, "NCCL_IB_DISABLE": true, "NCCL_P2P_DISABLE": true,
     "TENSORRT_LLM_CACHE_DIR": true, "HF_HOME": true, "TRANSFORMERS_CACHE": true, "HF_DATASETS_CACHE": true,
     "PATH": true, "LD_LIBRARY_PATH": true, "PYTHONPATH": true, "HOME": true, "USER": true,
 }

If you adopt this, I can update the expected strings in backend_trtllm_test.go to include -x HUGGINGFACE_HUB_ENDPOINT in the right sorted position.

deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go (1)

63-63: Reduce test brittleness around exact mpirun env flag ordering (optional)

String-equality on the entire mpirun command is fragile whenever we extend the env allowlist. Consider asserting presence of key segments (e.g., contains "-x HF_ENDPOINT") or generating the expected env flags via formatEnvVarFlags(collectAllEnvVars(container.Env)) to keep order in sync.

I can draft a small helper to assemble expected env flags in tests so future env additions don’t require editing long literals.

Also applies to: 119-119, 566-566, 576-576, 594-594, 612-612, 630-630
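
As an illustration of that offer, a containment-style helper could look like the following Go sketch (hypothetical; not part of the current test file, and it assumes the standard strings and testing packages are imported):

// assertEnvForwarded checks that each required "-x NAME" segment is present
// in the generated mpirun command, instead of comparing the whole string.
func assertEnvForwarded(t *testing.T, mpirunCmd string, vars ...string) {
    t.Helper()
    for _, v := range vars {
        if !strings.Contains(mpirunCmd, "-x "+v) {
            t.Errorf("mpirun command missing %q: %s", "-x "+v, mpirunCmd)
        }
    }
}

// Usage inside a test:
//   assertEnvForwarded(t, cmd, "HF_ENDPOINT", "HF_TOKEN")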

lib/llm/src/hub.rs (1)

48-55: Avoid process-wide environment mutations by using ApiBuilder::with_endpoint
Setting a global environment variable at runtime can lead to surprising side effects in multithreaded contexts, since it applies across the entire process. The Tokio ApiBuilder already supports a custom endpoint via its with_endpoint method, so it’s safer to configure the client directly instead of mutating std::env.

File: lib/llm/src/hub.rs (lines 48–55)
• Replace the current env-var propagation block with builder configuration.

Suggested refactor:

-    // If HF_ENDPOINT is provided, propagate it to the canonical env var used by some clients
-    // to select an alternate Hugging Face hub endpoint. We only set it if not already present.
-    if let Ok(endpoint) = env::var(HF_ENDPOINT_ENV_VAR) {
-        if env::var(HUGGINGFACE_HUB_ENDPOINT_ENV_VAR).is_err() {
-            env::set_var(HUGGINGFACE_HUB_ENDPOINT_ENV_VAR, &endpoint);
-        }
-    }
+    // Prefer passing a custom endpoint directly to the client instead of mutating process-wide env.
+    let mut builder = ApiBuilder::new()
+        .with_progress(true)
+        .with_token(token);
+
+    if let Ok(endpoint) = std::env::var(HUGGINGFACE_HUB_ENDPOINT_ENV_VAR)
+        .or_else(|_| std::env::var(HF_ENDPOINT_ENV_VAR))
+    {
+        builder = builder.with_endpoint(endpoint);
+    }
+
+    let api = builder.high().build()?;

(Default endpoint is https://huggingface.co.) (docs.rs)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e2e909f and 08316e7.

📒 Files selected for processing (3)
  • deploy/cloud/operator/internal/dynamo/backend_trtllm.go (1 hunks)
  • deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go (7 hunks)
  • lib/llm/src/hub.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (.)
  • GitHub Check: pre-merge-rust (lib/bindings/python)
🔇 Additional comments (9)
deploy/cloud/operator/internal/dynamo/backend_trtllm.go (1)

191-195: HF_ENDPOINT propagation via mpirun looks good

Adding HF_ENDPOINT to the common env set ensures it’s forwarded with -x and unblocks custom HF mirrors in multinode TRT-LLM runs.

deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go (7)

63-63: Updated expectation includes -x HF_ENDPOINT — matches production sorting

The insertion point after -x HF_DATASETS_CACHE looks correct with the sorted env var flags.


119-119: HF_ENDPOINT forwarding asserted for LWS path — LGTM

Covers the alternate multinode deployer path.


566-566: Leader (args-first) case checks -x HF_ENDPOINT — LGTM


576-576: Leader (command-first, no GPUs) case asserts -x HF_ENDPOINT — LGTM


594-594: Leader (args take precedence) case includes -x HF_ENDPOINT — LGTM


612-612: Comprehensive env forwarding case updated to include -x HF_ENDPOINT — LGTM


630-630: Deduplication test continues to pass with -x HF_ENDPOINT in the set — LGTM

Still verifies that explicitly provided envs are merged and sorted once.

lib/llm/src/hub.rs (1)

29-31: Clear constant names for env vars — good addition

Names align with what operators and users expect.

@julienmancuso merged commit 45e38d3 into ai-dynamo:main Aug 26, 2025
8 checks passed
ayushag-nv pushed a commit that referenced this pull request Aug 27, 2025
Signed-off-by: ayushag <ayushag@nvidia.com>
jasonqinzhou pushed a commit that referenced this pull request Aug 30, 2025
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
Signed-off-by: nnshah1 <neelays@nvidia.com>

Labels

external-contribution (Pull request is from an external contributor), feat, size/S
