feat: HF_ENDPOINT addition #2637
Conversation
Walkthrough

Adds HF_ENDPOINT to TRT-LLM environment propagation (operator and tests) and updates the Rust hub logic to mirror HF_ENDPOINT into HUGGINGFACE_HUB_ENDPOINT when the latter is unset, leaving existing behavior otherwise unchanged.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Env as Process Env
    participant Hub as from_hf()
    participant API as ApiBuilder
    Note over Hub,API: HF endpoint propagation (new)
    Hub->>Env: Read HF_TOKEN (existing)
    Hub->>Env: Read HF_ENDPOINT (new)
    alt HUGGINGFACE_HUB_ENDPOINT is unset AND HF_ENDPOINT is set
        Hub->>Env: Set HUGGINGFACE_HUB_ENDPOINT = HF_ENDPOINT (new)
    else
        Note over Hub: No change to HUGGINGFACE_HUB_ENDPOINT
    end
    Hub->>API: ApiBuilder::new() with env-derived config
    API-->>Hub: Client instance
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
🧹 Nitpick comments (3)
deploy/cloud/operator/internal/dynamo/backend_trtllm.go (1)
191-195: Also forward HUGGINGFACE_HUB_ENDPOINT to cover canonical-var users

Many clients (including the Rust hf-hub crate) look for HUGGINGFACE_HUB_ENDPOINT. Forwarding it too avoids surprises when users set only that variable on the Pod. Alphabetical sorting will place it before HUGGING_FACE_HUB_TOKEN in the -x list.
Apply this diff:
```diff
 return map[string]bool{
 	"CUDA_VISIBLE_DEVICES": true, "MODEL_PATH": true, "HF_TOKEN": true, "HUGGING_FACE_HUB_TOKEN": true, "HF_ENDPOINT": true,
+	"HUGGINGFACE_HUB_ENDPOINT": true,
 	"TOKENIZERS_PARALLELISM": true, "NCCL_DEBUG": true, "NCCL_IB_DISABLE": true, "NCCL_P2P_DISABLE": true,
 	"TENSORRT_LLM_CACHE_DIR": true, "HF_HOME": true, "TRANSFORMERS_CACHE": true, "HF_DATASETS_CACHE": true,
 	"PATH": true, "LD_LIBRARY_PATH": true, "PYTHONPATH": true, "HOME": true, "USER": true,
 }
```

If you adopt this, I can update the expected strings in backend_trtllm_test.go to include -x HUGGINGFACE_HUB_ENDPOINT in the right sorted position.
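The sorting claim above can be checked with a small sketch. `formatEnvVarFlags` here is a simplified stand-in for the operator's helper of the same name, not its actual code:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatEnvVarFlags turns an env-var allowlist into sorted mpirun -x flags.
// Simplified illustration; the real operator helper may differ in detail.
func formatEnvVarFlags(allow map[string]bool) string {
	names := make([]string, 0, len(allow))
	for name, ok := range allow {
		if ok {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, n := range names {
		parts = append(parts, "-x "+n)
	}
	return strings.Join(parts, " ")
}

func main() {
	allow := map[string]bool{
		"HF_ENDPOINT":              true,
		"HUGGINGFACE_HUB_ENDPOINT": true,
		"HUGGING_FACE_HUB_TOKEN":   true,
	}
	// prints: -x HF_ENDPOINT -x HUGGINGFACE_HUB_ENDPOINT -x HUGGING_FACE_HUB_TOKEN
	fmt.Println(formatEnvVarFlags(allow))
}
```

Byte-wise, 'F' (0x46) sorts before '_' (0x5F), which is why HUGGINGFACE_HUB_ENDPOINT lands before HUGGING_FACE_HUB_TOKEN.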
deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go (1)
63-63: Reduce test brittleness around exact mpirun env-flag ordering (optional)

String equality on the entire mpirun command is fragile whenever we extend the env allowlist. Consider asserting the presence of key segments (e.g., contains "-x HF_ENDPOINT") or generating the expected env flags via formatEnvVarFlags(collectAllEnvVars(container.Env)) to keep the order in sync.
I can draft a small helper to assemble expected env flags in tests so future env additions don’t require editing long literals.
Also applies to: 119-119, 566-566, 576-576, 594-594, 612-612, 630-630
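A containment-based check along the lines suggested could look like this. `containsAllFlags` is a hypothetical helper, not part of the repository:

```go
package main

import (
	"fmt"
	"strings"
)

// containsAllFlags checks that each expected "-x VAR" segment appears in the
// generated mpirun command, instead of comparing the entire command string.
// It returns the variables whose flags are missing.
func containsAllFlags(cmd string, vars []string) []string {
	var missing []string
	for _, v := range vars {
		if !strings.Contains(cmd, "-x "+v) {
			missing = append(missing, v)
		}
	}
	return missing
}

func main() {
	cmd := "mpirun -x HF_ENDPOINT -x HF_TOKEN -x PATH python3 serve.py"
	missing := containsAllFlags(cmd, []string{"HF_ENDPOINT", "HF_TOKEN"})
	fmt.Println(len(missing)) // 0 when all expected flags are present
}
```

A test asserting `len(missing) == 0` keeps passing when new variables are added to the allowlist, unlike a full-string comparison.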
lib/llm/src/hub.rs (1)
48-55: Avoid process-wide environment mutations by using ApiBuilder::with_endpoint
Setting a global environment variable at runtime can lead to surprising side effects in multithreaded contexts, since it applies across the entire process. The Tokio ApiBuilder already supports a custom endpoint via its with_endpoint method, so it is safer to configure the client directly instead of mutating std::env.

File: lib/llm/src/hub.rs (lines 48-55)

- Replace the current env-var propagation block with builder configuration.

Suggested refactor:

```diff
-    // If HF_ENDPOINT is provided, propagate it to the canonical env var used by some clients
-    // to select an alternate Hugging Face hub endpoint. We only set it if not already present.
-    if let Ok(endpoint) = env::var(HF_ENDPOINT_ENV_VAR) {
-        if env::var(HUGGINGFACE_HUB_ENDPOINT_ENV_VAR).is_err() {
-            env::set_var(HUGGINGFACE_HUB_ENDPOINT_ENV_VAR, &endpoint);
-        }
-    }
+    // Prefer passing a custom endpoint directly to the client instead of mutating process-wide env.
+    let mut builder = ApiBuilder::new()
+        .with_progress(true)
+        .with_token(token);
+
+    if let Ok(endpoint) = std::env::var(HUGGINGFACE_HUB_ENDPOINT_ENV_VAR)
+        .or_else(|_| std::env::var(HF_ENDPOINT_ENV_VAR))
+    {
+        builder = builder.with_endpoint(endpoint);
+    }
+
+    let api = builder.high().build()?;
```

(The default endpoint is https://huggingface.co.) (docs.rs)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- deploy/cloud/operator/internal/dynamo/backend_trtllm.go (1 hunks)
- deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go (7 hunks)
- lib/llm/src/hub.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test - dynamo
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/bindings/python)
🔇 Additional comments (9)
deploy/cloud/operator/internal/dynamo/backend_trtllm.go (1)
191-195: HF_ENDPOINT propagation via mpirun looks good

Adding HF_ENDPOINT to the common env set ensures it is forwarded with -x and unblocks custom HF mirrors in multinode TRT-LLM runs.
deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go (7)
63-63: Updated expectation includes -x HF_ENDPOINT — matches production sorting

The insertion point after -x HF_DATASETS_CACHE looks correct with the sorted env var flags.
119-119: HF_ENDPOINT forwarding asserted for LWS path — LGTM

Covers the alternate multinode deployer path.
566-566: Leader (args-first) case checks -x HF_ENDPOINT — LGTM
576-576: Leader (command-first, no GPUs) case asserts -x HF_ENDPOINT — LGTM
594-594: Leader (args take precedence) case includes -x HF_ENDPOINT — LGTM
612-612: Comprehensive env forwarding case updated to include -x HF_ENDPOINT — LGTM
630-630: Deduplication test continues to pass with -x HF_ENDPOINT in the set — LGTM

Still verifies that explicitly provided envs are merged and sorted once.
lib/llm/src/hub.rs (1)
29-31: Clear constant names for env vars — good addition

Names align with what operators and users expect.
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
Signed-off-by: nnshah1 <neelays@nvidia.com>
Overview:
This PR adds support for the HF_ENDPOINT environment variable in TRT-LLM deployments. This feature allows users to specify custom Hugging Face endpoints (such as mirrors or enterprise HF servers) when downloading models, which is especially useful in environments with restricted internet access or when using HF mirrors for better performance.

Details:
- Added HF_ENDPOINT to the list of common environment variables in getCommonTRTLLMEnvVars()

Where should the reviewer start?
- deploy/cloud/operator/internal/dynamo/backend_trtllm.go - main change in the getCommonTRTLLMEnvVars() function
- deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go - updated test expectations showing HF_ENDPOINT in mpirun commands

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- HF_ENDPOINT while connects to huggingface. #2631

Summary by CodeRabbit
New Features
Tests