docs: Remove TRTLLM_USE_NIXL_KVCACHE and TRTLLM_USE_UCX_KVCACHE environment variables
#2231
Walkthrough
The documentation for enabling NIXL in KV cache transfer within disaggregated serving was simplified by removing environment variable configuration instructions. The remaining guidance now only reminds users to ensure ETCD and NATS services are running before starting the service.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
Actionable comments posted: 0
🧹 Nitpick comments (2)
components/backends/trtllm/kv-cache-tranfer.md (2)
20-20: Filename is misspelled (“tranfer” → “transfer”)
The markdown header is correct, but the file path itself (kv-cache-tranfer.md) contains a typo. Renaming the file avoids broken links and 404s in downstream references.
64-64: Consider promoting the ETCD/NATS prerequisite into a separate prerequisite section
The single-line “Important” note can be missed easily. A short “Prerequisites” subsection (before the numbered steps) would improve readability.
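As an illustration of the ETCD/NATS prerequisite, here is a minimal sketch of a pre-flight check that tests whether both services accept TCP connections. It is not part of the PR. The helper names (`is_reachable`, `missing_services`) are hypothetical, and the hosts and ports are assumptions based on the usual defaults (2379 for etcd clients, 4222 for NATS clients).

```python
import socket

# Assumed defaults: etcd's client port is 2379 and NATS's client port is 4222.
# The host names are placeholders; adjust them for your deployment.
DEFAULT_SERVICES = {"etcd": ("localhost", 2379), "nats": ("localhost", 4222)}


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_services(services=DEFAULT_SERVICES, timeout: float = 1.0) -> list:
    """Return the names of services that do not accept TCP connections."""
    return [
        name
        for name, (host, port) in services.items()
        if not is_reachable(host, port, timeout)
    ]
```

Calling `missing_services()` before launching the service returns an empty list when both ports accept connections. A successful TCP connect only shows that something is listening; it does not prove the service is healthy.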
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
components/backends/trtllm/kv-cache-tranfer.md(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: GuanLuo
PR: ai-dynamo/dynamo#1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
Learnt from: dmitry-tokarev-nv
PR: ai-dynamo/dynamo#2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.
📚 Learning: in docs/support_matrix.md, the nixl version difference between runtime dependencies (0.5.0) and buil...
Applied to files:
components/backends/trtllm/kv-cache-tranfer.md
📚 Learning: in multi-node setups with head/worker architecture, the head node typically doesn't need environment...
Applied to files:
components/backends/trtllm/kv-cache-tranfer.md
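The head/worker learning above can be sketched as a small helper that decides which environment variables a node needs. This is an illustration only: `service_env` is a hypothetical function, and the URL schemes and ports are assumptions based on the usual NATS (4222) and etcd (2379) defaults. Only the variable names `NATS_SERVER` and `ETCD_ENDPOINTS` come from the learning itself.

```python
def service_env(role: str, head_ip: str) -> dict:
    """Return extra environment variables for a node in a head/worker setup.

    Assumptions (not from the PR): the URL schemes and ports below are the
    usual defaults for NATS (4222) and etcd (2379).
    """
    if role == "head":
        # Local processes reach NATS and etcd via localhost, so nothing to set.
        return {}
    if role == "worker":
        # Workers must point at the head node's externally reachable address.
        return {
            "NATS_SERVER": f"nats://{head_ip}:4222",
            "ETCD_ENDPOINTS": f"http://{head_ip}:2379",
        }
    raise ValueError(f"unknown role: {role!r}")
```

A launcher could merge the result into `os.environ` before starting a worker process.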
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
🔇 Additional comments (2)
components/backends/trtllm/kv-cache-tranfer.md (2)
34-35: Clarify architecture support statement
The note currently says NIXL is “only supported on AMD64 (x86_64)”. Please confirm if this is still accurate for tensorrt-llm==1.0.0rc4; recent releases added experimental ARM builds in other components.
48-53: Confirm --trtllm-use-nixl-kvcache-experimental remains required
The build script still defines and requires this flag; no default behavior has been added:
• container/build.sh (lines 167–171): Parses --trtllm-use-nixl-kvcache-experimental and errors if an argument is provided
• container/build.sh (lines 372–373): Lists it in help text as “Enables NIXL KVCACHE experimental support for TensorRT-LLM”
The documentation is up-to-date; both flags are still needed to enable NIXL support.
@KrishnanPrash You should also remove the variable setting from here: https://github.com/ai-dynamo/dynamo/blob/main/container/Dockerfile.tensorrt_llm#L336-L353
Here are the other references to the environment variable in the dynamo repo. Do we need to remove/modify these as well?
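One way to enumerate such references is to scan a checkout for the two variable names from the PR title. The sketch below is a hypothetical helper (`find_references` is not part of the repo) and assumes a local clone of the repository.

```python
import os

# Variable names taken from the PR title.
REMOVED_VARS = ("TRTLLM_USE_NIXL_KVCACHE", "TRTLLM_USE_UCX_KVCACHE")


def find_references(root: str, names=REMOVED_VARS):
    """Yield (path, line_number, line) for each line mentioning any name."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if any(name in line for name in names):
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file; skip it
```

Running `list(find_references("."))` from the repository root returns one tuple per matching line, which can be reviewed to decide whether each reference should be removed or modified.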
…nvironment variables (#2231)
Overview:
Removing variables that are no longer necessary with tensorrt_llm==1.0.0rc4
Details:
Changes to documentation
Summary by CodeRabbit