-
Notifications
You must be signed in to change notification settings - Fork 690
fix: Update disagg configs for trtllm 1.0.0rc4 changes (main) #2276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis update adds a new configuration section, Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant EngineConfig (YAML)
participant Backend
User->>EngineConfig: Loads configuration
EngineConfig->>Backend: Reads cache_transceiver_config (backend: default)
Backend-->>EngineConfig: Applies default backend for cache transceiver
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes Possibly related PRs
Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml (1)
65-66: Same casing concern as noted inmtp_prefill.yaml– please verify the backend value.components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml (1)
36-37: Same casing concern as noted inmtp_prefill.yaml– please verify the backend value.components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1)
38-39: Same casing concern as noted inmtp_prefill.yaml– please verify the backend value.components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml (1)
38-39: Same casing concern as noted inmtp_prefill.yaml– please verify the backend value.
🧹 Nitpick comments (2)
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1)
41-45: Consider togglingprint_iter_logfor production runs
print_iter_log: trueis handy for debugging but can flood logs in production. If you keep it enabled here, ensure downstream deployment templates override it when running at scale.components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1)
54-55: Consider disablingprint_iter_login production configs
Enabling per-iteration logging is great for debugging, but it can explode log volume and hurt performance at high token throughput. Unless this file is meant solely for debugging, flip this flag tofalseor document why verbose logging is required.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
components/backends/trtllm/engine_configs/decode.yaml(1 hunks)components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml(1 hunks)components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml(1 hunks)components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml(1 hunks)components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml(1 hunks)components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml(1 hunks)components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml(1 hunks)components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml(1 hunks)components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml(1 hunks)components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml(1 hunks)components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: trtllm llm-api expects all caps for backend field names in configuration files. when migrating trtll...
Learnt from: KrishnanPrash
PR: ai-dynamo/dynamo#2217
File: components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml:18-0
Timestamp: 2025-07-31T11:26:48.422Z
Learning: TRTLLM LLM-API expects all caps for backend field names in configuration files. When migrating TRTLLM configurations, backend values like "WideEP" should be changed to "WIDEEP" to comply with the API requirements.
Applied to files:
components/backends/trtllm/engine_configs/decode.yamlcomponents/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yamlcomponents/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yamlcomponents/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yamlcomponents/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml
📚 Learning: in examples/sglang/slurm_jobs/scripts/worker_setup.py, logging the full configuration file content i...
Learnt from: fsaady
PR: ai-dynamo/dynamo#1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:113-116
Timestamp: 2025-07-03T09:44:41.470Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, logging the full configuration file content is acceptable because the config file is public, contains only placeholder replacements (no sensitive data), and provides debugging benefits for users who may want to create configurations based on the logged output.
Applied to files:
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yamlcomponents/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: Build and Test - vllm
🔇 Additional comments (6)
components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml (1)
58-60: Verify case-sensitivity of thebackendvaluePrevious learnings note that TRTLLM’s config loader expects backend identifiers in ALL-CAPS (e.g., “WIDEEP”). The new field uses
backend: default(lower-case). Please double-check whetherdefaultis accepted as-is or needs to beDEFAULTto avoid runtime assertion errors.components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml (1)
40-41: Lower-casedefaultis correct
A scan of existing TRT-LLM configs showsbackend: default(all lowercase) used consistently for cache-transceiver settings (e.g. indecode.yaml,prefill.yamlunder both top-level and deepseek_r1/mtp). No change required.components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1)
36-36:print_iter_logaddition looks goodHelps with debugging and is consistent with other example configs.
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1)
41-45: Verify thatbackend: defaultmatches TRT-LLM’s accepted enum valuesNice catch adding the mandatory
cache_transceiver_configblock. One small concern: previous migrations taught us that TRT-LLM sometimes expects backend values to be upper-cased (see theWIDEEPrequirement inmoe_config). If the transceiver back-end enumeration follows the same rule,"DEFAULT"could be required instead of"default".Please double-check the TRT-LLM 1.0.0 docs or run a quick smoke test; if the loader is case-sensitive the engine will fail at start-up.
components/backends/trtllm/engine_configs/decode.yaml (1)
30-31: Verify required casing forbackendvalue
Prior migration notes (see retrieved learnings) indicate that TRTLLM’s API expects backend identifiers in all caps (e.g.,WIDEEP). Please double-check whether the transceiver accepts the lowercase stringdefaultor if it must beDEFAULTto avoid runtime assertion errors.cache_transceiver_config: - backend: default + backend: DEFAULT # if the API enforces ALL-CAPScomponents/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1)
56-57: Verify uppercase requirement forcache_transceiver_config.backend“default”
We scanned allcache_transceiver_configblocks across engine configs—each currently uses lower-casedefault. Per past migration notes, TRTLLM’s LLM-API expects backend values to be uppercase. Please confirm whether lower-casedefaultis accepted or switch to uppercaseDEFAULThere (and in the corresponding prefill file) if required.• components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml: lines 56–57
• components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml: lines 40–41
components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml
Show resolved
Hide resolved
components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml
Show resolved
Hide resolved
|
Going to cherry pick #2278 back to main after merge |
Overview:
cache_transceiver_configis now a required field in the YAML config to enable disaggregation.Summary by CodeRabbit
New Features
Style