fix: Update disagg configs for trtllm 1.0.0rc4 changes (main) #2276

rmccorm4 · 2025-08-04T19:06:40Z

Overview:

As of a recent TRTLLM commit for v1.0.0, cache_transceiver_config is now a required field in the YAML config to enable disaggregation.
This PR adds that section to all disagg-related config files in the trtllm config examples.
Fixes nvbugs/5430986, particularly this error when running disagg examples:

AssertionError: kv_cache_transceiver is disabled, please set 'cache_transceiver_config: backend:` in config file for disaggregated serving
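
For reference, a minimal sketch of the section being added, based on the key and value named in this PR (`cache_transceiver_config` with `backend: default`). Any other keys the TRTLLM schema may accept under this section are not shown, and the lower-case value is simply what the PR's configs use.

```yaml
# Appended at the end of each prefill/decode (disagg) engine config.
# Without this section, disaggregated serving fails with the
# AssertionError quoted above.
cache_transceiver_config:
  backend: default  # lower-case, as written in this PR's config files
```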

Summary by CodeRabbit

New Features
- Added a new configuration section to multiple engine configuration files, enabling a default backend for cache transceiver functionality.
Style
- Added a newline at the end of one configuration file for improved formatting.

coderabbitai · 2025-08-04T19:10:55Z

Walkthrough

This update adds a new configuration section, cache_transceiver_config with backend: default, to multiple YAML engine configuration files across various model backends. Additionally, a newline is appended to one file. No existing keys, values, or logic are modified, and no code entities are changed.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Add `cache_transceiver_config` to DeepSeek MTP configs: `components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml`, `components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml` | Appended `cache_transceiver_config` section with `backend: default` at the end of each YAML file. |
| Add `cache_transceiver_config` to DeepSeek Simple configs: `components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml`, `components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml` | Appended `cache_transceiver_config` section with `backend: default` at the end of each YAML file. |
| Add `cache_transceiver_config` to DeepSeek Wide EP configs: `components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml`, `components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml` | Appended `cache_transceiver_config` section with `backend: default` at the end of each YAML file. |
| Add `cache_transceiver_config` to Llama4 Eagle configs: `components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml`, `components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml` | Appended `cache_transceiver_config` section with `backend: default` at the end of each YAML file. |
| Add `cache_transceiver_config` to Llama4 Eagle One Model configs: `components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml`, `components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml` | Appended `cache_transceiver_config` section with `backend: default` at the end of each YAML file. |
| Newline addition: `components/backends/trtllm/engine_configs/decode.yaml` | Added a newline at the end of the file; no configuration changes. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant EngineConfig (YAML)
    participant Backend

    User->>EngineConfig: Loads configuration
    EngineConfig->>Backend: Reads cache_transceiver_config (backend: default)
    Backend-->>EngineConfig: Applies default backend for cache transceiver

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

fix: Fix message truncation in disagg flow #1572: Removes cache_transceiver_config from other YAML files to address message truncation, related by targeting the same configuration section but with opposite intent.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (4)

components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml (1)

65-66: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml (1)

36-37: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1)

38-39: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml (1)

38-39: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

🧹 Nitpick comments (2)

components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1)

41-45: Consider toggling print_iter_log for production runs

print_iter_log: true is handy for debugging but can flood logs in production. If you keep it enabled here, ensure downstream deployment templates override it when running at scale.

components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1)

54-55: Consider disabling print_iter_log in production configs
Enabling per-iteration logging is great for debugging, but it can explode log volume and hurt performance at high token throughput. Unless this file is meant solely for debugging, flip this flag to false or document why verbose logging is required.
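
As a sketch of what the two nitpicks above suggest, a production-oriented copy of the config could flip the flag as shown below. This assumes `print_iter_log` sits at the top level of the engine config, as in the reviewed files; it is an illustration of the suggestion, not a change made by this PR.

```yaml
# Hypothetical production override: the reviewed example configs
# enable per-iteration logging for debugging.
print_iter_log: false
```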

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fed066 and abdeb7d.

📒 Files selected for processing (11)

components/backends/trtllm/engine_configs/decode.yaml (1 hunks)
components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1 hunks)
components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml (1 hunks)
components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml (1 hunks)
components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1 hunks)
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml (1 hunks)
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1 hunks)
components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml (1 hunks)
components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml (1 hunks)
components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml (1 hunks)
components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: trtllm llm-api expects all caps for backend field names in configuration files. when migrating trtll...

Learnt from: KrishnanPrash
PR: ai-dynamo/dynamo#2217
File: components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml:18-0
Timestamp: 2025-07-31T11:26:48.422Z
Learning: TRTLLM LLM-API expects all caps for backend field names in configuration files. When migrating TRTLLM configurations, backend values like "WideEP" should be changed to "WIDEEP" to comply with the API requirements.

Applied to files:

components/backends/trtllm/engine_configs/decode.yaml
components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml
components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml
components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml
components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml
components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml
components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml
components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml
components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml

📚 Learning: in examples/sglang/slurm_jobs/scripts/worker_setup.py, logging the full configuration file content i...

Learnt from: fsaady
PR: ai-dynamo/dynamo#1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:113-116
Timestamp: 2025-07-03T09:44:41.470Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, logging the full configuration file content is acceptable because the config file is public, contains only placeholder replacements (no sensitive data), and provides debugging benefits for users who may want to create configurations based on the logged output.

Applied to files:

components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml
components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: pre-merge-rust (.)
GitHub Check: pre-merge-rust (lib/runtime/examples)
GitHub Check: pre-merge-rust (lib/bindings/python)
GitHub Check: Build and Test - vllm

🔇 Additional comments (6)

components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml (1)

58-60: Verify case-sensitivity of the backend value

Previous learnings note that TRTLLM’s config loader expects backend identifiers in ALL-CAPS (e.g., “WIDEEP”). The new field uses backend: default (lower-case). Please double-check whether default is accepted as-is or needs to be DEFAULT to avoid runtime assertion errors.

components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml (1)

40-41: Lower-case default is correct
A scan of existing TRT-LLM configs shows backend: default (all lowercase) used consistently for cache-transceiver settings (e.g. in decode.yaml, prefill.yaml under both top-level and deepseek_r1/mtp). No change required.

components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1)

36-36: print_iter_log addition looks good

Helps with debugging and is consistent with other example configs.

components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1)

41-45: Verify that backend: default matches TRT-LLM’s accepted enum values

Nice catch adding the mandatory cache_transceiver_config block. One small concern: previous migrations taught us that TRT-LLM sometimes expects backend values to be upper-cased (see the WIDEEP requirement in moe_config). If the transceiver back-end enumeration follows the same rule, "DEFAULT" could be required instead of "default".

Please double-check the TRT-LLM 1.0.0 docs or run a quick smoke test; if the loader is case-sensitive the engine will fail at start-up.

components/backends/trtllm/engine_configs/decode.yaml (1)

30-31: Verify required casing for backend value
Prior migration notes (see retrieved learnings) indicate that TRTLLM’s API expects backend identifiers in all caps (e.g., WIDEEP). Please double-check whether the transceiver accepts the lowercase string default or if it must be DEFAULT to avoid runtime assertion errors.
 cache_transceiver_config:
-  backend: default
+  backend: DEFAULT   # if the API enforces ALL-CAPS

components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1)

56-57: Verify uppercase requirement for cache_transceiver_config.backend “default”
We scanned all cache_transceiver_config blocks across engine configs—each currently uses lower-case default. Per past migration notes, TRTLLM’s LLM-API expects backend values to be uppercase. Please confirm whether lower-case default is accepted or switch to uppercase DEFAULT here (and in the corresponding prefill file) if required.

• components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml: lines 56–57
• components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml: lines 40–41

components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml

components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml

rmccorm4 · 2025-08-04T22:53:22Z

Going to cherry-pick #2278 back to main after merge

Add default cache transceiver config to all prefill/decode (disagg) c…

abdeb7d

…onfig files

pull-request-size bot added the size/M label Aug 4, 2025

copy-pr-bot bot temporarily deployed to GITLAB August 4, 2025 19:06 Inactive

github-actions bot added the fix label Aug 4, 2025

copy-pr-bot bot temporarily deployed to GITLAB August 4, 2025 19:07 Inactive

coderabbitai bot reviewed Aug 4, 2025

rmccorm4 mentioned this pull request Aug 4, 2025

fix: Update disagg configs for trtllm 1.0.0rc4 changes (release/0.4.0) #2278

Merged

rmccorm4 changed the title ~~fix: Add default cache transceiver config to all prefill/decode (disagg) config files~~ fix: Update disagg configs for trtllm 1.0.0rc4 changes (main) Aug 4, 2025

rmccorm4 closed this Aug 4, 2025

rmccorm4 deleted the rmccormick/cache_transceiver_config branch August 6, 2025 18:58
