@rmccorm4 rmccorm4 commented Aug 4, 2025

Overview:

  • As of a recent TRTLLM commit for v1.0.0, cache_transceiver_config is now a required field in the YAML config to enable disaggregation.
  • This PR adds this config to all disagg related config files in the trtllm config examples.
  • Fixes nvbugs/5430986, particularly this error when running disagg examples:
AssertionError: kv_cache_transceiver is disabled, please set 'cache_transceiver_config: backend:` in config file for disaggregated serving
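
For reference, a minimal sketch of the block this PR appends to each disaggregated prefill/decode config. The field name and value are the ones quoted in the review comments in this thread; the comments are illustrative only.

```yaml
# Appended at the end of each disagg prefill/decode engine config.
# Per the overview above, disaggregated serving asserts at startup without it.
cache_transceiver_config:
  backend: default
```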

Summary by CodeRabbit

  • New Features

    • Added a new configuration section to multiple engine configuration files, enabling a default backend for cache transceiver functionality.
  • Style

    • Added a newline at the end of one configuration file for improved formatting.

coderabbitai bot commented Aug 4, 2025

Walkthrough

This update adds a new configuration section, cache_transceiver_config with backend: default, to multiple YAML engine configuration files across various model backends. Additionally, a newline is appended to one file. No existing keys, values, or logic are modified, and no code entities are changed.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Add cache_transceiver_config to DeepSeek MTP configs: components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml, components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml | Appended cache_transceiver_config section with backend: default at the end of each YAML file. |
| Add cache_transceiver_config to DeepSeek Simple configs: components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml, components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml | Appended cache_transceiver_config section with backend: default at the end of each YAML file. |
| Add cache_transceiver_config to DeepSeek Wide EP configs: components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml, components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml | Appended cache_transceiver_config section with backend: default at the end of each YAML file. |
| Add cache_transceiver_config to Llama4 Eagle configs: components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml, components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml | Appended cache_transceiver_config section with backend: default at the end of each YAML file. |
| Add cache_transceiver_config to Llama4 Eagle One Model configs: components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml, components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml | Appended cache_transceiver_config section with backend: default at the end of each YAML file. |
| Newline addition: components/backends/trtllm/engine_configs/decode.yaml | Added a newline at the end of the file; no configuration changes. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant EngineConfig as EngineConfig (YAML)
    participant Backend

    User->>EngineConfig: Loads configuration
    EngineConfig->>Backend: Reads cache_transceiver_config (backend: default)
    Backend-->>EngineConfig: Applies default backend for cache transceiver

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

In YAML fields both wide and deep,
A cache config wakes from its sleep.
"Backend: default" in every file—
A simple change, a tidy style.
With every hop, configs align,
The rabbits cheer, "All works fine!" 🐇

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml (1)

65-66: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml (1)

36-37: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1)

38-39: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml (1)

38-39: Same casing concern as noted in mtp_prefill.yaml – please verify the backend value.

🧹 Nitpick comments (2)
components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1)

41-45: Consider toggling print_iter_log for production runs

print_iter_log: true is handy for debugging but can flood logs in production. If you keep it enabled here, ensure downstream deployment templates override it when running at scale.

components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1)

54-55: Consider disabling print_iter_log in production configs
Enabling per-iteration logging is great for debugging, but it can explode log volume and hurt performance at high token throughput. Unless this file is meant solely for debugging, flip this flag to false or document why verbose logging is required.
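
A sketch of the suggested change, assuming print_iter_log is a top-level key in these engine configs (its nesting is not shown in this thread):

```yaml
# Assumption: print_iter_log sits at the top level of the engine config.
# Re-enable it only when debugging per-iteration behavior.
print_iter_log: false
```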

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fed066 and abdeb7d.

📒 Files selected for processing (11)
  • components/backends/trtllm/engine_configs/decode.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml (1 hunks)
  • components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: trtllm llm-api expects all caps for backend field names in configuration files. when migrating trtll...
Learnt from: KrishnanPrash
PR: ai-dynamo/dynamo#2217
File: components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml:18-0
Timestamp: 2025-07-31T11:26:48.422Z
Learning: TRTLLM LLM-API expects all caps for backend field names in configuration files. When migrating TRTLLM configurations, backend values like "WideEP" should be changed to "WIDEEP" to comply with the API requirements.

Applied to files:

  • components/backends/trtllm/engine_configs/decode.yaml
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_decode.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml
  • components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_decode.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml
  • components/backends/trtllm/engine_configs/llama4/eagle/eagle_prefill.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml
  • components/backends/trtllm/engine_configs/llama4/eagle_one_model/eagle_prefill.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml
📚 Learning: in examples/sglang/slurm_jobs/scripts/worker_setup.py, logging the full configuration file content i...
Learnt from: fsaady
PR: ai-dynamo/dynamo#1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:113-116
Timestamp: 2025-07-03T09:44:41.470Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, logging the full configuration file content is acceptable because the config file is public, contains only placeholder replacements (no sensitive data), and provides debugging benefits for users who may want to create configurations based on the logged output.

Applied to files:

  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml
  • components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: pre-merge-rust (.)
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (lib/bindings/python)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (6)
components/backends/trtllm/engine_configs/deepseek_r1/simple/decode.yaml (1)

58-60: Verify case-sensitivity of the backend value

Previous learnings note that TRTLLM’s config loader expects backend identifiers in ALL-CAPS (e.g., “WIDEEP”). The new field uses backend: default (lower-case). Please double-check whether default is accepted as-is or needs to be DEFAULT to avoid runtime assertion errors.

components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml (1)

40-41: Lower-case default is correct
A scan of existing TRT-LLM configs shows backend: default (all lowercase) used consistently for cache-transceiver settings (e.g. in decode.yaml, prefill.yaml under both top-level and deepseek_r1/mtp). No change required.

components/backends/trtllm/engine_configs/deepseek_r1/simple/prefill.yaml (1)

36-36: print_iter_log addition looks good

Helps with debugging and is consistent with other example configs.

components/backends/trtllm/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml (1)

41-45: Verify that backend: default matches TRT-LLM’s accepted enum values

Nice catch adding the mandatory cache_transceiver_config block. One small concern: previous migrations taught us that TRT-LLM sometimes expects backend values to be upper-cased (see the WIDEEP requirement in moe_config). If the transceiver back-end enumeration follows the same rule, "DEFAULT" could be required instead of "default".

Please double-check the TRT-LLM 1.0.0 docs or run a quick smoke test; if the loader is case-sensitive the engine will fail at start-up.
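
To make the comparison concrete, here is a sketch of the two backend fields side by side. The moe_config nesting is an assumption based on the WIDEEP note above, and whether the transceiver field is case-sensitive is not established in this thread.

```yaml
# Assumption: this is the field the earlier all-caps learning refers to.
moe_config:
  backend: WIDEEP

# Value as submitted in this PR; the loader's case handling here is unverified.
cache_transceiver_config:
  backend: default
```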

components/backends/trtllm/engine_configs/decode.yaml (1)

30-31: Verify required casing for backend value
Prior migration notes (see retrieved learnings) indicate that TRTLLM’s API expects backend identifiers in all caps (e.g., WIDEEP). Please double-check whether the transceiver accepts the lowercase string default or if it must be DEFAULT to avoid runtime assertion errors.

 cache_transceiver_config:
-  backend: default
+  backend: DEFAULT   # if the API enforces ALL-CAPS
components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml (1)

56-57: Verify uppercase requirement for cache_transceiver_config.backend “default”
We scanned all cache_transceiver_config blocks across engine configs—each currently uses lower-case default. Per past migration notes, TRTLLM’s LLM-API expects backend values to be uppercase. Please confirm whether lower-case default is accepted or switch to uppercase DEFAULT here (and in the corresponding prefill file) if required.

• components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_decode.yaml: lines 56–57
• components/backends/trtllm/engine_configs/deepseek_r1/mtp/mtp_prefill.yaml: lines 40–41

@rmccorm4 rmccorm4 changed the title from "fix: Add default cache transceiver config to all prefill/decode (disagg) config files" to "fix: Update disagg configs for trtllm 1.0.0rc4 changes (main)" Aug 4, 2025

rmccorm4 commented Aug 4, 2025

Going to cherry pick #2278 back to main after merge

@rmccorm4 rmccorm4 closed this Aug 4, 2025
@rmccorm4 rmccorm4 deleted the rmccormick/cache_transceiver_config branch August 6, 2025 18:58