docs: mermaid diagrams showcasing various KV router features #3184

PeaBrane · 2025-09-23T19:51:56Z

Overview:

Added mermaid diagrams to kv_cache_routing.md to showcase persistent KV events, radix snapshotting, and router replica syncing.

Summary by CodeRabbit

Documentation
- Rewrote the architecture overview to introduce two pillars: global persistent cache state and local active block management.
- Added diagrams explaining event flow and replica synchronization.
- Expanded guidance on routing decisions, differentiating persistent prefix blocks from ephemeral active blocks.
- Clarified options for enabling/disabling replica sync and described persistence and recovery behaviors.
- Removed outdated ASCII diagram and consolidated explanations for easier navigation.
- Improved narrative on how global and per-replica states interact to influence routing.

Signed-off-by: PeaBrane <yanrpei@gmail.com>

coderabbitai · 2025-09-23T19:57:12Z

Walkthrough

Rewrites the KV cache routing architecture doc to describe two layers: global persistent KV state via NATS JetStream and local per-router active block management with replica sync. Adds multiple Mermaid diagrams, clarifies flows, separates persistent prefix blocks from ephemeral active blocks, and removes an old ASCII diagram.
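To make the interaction between the two layers concrete, here is a minimal sketch of a routing decision that combines persistent prefix overlap with ephemeral active load. The cost formula, the function name `pick_worker`, and the worker names are all illustrative assumptions; this thread does not state the router's actual scoring.

```python
def pick_worker(request_blocks, overlap, active):
    """Return the worker with the lowest toy cost (illustrative, not Dynamo's formula).

    request_blocks: number of KV blocks the new request needs
    overlap: worker -> request blocks already cached there (persistent prefix blocks)
    active:  worker -> blocks held by in-flight requests (ephemeral active blocks)
    """
    def cost(worker):
        # Blocks that still have to be computed on this worker.
        new_blocks = request_blocks - overlap.get(worker, 0)
        # Assumed cost: new work plus the load the worker already carries.
        return new_blocks + active.get(worker, 0)

    # sorted() makes tie-breaking deterministic.
    return min(sorted(active), key=cost)


overlap = {"worker-0": 8, "worker-1": 0}
busy = pick_worker(10, overlap, {"worker-0": 20, "worker-1": 4})
idle = pick_worker(10, overlap, {"worker-0": 6, "worker-1": 4})
```

In the first call the cached prefix on `worker-0` does not outweigh its active load, so the request goes elsewhere; in the second call the prefix hit wins.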

Changes

| Cohort / File(s) | Summary of Changes |
|---|---|
| Docs: KV cache routing architecture `docs/architecture/kv_cache_routing.md` | Replaced Architecture section with Overview; added detailed descriptions of global KV state (JetStream, Object Store, durable consumers) and local slot management with replica sync; added Mermaid diagrams; removed legacy ASCII diagram; clarified persistent vs. ephemeral blocks and replica sync/persistence behavior. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Engine
  participant NATS as NATS JetStream
  participant Obj as NATS Object Store
  participant RouterA as Router Replica A
  participant RouterB as Router Replica B

  rect rgb(235, 245, 255)
    note over Engine,NATS: Global KV event publishing and consumption
    Engine->>NATS: Publish KV block events
    NATS-->>RouterA: Deliver to durable consumer
    NATS-->>RouterB: Deliver to durable consumer
    RouterA->>Obj: Periodic snapshot (prefix blocks)
    RouterB->>Obj: Periodic snapshot (prefix blocks)
  end
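The event flow in the diagram above can be modeled in a few lines. This is a toy in-memory sketch, not a NATS client: the `PrefixIndex` class, the `stored`/`removed` event kinds, and the sequence-number replay rule are assumptions made for illustration, and a plain list stands in for the JetStream stream.

```python
from dataclasses import dataclass, field


@dataclass
class PrefixIndex:
    """Toy stand-in for a router's view of persistent prefix blocks (hypothetical)."""
    # worker id -> set of block hashes that worker currently caches
    blocks: dict = field(default_factory=dict)
    last_seq: int = 0  # sequence number of the last applied event

    def apply(self, seq, worker, kind, block_hash):
        """Apply one KV block event; events at or below last_seq are skipped."""
        if seq <= self.last_seq:
            return
        cached = self.blocks.setdefault(worker, set())
        if kind == "stored":
            cached.add(block_hash)
        elif kind == "removed":
            cached.discard(block_hash)
        self.last_seq = seq

    def snapshot(self):
        """Serializable copy, standing in for a write to the object store."""
        return {"last_seq": self.last_seq,
                "blocks": {w: sorted(b) for w, b in self.blocks.items()}}

    @classmethod
    def restore(cls, snap):
        """Rebuild an index from a snapshot produced by snapshot()."""
        return cls(blocks={w: set(b) for w, b in snap["blocks"].items()},
                   last_seq=snap["last_seq"])


# An ordered event log standing in for the persistent stream.
stream = [
    (1, "worker-0", "stored", 0xA1),
    (2, "worker-0", "stored", 0xA2),
    (3, "worker-1", "stored", 0xA1),
    (4, "worker-0", "removed", 0xA2),
]

router_a = PrefixIndex()
for event in stream[:2]:
    router_a.apply(*event)
snap = router_a.snapshot()  # periodic snapshot taken after two events

router_b = PrefixIndex.restore(snap)  # a new replica starts from the snapshot
for event in stream:  # replaying the whole log is safe: old events are skipped
    router_b.apply(*event)
```

Under these assumptions a restarted or newly added replica converges to the same prefix state as one that consumed every event live.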

sequenceDiagram
  autonumber
  participant Client
  participant RouterA as Router Replica A
  participant RouterB as Router Replica B
  participant Core as NATS Core Messaging

  rect rgb(240, 255, 240)
    note over Client,RouterA: Local active block management timeline
    Client->>RouterA: Request received
    RouterA->>RouterA: Predict active blocks (t0)
    RouterA-->>Core: Broadcast active block update
    RouterB-->>Core: Broadcast own active blocks
    Core-->>RouterA: Replica updates (sync)
    RouterA->>RouterA: Adjust on first token (t1)
    RouterA->>RouterA: Finalize on completion (t2)
  end
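The t0/t1/t2 timeline above maps naturally onto three methods. The sketch below is a hypothetical `SlotManager`, not the router's actual type; in particular, "adjust" simply overwrites the earlier estimate, since this thread does not say what the real router changes when the first token arrives.

```python
class SlotManager:
    """Hypothetical per-router tracker of ephemeral active blocks."""

    def __init__(self):
        # request id -> (worker id, estimated active block count)
        self.requests = {}

    def predict(self, request_id, worker, blocks):
        """t0: record the predicted block usage when the request is routed."""
        self.requests[request_id] = (worker, blocks)

    def adjust(self, request_id, blocks):
        """t1: replace the prediction once the first token arrives."""
        worker, _ = self.requests[request_id]
        self.requests[request_id] = (worker, blocks)

    def finalize(self, request_id):
        """t2: release the request's blocks on completion."""
        self.requests.pop(request_id, None)

    def active_blocks(self, worker):
        """Total active blocks currently attributed to a worker."""
        return sum(n for w, n in self.requests.values() if w == worker)


slots = SlotManager()
slots.predict("req-1", "worker-0", 8)
at_t0 = slots.active_blocks("worker-0")
slots.adjust("req-1", 5)
at_t1 = slots.active_blocks("worker-0")
slots.finalize("req-1")
at_t2 = slots.active_blocks("worker-0")
```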

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

I thump my paws on clustered ground,
Two layers hum—what tidy sound!
JetStream clouds keep blocks in line,
While local slots sync just in time.
Snapshots, tokens—hop, don’t lag—
A rabbit routes without a snag. 🐇✨

Pre-merge checks

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description Check | ⚠️ Warning | The PR description contains only a brief Overview sentence and omits the required template sections 'Details', 'Where should the reviewer start?', and 'Related Issues', so it does not follow the repository's PR template and lacks actionable reviewer guidance. | Please expand the PR description to follow the template by adding a 'Details' section summarizing the specific edits (e.g., files changed and what each diagram illustrates), a 'Where should the reviewer start?' section that points to docs/architecture/kv_cache_routing.md, and a 'Related Issues' entry or explicit "none"; include any verification steps or screenshots if applicable. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The PR title succinctly and accurately describes the primary change: adding Mermaid diagrams that illustrate KV router features; it directly maps to modifications in docs/architecture/kv_cache_routing.md and is specific and readable for history scanning. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

docs/architecture/kv_cache_routing.md (3)
62-64: Naming consistency: KVIndexer → KvIndexer (matches code/docs elsewhere).

Use “KvIndexer” casing in the diagram labels to align with references in flags and code.

Apply this diff inside the mermaid block:
-        R1[Router 1<br/>KVIndexer]
-        R2[Router 2<br/>KVIndexer]
+        R1[Router 1<br/>KvIndexer]
+        R2[Router 2<br/>KvIndexer]
90-90: Capitalize NATS Core (proper noun).

Minor wording tweak for product name clarity.
-This is managed locally in each router via a "slot manager". To maintain consistency across the system, router replicas synchronize these local predictions with each other through NATS core messaging.
+This is managed locally in each router via a "slot manager". To maintain consistency across the system, router replicas synchronize these local predictions with each other through NATS Core messaging.
92-129: Clarify broadcast semantics in replica sync sequence.

The arrows suggest point-to-point; in practice routers publish to a shared subject and all replicas receive. Add a brief note to prevent misinterpretation.

Apply this small tweak in the sequence diagram:
-    Note over R1,R2: Router Replica Sync Enabled
+    Note over R1,R2: Router Replica Sync Enabled (pub-sub on shared subject; all replicas receive)
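The difference between point-to-point arrows and pub-sub on a shared subject can be shown with a small in-memory model. This is a sketch only: `Bus` is a stand-in written for this example rather than a NATS client, and the subject name, message shape, and the choice to ignore one's own broadcasts are assumptions.

```python
from collections import defaultdict


class Bus:
    """In-memory stand-in for a pub-sub subject space (not a NATS client)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, subject, handler):
        self._handlers[subject].append(handler)

    def publish(self, subject, message):
        # Fan out: every handler subscribed to the subject gets the message.
        for handler in list(self._handlers[subject]):
            handler(message)


class RouterReplica:
    SUBJECT = "router.active_blocks"  # hypothetical subject name

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.peer_blocks = {}  # peer name -> last reported active block count
        bus.subscribe(self.SUBJECT, self.on_update)

    def broadcast(self, active_blocks):
        self.bus.publish(self.SUBJECT,
                         {"sender": self.name, "active_blocks": active_blocks})

    def on_update(self, message):
        if message["sender"] == self.name:
            return  # ignore our own broadcast
        self.peer_blocks[message["sender"]] = message["active_blocks"]


bus = Bus()
r1, r2, r3 = (RouterReplica(n, bus) for n in ("R1", "R2", "R3"))
r1.broadcast(12)
r2.broadcast(7)
```

A single `broadcast` call reaches every other replica, which is the behavior the suggested note is meant to convey.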

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13a5d61 and d7f2174.

📒 Files selected for processing (1)

docs/architecture/kv_cache_routing.md (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#2756
File: lib/llm/src/kv_router/subscriber.rs:36-44
Timestamp: 2025-08-29T10:03:48.330Z
Learning: PeaBrane prefers to keep PRs contained in scope and is willing to defer technical improvements to future PRs when the current implementation works for the immediate use case. They acknowledge technical debt but prioritize deliverability over completeness in individual PRs.

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#3095
File: lib/llm/src/kv_router/subscriber.rs:200-223
Timestamp: 2025-09-17T20:55:41.392Z
Learning: In the dynamo codebase, PeaBrane prefers to maintain consistency with existing etcd key parsing patterns (like splitting on '/' and parsing the last segment) rather than introducing more robust parsing approaches, even when the current approach might be brittle, to keep the codebase aligned and avoid divergent patterns.

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#3095
File: lib/llm/src/kv_router/indexer.rs:0-0
Timestamp: 2025-09-17T20:55:06.313Z
Learning: When PeaBrane encounters a complex implementation issue that would significantly expand PR scope (like the remove_worker_sender method in lib/llm/src/kv_router/indexer.rs that required thread-safe map updates and proper shard targeting), they prefer to remove the problematic implementation entirely rather than rush a partial fix, deferring the proper solution to a future PR.
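For readers unfamiliar with the key parsing pattern mentioned in the second learning, a minimal sketch of "split on '/' and parse the last segment" follows. The key layout and the assumption that the last segment is a decimal integer are hypothetical; the actual code lives in Rust in `lib/llm/src/kv_router/subscriber.rs` and is not reproduced here.

```python
def parse_last_segment(key: str) -> int:
    """Split an etcd-style key on '/' and parse the final segment as an integer."""
    return int(key.rsplit("/", 1)[-1])


worker_id = parse_last_segment("kv_routers/example-model/42")  # hypothetical key

# The brittleness the learning refers to: a trailing slash leaves an empty
# last segment, so parsing fails instead of falling back to something sensible.
try:
    parse_last_segment("kv_routers/example-model/42/")
    trailing_slash_ok = True
except ValueError:
    trailing_slash_ok = False
```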

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Build and Test - trtllm
GitHub Check: Build and Test - vllm
GitHub Check: Build and Test - sglang
GitHub Check: Build and Test - dynamo

🔇 Additional comments (1)

docs/architecture/kv_cache_routing.md (1)

37-44: Nice addition: clear two-layer overview and context.

The new Overview and framing are clear and helpful. Please confirm the docs toolchain renders Mermaid fences (```mermaid) in this path so diagrams show up on the site.

docs/architecture/kv_cache_routing.md

Signed-off-by: PeaBrane <yanrpei@gmail.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

Signed-off-by: PeaBrane <yanrpei@gmail.com> Signed-off-by: Kyle H <kylhuang@nvidia.com>

first commit

d7f2174

Signed-off-by: PeaBrane <yanrpei@gmail.com>

pull-request-size bot added the size/L label Sep 23, 2025

PeaBrane requested a review from alec-flowers September 23, 2025 19:52

github-actions bot added the docs label Sep 23, 2025

PeaBrane requested a review from rmccorm4 September 23, 2025 19:52

coderabbitai bot reviewed Sep 23, 2025


tedzhouhk approved these changes Sep 24, 2025


PeaBrane merged commit d54f6fe into main Sep 24, 2025
16 checks passed

PeaBrane deleted the rupei/router-mermaids branch September 24, 2025 18:55

jasonqinzhou pushed a commit that referenced this pull request Sep 24, 2025

docs: mermaid diagrams showcasing various KV router features (#3184)

f091262

Signed-off-by: PeaBrane <yanrpei@gmail.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

jasonqinzhou pushed a commit that referenced this pull request Sep 24, 2025

docs: mermaid diagrams showcasing various KV router features (#3184)

f2d0132

Signed-off-by: PeaBrane <yanrpei@gmail.com> Signed-off-by: Jason Zhou <jasonzho@nvidia.com>

kylehh pushed a commit that referenced this pull request Sep 25, 2025

docs: mermaid diagrams showcasing various KV router features (#3184)

9143c74

Signed-off-by: PeaBrane <yanrpei@gmail.com> Signed-off-by: Kyle H <kylhuang@nvidia.com>
