
Conversation

@whoisj (Collaborator) commented Aug 21, 2025

This change:

  • Updates the Multimodal example to use the built-in dynamo.nixl_connector classes instead of the local connect/ folder.
  • Removes the deprecated connect/ folder from the multimodal example.
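
The new call shape can be sketched as follows. The `Connector` below is a stand-in class, not the real `dynamo.nixl_connect` one, so the snippet runs without the dynamo bindings installed; the no-argument constructor and the separate initialize step are taken from this PR's review summary.

```python
import asyncio


class Connector:
    """Stand-in for dynamo.nixl_connect.Connector (illustration only)."""

    def __init__(self) -> None:
        # Previously the example constructed Connector(runtime=..., namespace=...);
        # after this PR, construction takes no arguments.
        self.initialized = False

    async def initialize(self, runtime=None, namespace=None) -> None:
        # The real initialize() wires the connector to the runtime; here we
        # only record that it was called.
        self.initialized = True


connector = Connector()              # new call shape: no constructor arguments
asyncio.run(connector.initialize())  # initialization happens as a separate step
print(connector.initialized)
```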

DIS-223
Issue 2481

Fixes #2481

Summary by CodeRabbit

  • Refactor

    • Examples now use the centralized RDMA connector; initialization simplified.
    • Protocol models updated to use the new RdmaMetadata type for serialized_request.
    • Encode worker now reads request metadata from stream metadata to align with the new API.
  • Documentation

    • Removed outdated README for the legacy connect module.
  • Chores

    • Cleaned up PYTHONPATH handling in the local development container image.
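
The PYTHONPATH cleanup amounts to appending the path once rather than re-expanding `$PYTHONPATH` into itself. A minimal sketch of an idempotent append follows; the `PLANNER_SRC` path and the helper function are hypothetical, for illustration only, and are a generalization of the one-line fix in the actual Dockerfile.

```shell
# Hypothetical path used for illustration (not the actual Dockerfile value).
PLANNER_SRC=/workspace/components/planner/src

# Append a directory to PYTHONPATH only if it is not already present.
append_pythonpath() {
  case ":${PYTHONPATH:-}:" in
    *":${1}:"*) ;;  # already present: leave PYTHONPATH unchanged
    *) PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}${1}" ;;
  esac
}

append_pythonpath "$PLANNER_SRC"
append_pythonpath "$PLANNER_SRC"   # second call is a no-op, so no duplicates
export PYTHONPATH
echo "$PYTHONPATH"
```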

copy-pr-bot bot commented Aug 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi whoisj! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the feat and external-contribution (Pull request is from an external contributor) labels Aug 21, 2025
@whoisj whoisj removed the external-contribution (Pull request is from an external contributor) and size/XXL labels Aug 21, 2025
@indrajit96 (Contributor) left a comment

LGTM!

@whoisj whoisj force-pushed the whoisj/nixl_connector/migrate branch from 90b1ec1 to e36eb9f on August 21, 2025 at 21:41
whoisj added 2 commits August 21, 2025 17:43
This change updates the Multimodal example to use the built-in `dynamo.nixl_connector` classes instead of the local `connect/` folder.
This change removes the deprecated `connect/` folder from the multimodal example.

The multimodal example was migrated to the new `dynamo.nixl_connector` code in the previous commit.
@whoisj whoisj force-pushed the whoisj/nixl_connector/migrate branch from e36eb9f to 8e8726d on August 21, 2025 at 21:43
@coderabbitai (Contributor) bot commented Aug 21, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between cc48dd3 and 90b1ec1.

📒 Files selected for processing (6)
  • container/Dockerfile.vllm (1 hunks)
  • examples/multimodal/components/encode_worker.py (3 hunks)
  • examples/multimodal/components/worker.py (2 hunks)
  • examples/multimodal/connect/README.md (0 hunks)
  • examples/multimodal/connect/__init__.py (0 hunks)
  • examples/multimodal/utils/protocol.py (3 hunks)
💤 Files with no reviewable changes (2)
  • examples/multimodal/connect/README.md
  • examples/multimodal/connect/__init__.py
🧰 Additional context used
🧬 Code graph analysis (3)
examples/multimodal/utils/protocol.py (1)
lib/bindings/python/src/dynamo/nixl_connect/__init__.py (1)
  • RdmaMetadata (1417-1451)
examples/multimodal/components/worker.py (1)
lib/bindings/python/src/dynamo/nixl_connect/__init__.py (1)
  • Connector (501-720)
examples/multimodal/components/encode_worker.py (1)
lib/bindings/python/src/dynamo/nixl_connect/__init__.py (1)
  • Connector (501-720)
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2622/merge) by whoisj.
examples/multimodal/utils/protocol.py

[error] 1-1: isort hook modified imports in this file (auto-fix).

examples/multimodal/components/worker.py

[error] 1-1: isort hook modified imports in this file (auto-fix).

examples/multimodal/components/encode_worker.py

[error] 1-1: isort hook modified imports in this file (auto-fix).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo

Walkthrough

Refactors multimodal examples to use dynamo.nixl_connect instead of a local connect module, removes the local connect package and its README, updates protocol types from SerializedRequest to RdmaMetadata, adjusts encode_worker metadata source, and tweaks Dockerfile PYTHONPATH to avoid duplication.

Changes

  • Dockerfile env tweak (container/Dockerfile.vllm): Simplified the PYTHONPATH export to append planner/src once; removed the duplicate $PYTHONPATH reference.
  • Switch to dynamo.nixl_connect (examples/multimodal/components/worker.py, examples/multimodal/components/encode_worker.py): Replaced the local connect import with dynamo.nixl_connect as connect. The Connector is now constructed with Connector() (no arguments) instead of Connector(runtime=..., namespace=...).
  • Encoded request metadata source (examples/multimodal/components/encode_worker.py): request.serialized_request is now taken from readable.metadata() instead of readable.to_serialized().
  • Remove local connect package (examples/multimodal/connect/README.md, examples/multimodal/connect/__init__.py): Deleted the README and the implementation providing the local NIXL-based RDMA API (Connector, operations, descriptors, enums).
  • Protocol type updates (examples/multimodal/utils/protocol.py): Updated the import to dynamo.nixl_connect and replaced SerializedRequest with RdmaMetadata in vLLMMultimodalRequest, EncodeRequest, and EncodeResponse.
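
The protocol type swap can be sketched with a small model. The `RdmaMetadata` alias below is a dict stand-in for the real `dynamo.nixl_connect.RdmaMetadata` class (which is a proper model, not a dict), and the payload contents are illustrative, not Dynamo's actual wire format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Stand-in for dynamo.nixl_connect.RdmaMetadata; the real type lives in
# lib/bindings/python/src/dynamo/nixl_connect/__init__.py.
RdmaMetadata = Dict[str, Any]


@dataclass
class EncodeRequest:
    # Previously typed as the local connect module's SerializedRequest;
    # after this PR it carries the connector's RdmaMetadata.
    serialized_request: RdmaMetadata = field(default_factory=dict)


# On the encode path, the metadata now comes from readable.metadata()
# rather than readable.to_serialized(); here we just assign a sample value.
req = EncodeRequest(serialized_request={"agent": "encode-worker"})
print(type(req.serialized_request).__name__)
```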

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant W as Worker (encode/worker)
  participant C as dynamo.nixl_connect.Connector
  participant R as Readable Buffer

  Note over W,C: Initialization flow
  W->>C: Connector()
  W->>C: initialize(runtime, namespace)

  Note over W,R: Encode path metadata production
  W->>R: create_readable(...)
  R-->>W: buffer handle
  W->>R: metadata()
  R-->>W: RdmaMetadata
  W->>W: request.serialized_request = RdmaMetadata

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A whisk of wires, a hop of code,
I nibbled docs and lightened load.
New roads to RDMA I tread,
With fresher names to forge ahead—
PYTHONPATH trimmed, I leap with glee,
Connector’s sleek as I can be.
— a happy dev rabbit 🐇✨


@krishung5 (Contributor)

LGTM as well! Please make sure the vllm multi-GPU test passes on GitLab. (The multi-GPU test will not run on GitHub.)

@rmccorm4 (Contributor)

/ok to test 8e8726d

@rmccorm4 (Contributor) left a comment

LGTM. Just triggered GitLab tests per Kris's comment: #2622 (comment)

@rmccorm4 (Contributor)

There's so much noise in the log that it's hard to tell, but if I'm reading it right, the multimodal-related parts actually passed:

[TEST] 2025-08-21T23:52:04 INFO test_serve_deployment[multimodal_agg]: Response<Response [200]>
[TEST] 2025-08-21T23:52:04 INFO test_serve_deployment[multimodal_agg]: Received Content:  The image features a large bus parked on a street, with a "Out of Service" sign displayed on its front. The bus is positioned near a tree, and there are a few traffic lights in the vicinity. The scene appears to be set in a city, with the bus possibly waiting for its next assignment or being temporarily taken out of service.

There are a ton of (what look like red herring) failures like this; I'm assuming they are sending inference requests as a "health"/"ready" check:

2025-08-21T23:51:57.230907Z ERROR encode_worker.generate: Error processing request 9ac173db104542d4b3069c96d97c6e40: no instances found for endpoint "instances/dynamo/llm/generate"   
[BASH] 
[BASH] Traceback (most recent call last):
[BASH]   File "/workspace/examples/multimodal/components/encode_worker.py", line 159, in generate
[BASH]     response_generator = await self.pd_worker_client.round_robin(
[BASH]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[BASH] Exception: no instances found for endpoint "instances/dynamo/llm/generate"
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051] Error in chat completion stream generator.
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051] Traceback (most recent call last):
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]   File "/opt/vllm/vllm/entrypoints/openai/serving_chat.py", line 545, in chat_completion_stream_generator
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]     async for res in result_generator:
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]   File "/workspace/examples/multimodal/components/processor.py", line 185, in _generate_responses
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]     async for resp in response_generator:
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051] ValueError: a python exception was caught while processing the async generator: Exception: no instances found for endpoint "instances/dynamo/llm/generate"
[BASH] 2025-08-21T23:51:57.231950Z  WARN http-request: dynamo_runtime::pipeline::network::egress::addressed_router: Failed deserializing JSON to response err=missing field `id` at line 1 column 233 json_str={"data":{"data":{"error":{"code":400,"message":"a python exception was caught while processing the async generator: Exception: no instances found for endpoint \"instances/dynamo/llm/generate\"","param":null,"type":"BadRequestError"}}},"complete_final":false} method=POST uri=/v1/chat/completions version=HTTP/1.1
[BASH] 2025-08-21T23:51:57.231995Z ERROR http-request: dynamo_llm::http::service::openai: Failed to fold chat completions stream for: "missing field `id` at line 1 column 233" request_id="52c184fa-1959-46da-9697-5409123d809c" method=POST uri=/v1/chat/completions version=HTTP/1.1
[BASH] 2025-08-21T23:51:57.232005Z ERROR http-request: dynamo_llm::http::service::openai: Internal server error: Failed to fold chat completions stream: missing field `id` at line 1 column 233 method=POST uri=/v1/chat/completions version=HTTP/1.1

Not for this PR, but we need to reduce the noise from the logs to make it easier to read. Some ideas:

  1. Switch to /health endpoint over sending inference requests, and check its output to make sure all expected endpoints have spun up.
  2. I see some DeepEP/IBGDA failures - I think I saw @krishung5 add an H100 marker for these tests. Is it still running on A100s as well?
[BASH] /opt/vllm/tools/ep_kernels/ep_kernels_workspace/nvshmem_src/src/host/transport/transport.cpp:nvshmemi_transport_init:275: init failed for transport: IBGDA
[BASH] /opt/vllm/tools/ep_kernels/ep_kernels_workspace/nvshmem_src/src/modules/transport/ibgda/ibgda.cpp:nvshmemt_init:3632: neither nv_peer_mem, or nvidia_peermem detected. Skipping transport.
[BASH] 
[BASH] /opt/vllm/tools/ep_kernels/ep_kernels_workspace/nvshmem_src/src/host/transport/transport.cpp:nvshmemi_transport_init:275: init failed for transport: IBGDA
  3. I think we can demote all these types of lines to a log file or something instead of stdout:
INFO     VLLMProcess:managed_process.py:31 Terminating PID: 1027 name: python
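
The /health suggestion in point 1 can be sketched with a tiny local server. The endpoint name is taken from the comment above, but the response shape (the `status` and `endpoints` fields) and the handler are assumptions for illustration, not Dynamo's actual health API.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Minimal stand-in /health endpoint; a real deployment's payload differs."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ready", "endpoints": ["generate"]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep output quiet, in the spirit of reducing log noise


# Bind to an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Readiness probe: poll /health and inspect its output instead of sending
# an inference request, so failed probes don't produce inference-error logs.
url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url, timeout=5) as resp:
    payload = json.load(resp)

server.shutdown()
print(payload["status"])
```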

@krishung5 (Contributor)

Thanks @rmccorm4 for the summary.
I checked the test as well, and it looks like the multimodal case is indeed passing.
For 1, agree that we can switch to the /health endpoint. You're right about the error logs: inference requests are sent to check whether the server is ready, so using the health endpoint should make the logs cleaner.
For 2, there's a GitLab MR with the fix that was merged after the pipeline was triggered, so it's still running the DeepEP test. We can ignore that one for now.
For 3, agree!

@rmccorm4 rmccorm4 merged commit 0a71aea into ai-dynamo:main Aug 22, 2025
11 of 13 checks passed
@rmccorm4 rmccorm4 changed the title feat: Remove Multimodel Example Connect feat: Remove Duplicate Multimodal Nixl Connect Example Aug 22, 2025
@whoisj whoisj deleted the whoisj/nixl_connector/migrate branch August 25, 2025 16:25
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
jasonqinzhou pushed a commit that referenced this pull request Aug 30, 2025
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
Signed-off-by: nnshah1 <neelays@nvidia.com>
Successfully merging this pull request may close these issues.

[BUG]: Remove duplicate nixl_connect implementations

4 participants