
Conversation

@whoisj (Collaborator) commented Aug 21, 2025

This change:

  • Updates the Multimodal example to use the built-in dynamo.nixl_connector classes instead of the local connect/ folder.
  • Removes the deprecated connect/ folder from the multimodal example.
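
The new call shape can be sketched as follows. The `Connector` below is a stand-in class, not the real `dynamo.nixl_connect` one, so the snippet runs without the dynamo bindings installed; the no-argument constructor and the separate initialize step are taken from this PR's review summary.

```python
import asyncio


class Connector:
    """Stand-in for dynamo.nixl_connect.Connector (illustration only)."""

    def __init__(self) -> None:
        # Previously the example constructed Connector(runtime=..., namespace=...);
        # after this PR, construction takes no arguments.
        self.initialized = False

    async def initialize(self, runtime=None, namespace=None) -> None:
        # The real initialize() wires the connector to the runtime; here we
        # only record that it was called.
        self.initialized = True


connector = Connector()              # new call shape: no constructor arguments
asyncio.run(connector.initialize())  # initialization happens as a separate step
print(connector.initialized)
```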

DIS-223
Issue 2481

Fixes #2481

Summary by CodeRabbit

  • Refactor

    • Examples now use the centralized RDMA connector; initialization simplified.
    • Protocol models updated to use the new RdmaMetadata type for serialized_request.
    • Encode worker now reads request metadata from stream metadata to align with the new API.
  • Documentation

    • Removed outdated README for the legacy connect module.
  • Chores

    • Cleaned up PYTHONPATH handling in the local development container image.
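
The PYTHONPATH cleanup amounts to appending the path once rather than re-expanding `$PYTHONPATH` into itself. A minimal sketch of an idempotent append follows; the `PLANNER_SRC` path and the helper function are hypothetical, for illustration only, and are a generalization of the one-line fix in the actual Dockerfile.

```shell
# Hypothetical path used for illustration (not the actual Dockerfile value).
PLANNER_SRC=/workspace/components/planner/src

# Append a directory to PYTHONPATH only if it is not already present.
append_pythonpath() {
  case ":${PYTHONPATH:-}:" in
    *":${1}:"*) ;;  # already present: leave PYTHONPATH unchanged
    *) PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}${1}" ;;
  esac
}

append_pythonpath "$PLANNER_SRC"
append_pythonpath "$PLANNER_SRC"   # second call is a no-op, so no duplicates
export PYTHONPATH
echo "$PYTHONPATH"
```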

copy-pr-bot bot commented Aug 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi whoisj! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the feat and external-contribution (Pull request is from an external contributor) labels Aug 21, 2025
@whoisj whoisj removed the external-contribution (Pull request is from an external contributor) and size/XXL labels Aug 21, 2025
@indrajit96 (Contributor) left a comment

LGTM!

@whoisj whoisj force-pushed the whoisj/nixl_connector/migrate branch from 90b1ec1 to e36eb9f on August 21, 2025 at 21:41
whoisj added 2 commits August 21, 2025 17:43
This change updates the Multimodal example to use the built-in `dynamo.nixl_connector` classes instead of the local `connect/` folder.
This change removes the deprecated `connect/` folder from the multimodal example.

The multimodal example was migrated to the new `dynamo.nixl_connector` code in the previous commit.
@whoisj whoisj force-pushed the whoisj/nixl_connector/migrate branch from e36eb9f to 8e8726d on August 21, 2025 at 21:43
@coderabbitai (Contributor) bot commented Aug 21, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between cc48dd3 and 90b1ec1.

📒 Files selected for processing (6)
  • container/Dockerfile.vllm (1 hunks)
  • examples/multimodal/components/encode_worker.py (3 hunks)
  • examples/multimodal/components/worker.py (2 hunks)
  • examples/multimodal/connect/README.md (0 hunks)
  • examples/multimodal/connect/__init__.py (0 hunks)
  • examples/multimodal/utils/protocol.py (3 hunks)
💤 Files with no reviewable changes (2)
  • examples/multimodal/connect/README.md
  • examples/multimodal/connect/__init__.py
🧰 Additional context used
🧬 Code graph analysis (3)
examples/multimodal/utils/protocol.py (1)
lib/bindings/python/src/dynamo/nixl_connect/__init__.py (1)
  • RdmaMetadata (1417-1451)
examples/multimodal/components/worker.py (1)
lib/bindings/python/src/dynamo/nixl_connect/__init__.py (1)
  • Connector (501-720)
examples/multimodal/components/encode_worker.py (1)
lib/bindings/python/src/dynamo/nixl_connect/__init__.py (1)
  • Connector (501-720)
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2622/merge) by whoisj.
examples/multimodal/utils/protocol.py

[error] 1-1: isort hook modified imports in this file (auto-fix).

examples/multimodal/components/worker.py

[error] 1-1: isort hook modified imports in this file (auto-fix).

examples/multimodal/components/encode_worker.py

[error] 1-1: isort hook modified imports in this file (auto-fix).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo

Walkthrough

Refactors multimodal examples to use dynamo.nixl_connect instead of a local connect module, removes the local connect package and its README, updates protocol types from SerializedRequest to RdmaMetadata, adjusts encode_worker metadata source, and tweaks Dockerfile PYTHONPATH to avoid duplication.

Changes

  • Dockerfile env tweak (container/Dockerfile.vllm): Simplified the PYTHONPATH export to append planner/src once; removed the duplicate $PYTHONPATH reference.
  • Switch to dynamo.nixl_connect (examples/multimodal/components/worker.py, examples/multimodal/components/encode_worker.py): Replaced the local connect import with dynamo.nixl_connect as connect. The Connector is now constructed with Connector() (no arguments) instead of Connector(runtime=..., namespace=...).
  • Encoded request metadata source (examples/multimodal/components/encode_worker.py): request.serialized_request is now taken from readable.metadata() instead of readable.to_serialized().
  • Remove local connect package (examples/multimodal/connect/README.md, examples/multimodal/connect/__init__.py): Deleted the README and the implementation providing the local NIXL-based RDMA API (Connector, operations, descriptors, enums).
  • Protocol type updates (examples/multimodal/utils/protocol.py): Updated the import to dynamo.nixl_connect and replaced SerializedRequest with RdmaMetadata in vLLMMultimodalRequest, EncodeRequest, and EncodeResponse.
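
The protocol type swap can be sketched with a small model. The `RdmaMetadata` alias below is a dict stand-in for the real `dynamo.nixl_connect.RdmaMetadata` class (which is a proper model, not a dict), and the payload contents are illustrative, not Dynamo's actual wire format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Stand-in for dynamo.nixl_connect.RdmaMetadata; the real type lives in
# lib/bindings/python/src/dynamo/nixl_connect/__init__.py.
RdmaMetadata = Dict[str, Any]


@dataclass
class EncodeRequest:
    # Previously typed as the local connect module's SerializedRequest;
    # after this PR it carries the connector's RdmaMetadata.
    serialized_request: RdmaMetadata = field(default_factory=dict)


# On the encode path, the metadata now comes from readable.metadata()
# rather than readable.to_serialized(); here we just assign a sample value.
req = EncodeRequest(serialized_request={"agent": "encode-worker"})
print(type(req.serialized_request).__name__)
```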

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant W as Worker (encode/worker)
  participant C as dynamo.nixl_connect.Connector
  participant R as Readable Buffer

  Note over W,C: Initialization flow
  W->>C: Connector()
  W->>C: initialize(runtime, namespace)

  Note over W,R: Encode path metadata production
  W->>R: create_readable(...)
  R-->>W: buffer handle
  W->>R: metadata()
  R-->>W: RdmaMetadata
  W->>W: request.serialized_request = RdmaMetadata

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A whisk of wires, a hop of code,
I nibbled docs and lightened load.
New roads to RDMA I tread,
With fresher names to forge ahead—
PYTHONPATH trimmed, I leap with glee,
Connector’s sleek as I can be.
— a happy dev rabbit 🐇✨


@krishung5 (Contributor)

LGTM as well! Please make sure the vllm multi-GPU test passes on GitLab. (The multi-GPU test will not run on GitHub.)

@rmccorm4 (Contributor)

/ok to test 8e8726d

@rmccorm4 (Contributor) left a comment

LGTM. Just triggered GitLab tests per Kris's comment: #2622 (comment)

@rmccorm4 (Contributor)

There's so much noise in the log that it's hard to tell, but if I'm reading it right, the multimodal-related parts actually passed:

[TEST] 2025-08-21T23:52:04 INFO test_serve_deployment[multimodal_agg]: Response<Response [200]>
[TEST] 2025-08-21T23:52:04 INFO test_serve_deployment[multimodal_agg]: Received Content:  The image features a large bus parked on a street, with a "Out of Service" sign displayed on its front. The bus is positioned near a tree, and there are a few traffic lights in the vicinity. The scene appears to be set in a city, with the bus possibly waiting for its next assignment or being temporarily taken out of service.

There are a ton of (what look like red herring) failures like this; I'm assuming they are sending inference requests as a "health"/"ready" check:

2025-08-21T23:51:57.230907Z ERROR encode_worker.generate: Error processing request 9ac173db104542d4b3069c96d97c6e40: no instances found for endpoint "instances/dynamo/llm/generate"   
[BASH] 
[BASH] Traceback (most recent call last):
[BASH]   File "/workspace/examples/multimodal/components/encode_worker.py", line 159, in generate
[BASH]     response_generator = await self.pd_worker_client.round_robin(
[BASH]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[BASH] Exception: no instances found for endpoint "instances/dynamo/llm/generate"
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051] Error in chat completion stream generator.
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051] Traceback (most recent call last):
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]   File "/opt/vllm/vllm/entrypoints/openai/serving_chat.py", line 545, in chat_completion_stream_generator
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]     async for res in result_generator:
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]   File "/workspace/examples/multimodal/components/processor.py", line 185, in _generate_responses
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051]     async for resp in response_generator:
[BASH] ERROR 08-21 23:51:57 [serving_chat.py:1051] ValueError: a python exception was caught while processing the async generator: Exception: no instances found for endpoint "instances/dynamo/llm/generate"
[BASH] 2025-08-21T23:51:57.231950Z  WARN http-request: dynamo_runtime::pipeline::network::egress::addressed_router: Failed deserializing JSON to response err=missing field `id` at line 1 column 233 json_str={"data":{"data":{"error":{"code":400,"message":"a python exception was caught while processing the async generator: Exception: no instances found for endpoint \"instances/dynamo/llm/generate\"","param":null,"type":"BadRequestError"}}},"complete_final":false} method=POST uri=/v1/chat/completions version=HTTP/1.1
[BASH] 2025-08-21T23:51:57.231995Z ERROR http-request: dynamo_llm::http::service::openai: Failed to fold chat completions stream for: "missing field `id` at line 1 column 233" request_id="52c184fa-1959-46da-9697-5409123d809c" method=POST uri=/v1/chat/completions version=HTTP/1.1
[BASH] 2025-08-21T23:51:57.232005Z ERROR http-request: dynamo_llm::http::service::openai: Internal server error: Failed to fold chat completions stream: missing field `id` at line 1 column 233 method=POST uri=/v1/chat/completions version=HTTP/1.1

Not for this PR, but we need to reduce the noise from the logs to make it easier to read. Some ideas:

  1. Switch to /health endpoint over sending inference requests, and check its output to make sure all expected endpoints have spun up.
  2. I see some DeepEP/IBGDA failures - I think I saw @krishung5 add an H100 marker for these tests. Is it still running on A100s as well?
[BASH] /opt/vllm/tools/ep_kernels/ep_kernels_workspace/nvshmem_src/src/host/transport/transport.cpp:nvshmemi_transport_init:275: init failed for transport: IBGDA
[BASH] /opt/vllm/tools/ep_kernels/ep_kernels_workspace/nvshmem_src/src/modules/transport/ibgda/ibgda.cpp:nvshmemt_init:3632: neither nv_peer_mem, or nvidia_peermem detected. Skipping transport.
[BASH] 
[BASH] /opt/vllm/tools/ep_kernels/ep_kernels_workspace/nvshmem_src/src/host/transport/transport.cpp:nvshmemi_transport_init:275: init failed for transport: IBGDA
  3. I think we can demote all these types of lines to a log file or something instead of stdout:
INFO     VLLMProcess:managed_process.py:31 Terminating PID: 1027 name: python
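
The /health suggestion in point 1 can be sketched with a tiny local server. The endpoint name is taken from the comment above, but the response shape (the `status` and `endpoints` fields) and the handler are assumptions for illustration, not Dynamo's actual health API.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Minimal stand-in /health endpoint; a real deployment's payload differs."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ready", "endpoints": ["generate"]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep output quiet, in the spirit of reducing log noise


# Bind to an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Readiness probe: poll /health and inspect its output instead of sending
# an inference request, so failed probes don't produce inference-error logs.
url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url, timeout=5) as resp:
    payload = json.load(resp)

server.shutdown()
print(payload["status"])
```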

@krishung5 (Contributor)

Thanks @rmccorm4 for the summary.
I checked the test as well, and it looks like the multimodal case is indeed passing.
For 1, agree that we can switch to the /health endpoint. You're right about the error logs: inference requests are sent to check whether the server is ready, so using the health endpoint should make the logs cleaner.
For 2, there's a GitLab MR with the fix that was merged after the pipeline was triggered, so it's still running the DeepEP test. We can ignore that one for now.
For 3, agree!

@rmccorm4 rmccorm4 merged commit 0a71aea into ai-dynamo:main Aug 22, 2025
11 of 13 checks passed
@rmccorm4 rmccorm4 changed the title feat: Remove Multimodel Example Connect feat: Remove Duplicate Multimodal Nixl Connect Example Aug 22, 2025
@whoisj whoisj deleted the whoisj/nixl_connector/migrate branch August 25, 2025 16:25
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
jasonqinzhou pushed a commit that referenced this pull request Aug 30, 2025
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
Signed-off-by: nnshah1 <neelays@nvidia.com>
Successfully merging this pull request may close these issues.

[BUG]: Remove duplicate nixl_connect implementations

4 participants