@DarkLight1337 (Member) commented Oct 14, 2025

Purpose

Standardize how multimodal embeddings from different modalities are merged in get_multimodal_embeddings:

  • Convert to tuple before assigning to the output to handle the case when the embeddings are tensors
  • Rename vision_embeddings to image_embeddings to avoid confusion with video_embeddings
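
As a rough illustration of the first bullet, here is a minimal sketch of merging per-modality embeddings into one flat tuple. The helper `merge_embeddings` and the string placeholders are hypothetical stand-ins for illustration, not the actual vLLM code; the point is only that `tuple()` normalizes each input before concatenation.

```python
def merge_embeddings(*per_modality):
    """Concatenate per-modality embedding collections into one flat tuple.

    Hypothetical helper for illustration; not the actual vLLM code.
    """
    merged = ()
    for embeddings in per_modality:
        # tuple() accepts any iterable, so lists, tuples, and objects that
        # iterate along their first dimension are handled uniformly.
        merged += tuple(embeddings)
    return merged


# Placeholder strings stand in for per-item embedding tensors.
image_embeddings = ["img_0", "img_1"]  # a list
video_embeddings = ("vid_0",)          # already a tuple

print(merge_embeddings(image_embeddings, video_embeddings))
# ('img_0', 'img_1', 'vid_0')

# Without the conversion, Python rejects tuple += list:
try:
    bad = ()
    bad += image_embeddings
except TypeError as exc:
    print("TypeError:", exc)
```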

FIX #26749

Test Plan

Test Result



Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request standardizes the merging of multimodal embeddings across various models. The changes involve renaming vision_embeddings to image_embeddings for better clarity and ensuring that embeddings are converted to a tuple before being concatenated. This improves code robustness by preventing potential TypeError exceptions when the underlying processing functions return lists instead of tuples. The changes are correct and improve the overall consistency and reliability of the codebase. No issues were found in this pull request.

@Isotr0py Isotr0py enabled auto-merge (squash) October 14, 2025 07:06
@DarkLight1337 DarkLight1337 added the multi-modality label Oct 14, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 14, 2025 07:52
@DarkLight1337 DarkLight1337 merged commit d2f816d into vllm-project:main Oct 14, 2025
60 of 61 checks passed
@DarkLight1337 DarkLight1337 deleted the std-embeds branch October 14, 2025 09:36
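@BlueBlueFF

Is there no validation of the type and dimensions of image_embeddings here? Has converting directly to a tuple been tested for compatibility across the various cases?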

@DarkLight1337 (Member, Author)

We check ndim inside MultiModalDataParser already. The logic should work for both list of ndim=2 tensors and a single ndim=3 tensor.
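
A small sketch of the two input shapes mentioned above, assuming PyTorch tensors; the sizes are arbitrary and chosen only for illustration. In both cases `tuple()` yields one ndim=2 tensor per item, because iterating a tensor walks its first dimension.

```python
import torch

hidden_size = 8  # arbitrary illustrative size

# Case 1: a list of ndim=2 tensors, one per item, with differing lengths.
as_list = [torch.zeros(3, hidden_size), torch.zeros(5, hidden_size)]

# Case 2: a single ndim=3 tensor of shape (num_items, seq_len, hidden_size).
as_batch = torch.zeros(2, 4, hidden_size)

from_list = tuple(as_list)
from_batch = tuple(as_batch)  # iterating a tensor walks its first dimension

for name, items in (("list", from_list), ("batch", from_batch)):
    print(name, [tuple(t.shape) for t in items])
# list [(3, 8), (5, 8)]
# batch [(4, 8), (4, 8)]
```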

Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: bbartels <benjamin@bartels.dev>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Zhathw pushed a commit to Zhathw/vllm that referenced this pull request Nov 12, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Labels

  • multi-modality: Related to multi-modality (#4194)
  • qwen: Related to Qwen models
  • ready: ONLY add when PR is ready to merge/full CI is needed
