[Model] Remove unnecessary CUDA sync of GLM-4.1V image and video preprocess #24332
Conversation
Pull Request Overview
This PR removes unnecessary CUDA synchronization in the GLM-4.1V model by optimizing the image and video preprocessing methods. The changes prevent the implicit CUDA syncs that occur when calling .tolist() on GPU tensors during size calculations.
- Converts the grid_thw tensor to a list once at the beginning of each method
- Replaces tensor operations with CPU-based calculations for computing split sizes
- Uses the pre-converted list for both visual processing and size calculations
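For readers unfamiliar with the pattern, here is a minimal sketch of what the bullets above describe. The function signature and the visual(...) call are simplified assumptions for illustration, not the exact vLLM code:

```python
import torch

def _process_image_input_sketch(visual, pixel_values: torch.Tensor,
                                grid_thw: torch.Tensor, merge_size: int):
    # Convert the grid tensor to a plain Python list once, before the heavy
    # visual kernel is launched.
    grid_thw_list = grid_thw.tolist()

    # Kick off the GPU-bound visual processing; CUDA launches are asynchronous,
    # so control returns to Python almost immediately.
    image_embeds = visual(pixel_values, grid_thw=grid_thw_list)

    # Split sizes are computed with native Python arithmetic on the CPU while
    # the GPU is still busy, so no .tolist() on a GPU tensor forces a sync here.
    sizes = [t * h * w // (merge_size * merge_size) for t, h, w in grid_thw_list]
    return image_embeds.split(sizes)
```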
Copilot AI · Sep 5, 2025
Creating a new tensor from the list and then converting back to list is inefficient. Consider using numpy operations or native Python calculations since grid_thw_list is already a Python list.
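A rough before/after illustration of this suggestion; the grid values are made up and the commented-out line paraphrases the flagged pattern rather than quoting the PR diff:

```python
import math

grid_thw_list = [[1, 28, 28], [2, 14, 14]]  # illustrative [t, h, w] entries
merge_size = 2

# Flagged pattern: round-trips the Python list through a tensor just to do
# integer arithmetic.
#   sizes = (torch.tensor(grid_thw_list).prod(-1) // (merge_size * merge_size)).tolist()

# Suggested alternative: stay in native Python (or numpy), since grid_thw_list
# is already a plain list.
sizes = [math.prod(thw) // (merge_size * merge_size) for thw in grid_thw_list]
print(sizes)  # [196, 98]
```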
Signed-off-by: Win <chatcharinsang@gmail.com>
Force-pushed from 3dbbbeb to 457da99
Code Review
This pull request introduces a performance optimization in vllm/model_executor/models/glm4_1v.py by removing an unnecessary CUDA synchronization. The change involves pre-computing a list from the grid_thw tensor at the beginning of the _process_image_input and _process_video_input functions. This allows the subsequent calculation of sizes to be performed on the CPU, overlapping with the main GPU-bound visual processing task and thus avoiding a synchronization point after the main kernel launch. The implementation is correct and should lead to improved performance. Additionally, the change improves robustness by using torch.long for calculations, preventing potential overflows, and enhances readability by grouping (merge_size * merge_size).
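A hedged sketch of the size computation this review describes, assuming the names from the PR description (the grid values are illustrative and the real code may differ in detail). The tensor here is created on the CPU from a Python list, so it involves no CUDA synchronization:

```python
import torch

grid_thw_list = [[16, 146, 110]]  # illustrative video grid; real values come from the processor
merge_size = 2

# Keeping the intermediate product in torch.long (int64) avoids overflow for
# large t * h * w products, and grouping (merge_size * merge_size) makes the
# per-token merge factor explicit.
sizes = (torch.tensor(grid_thw_list, dtype=torch.long).prod(-1)
         // (merge_size * merge_size)).tolist()
print(sizes)  # [64240]
```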
This PR removes unnecessary CUDA sync in _process_image_input and _process_video_input of vllm/model_executor/models/glm4_1v.py by utilising grid_thw_list.
Related PR #22792
Related issue #23884