@faaany faaany commented Oct 20, 2025

Purpose

This PR fixes a bug in vllm/multimodal/profiling.py that caused the following error when processing the dummy video frames created for profiling:

ERROR 10-18 04:53:44 [multiproc_executor.py:585] Traceback (most recent call last):

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/usr/local/lib/python3.12/dist-packages/PIL/Image.py", line 3285, in fromarray

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     typemode, rawmode, color_modes = _fromarray_typemap[typekey]

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                                      ~~~~~~~~~~~~~~~~~~^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585] KeyError: ((1, 1, 3), '<i8')

ERROR 10-18 04:53:44 [multiproc_executor.py:585] 

ERROR 10-18 04:53:44 [multiproc_executor.py:585] The above exception was the direct cause of the following exception:

ERROR 10-18 04:53:44 [multiproc_executor.py:585] 

ERROR 10-18 04:53:44 [multiproc_executor.py:585] Traceback (most recent call last):

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/inputs/registry.py", line 173, in call_hf_processor

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     output = hf_processor(**data,

ERROR 10-18 04:53:44 [multiproc_executor.py:585]              ^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/model_executor/models/internvl.py", line 639, in __call__

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     text, video_inputs = self._preprocess_video(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                          ^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/model_executor/models/internvl.py", line 598, in _preprocess_video

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     pixel_values_lst_video = self._videos_to_pixel_values_lst(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/model_executor/models/internvl.py", line 580, in _videos_to_pixel_values_lst

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     video_to_pixel_values_internvl(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/model_executor/models/internvl.py", line 302, in video_to_pixel_values_internvl

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     Image.fromarray(frame, mode="RGB"),

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/usr/local/lib/python3.12/dist-packages/PIL/Image.py", line 3289, in fromarray

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     raise TypeError(msg) from e

ERROR 10-18 04:53:44 [multiproc_executor.py:585] TypeError: Cannot handle this data type: (1, 1, 3), <i8

ERROR 10-18 04:53:44 [multiproc_executor.py:585] 

ERROR 10-18 04:53:44 [multiproc_executor.py:585] The above exception was the direct cause of the following exception:

ERROR 10-18 04:53:44 [multiproc_executor.py:585] 

ERROR 10-18 04:53:44 [multiproc_executor.py:585] Traceback (most recent call last):

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 559, in worker_main

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     worker = WorkerProc(*args, **kwargs)

ERROR 10-18 04:53:44 [multiproc_executor.py:585]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 420, in __init__

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     self.worker.init_device()

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/worker/worker_base.py", line 611, in init_device

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     self.worker.init_device()  # type: ignore

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     ^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/v1/worker/xpu_worker.py", line 161, in init_device

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     self.model_runner = XPUModelRunner(  # type: ignore

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/v1/worker/xpu_model_runner.py", line 27, in __init__

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     super().__init__(vllm_config, device)

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/v1/worker/gpu_model_runner.py", line 396, in __init__

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     self.mm_budget = MultiModalBudget(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                      ^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/v1/worker/utils.py", line 48, in __init__

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     .get_max_tokens_per_item_by_nonzero_modality(model_config,

ERROR 10-18 04:53:44 [multiproc_executor.py:585]      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/registry.py", line 168, in get_max_tokens_per_item_by_nonzero_modality

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     max_tokens_per_item = self.get_max_tokens_per_item_by_modality(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/registry.py", line 144, in get_max_tokens_per_item_by_modality

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     return profiler.get_mm_max_contiguous_tokens(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/profiling.py", line 311, in get_mm_max_contiguous_tokens

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     return self._get_mm_max_tokens(seq_len,

ERROR 10-18 04:53:44 [multiproc_executor.py:585]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/profiling.py", line 291, in _get_mm_max_tokens

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     mm_inputs = self._get_dummy_mm_inputs(seq_len, mm_counts)

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/profiling.py", line 173, in _get_dummy_mm_inputs

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     return self.processor.apply(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]            ^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/processing.py", line 1808, in apply

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     ) = self._cached_apply_hf_processor(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/processing.py", line 1598, in _cached_apply_hf_processor

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     ) = self._apply_hf_processor_main(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/processing.py", line 1352, in _apply_hf_processor_main

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     mm_processed_data = self._apply_hf_processor_mm_only(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/processing.py", line 1309, in _apply_hf_processor_mm_only

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     _, mm_processed_data, _ = self._apply_hf_processor_text_mm(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/processing.py", line 1236, in _apply_hf_processor_text_mm

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     processed_data = self._call_hf_processor(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                      ^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/model_executor/models/internvl.py", line 953, in _call_hf_processor

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     processed_outputs = super()._call_hf_processor(prompt, mm_data,

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/model_executor/models/internvl.py", line 778, in _call_hf_processor

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     processed_outputs = super()._call_hf_processor(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/multimodal/processing.py", line 1197, in _call_hf_processor

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     return self.info.ctx.call_hf_processor(

ERROR 10-18 04:53:44 [multiproc_executor.py:585]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERROR 10-18 04:53:44 [multiproc_executor.py:585]   File "/workspace/vllm/vllm/inputs/registry.py", line 193, in call_hf_processor

ERROR 10-18 04:53:44 [multiproc_executor.py:585]     raise ValueError(msg) from exc

ERROR 10-18 04:53:44 [multiproc_executor.py:585] ValueError: Failed to apply InternVLProcessor on data={'text': '<image><video>', 'images': [<PIL.Image.Image image mode=RGB size=5376x448 at 0x7A21E1E09D90>], 'videos': [array([[[[255, 255, 255],

The error occurs because np.full was called without an explicit dtype, so the array took NumPy's default integer type, int64 (<i8) in this environment, which PIL.Image.fromarray does not support. Image.fromarray requires an array with a supported data type, such as uint8 for RGB images.
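To make the dtype behavior concrete, here is a minimal NumPy-only sketch. The shape is an arbitrary illustration (not the shape vLLM uses for its dummy video), and the exact default integer width depends on the platform and NumPy version, so the check only relies on it being a signed integer type rather than uint8.

```python
import numpy as np

shape = (1, 25, 25, 3)  # illustrative (frames, height, width, channels)

# Without dtype, np.full derives the dtype from the fill value.
# A Python int gives NumPy's default signed integer type.
default_frames = np.full(shape, 255)

# With an explicit dtype, the array is 8-bit unsigned, as Pillow expects for RGB.
uint8_frames = np.full(shape, 255, dtype=np.uint8)

print("default:", default_frames.dtype.str)
print("explicit:", uint8_frames.dtype.str)
```

On the machine in the traceback the first line would report `<i8`, which matches the KeyError key `((1, 1, 3), '<i8')`.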

Related Issue: https://stackoverflow.com/questions/79792709/how-can-i-serve-opengvlab-internvl3-1b-with-vllm-getting-valueerror-failed-to and

Test Plan

VLLM_USE_V1=1 VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn python3 -m vllm.entrypoints.openai.api_server --model OpenGVLab/InternVL3_5-14B --enforce-eager --port 8000 --host 0.0.0.0 --trust-remote-code --gpu-memory-util=0.9 --quantization fp8 --no-enable-prefix-caching --max-num-batched-tokens=8192 --disable-log-requests --max-model-len=30000 --block-size 64 --dtype=float16 -tp=4

Below is a simple unit test (UT) that reproduces the same issue:

import numpy as np
from PIL import Image

num_frames = 1
width = 25
height = 25

video = np.full((num_frames, width, height, 3), 255)  # reproduces the error: dtype defaults to int64; pass dtype=np.uint8 to fix

for frame in video:
    img = Image.fromarray(frame, mode="RGB")
    print(img)

With dtype=np.uint8 passed to np.full, the UT passes.
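For reference, a passing variant of the UT might look like the sketch below. It is an illustration, not the code changed in this PR. Two details differ from the snippet above and are assumptions of this sketch: the frame dimensions are made unequal so the (height, width) ordering is visible, and the mode argument is omitted because Pillow infers RGB from a uint8 array of shape (H, W, 3).

```python
import numpy as np
from PIL import Image

num_frames, height, width = 2, 20, 30

# Explicit uint8 so each frame is a valid 8-bit RGB buffer for Pillow.
video = np.full((num_frames, height, width, 3), 255, dtype=np.uint8)

# Pillow infers mode "RGB" from a uint8 array of shape (H, W, 3).
images = [Image.fromarray(frame) for frame in video]

for img in images:
    # Image.size is reported as (width, height).
    print(img.mode, img.size)
```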

Test Result

The vLLM server starts successfully.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the multi-modality Related to multi-modality (#4194) label Oct 20, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly resolves a TypeError encountered during multimodal profiling when creating dummy video frames. The root cause was np.full defaulting to an int64 dtype, which is unsupported by PIL.Image.fromarray. By explicitly setting dtype=np.uint8, the change ensures compatibility with Pillow and also leads to more accurate memory profiling for video data. The fix is precise and effective. I've examined the related code and found no other instances of this issue. The change is good to merge.
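The memory-profiling remark above can be illustrated with a small sketch. It pins the wide dtype to np.int64 explicitly (an assumption matching the `<i8` in the traceback, since the default integer width is platform-dependent) and reuses the 25x25 frame size from the UT rather than vLLM's actual dummy video dimensions.

```python
import numpy as np

shape = (1, 25, 25, 3)  # illustrative (frames, height, width, channels)

frames_int64 = np.full(shape, 255, dtype=np.int64)
frames_uint8 = np.full(shape, 255, dtype=np.uint8)

# Each int64 element takes 8 bytes; each uint8 element takes 1 byte.
ratio = frames_int64.nbytes // frames_uint8.nbytes
print(frames_int64.nbytes, frames_uint8.nbytes, ratio)
```

Under these assumptions the int64 buffer is eight times larger than the uint8 one for the same frames.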

faaany commented Oct 20, 2025

cc @jikunshang @yma11

@DarkLight1337 DarkLight1337 left a comment

Good catch, thanks

@DarkLight1337

Can you fix DCO?

@faaany faaany force-pushed the fix_video_dtype branch 3 times, most recently from deac577 to b165fa3 Compare October 21, 2025 01:07
@mergify mergify bot added ci/build rocm Related to AMD ROCm v1 labels Oct 21, 2025
@faaany faaany closed this Oct 21, 2025
faaany commented Oct 21, 2025

The changes are included in PR #27107, so I am closing this PR.
