
vllm deploy error #29

@huazZeng


Issue: ValueError in interns1_vit.py when serving intern-s1 model

Environment

  • OS: Ubuntu 22.04
  • CUDA: 12.8.1
  • Python: 3.11
  • PyTorch: 2.8.0
  • vLLM: 0.11.0
  • Transformers: 4.57.1

Command

vllm serve intern-s1 \
      --port 8008 \
      --host 0.0.0.0 \
      --served-model-name intern-s1 \
      --trust-remote-code \
      --gpu-memory-utilization 0.9 \
      --tensor-parallel-size 8
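
For reference, once the server comes up, the deployment can be smoke-tested against vLLM's OpenAI-compatible model-listing route (a minimal sketch using requests; the port matches the command above):

import requests

# Query the OpenAI-compatible /v1/models route exposed by `vllm serve`
# to confirm the server is up and the served model name is registered.
resp = requests.get("http://localhost:8008/v1/models")
resp.raise_for_status()
print(resp.json())  # should list "intern-s1" under "data"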

Error

File "/usr/local/lib/python3.11/site-packages/vllm/model_executor/models/interns1_vit.py", line 221, in forward
 (Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671]     B_, N_, H_, D_ = q.shape
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671]     ^^^^^^^^^^^^^^
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] ValueError: not enough values to unpack (expected 4, got 3)

Description

When serving the intern-s1 model with vLLM, startup fails with a ValueError raised from interns1_vit.py. The failure is at line 221, where the code unpacks four values from q.shape but the tensor has only three dimensions.

This points to a shape mismatch in the model's vision transformer implementation: the attention code expects a 4-D query tensor, but a 3-D tensor reaches it during the forward pass.
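
The failure is easy to reproduce in isolation (a minimal sketch, independent of vLLM; the 3-D shape below is illustrative only, since the exact shape reaching line 221 is not shown in the log):

import torch

# interns1_vit.py expects a 4-D query tensor: (batch, seq_len, heads, head_dim).
# Here q is 3-D, so the unpack fails exactly as in the traceback above.
q = torch.randn(2, 1024, 3200)  # hypothetical (batch, seq_len, hidden) tensor

try:
    B_, N_, H_, D_ = q.shape  # same unpack as interns1_vit.py line 221
except ValueError as e:
    print(e)  # not enough values to unpack (expected 4, got 3)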

Expected Behavior

The model should load and serve successfully without shape unpacking errors.
