Open
vllm-project/vllm
#27480
Issue: ValueError in interns1_vit.py when serving intern-s1 model
Environment
- OS: Ubuntu 22.04
- CUDA: 12.8.1
- Python: 3.11
- PyTorch: 2.8.0
- vLLM: 0.11.0
- Transformers: 4.57.1
Command
vllm serve intern-s1 \
--port 8008 \
--host 0.0.0.0 \
--served-model-name intern-s1 \
--trust-remote-code \
--gpu-memory-utilization 0.9 \
--tensor-parallel-size 8
Error
File "/usr/local/lib/python3.11/site-packages/vllm/model_executor/models/interns1_vit.py", line 221, in forward
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] B_, N_, H_, D_ = q.shape
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] ^^^^^^^^^^^^^^
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] ValueError: not enough values to unpack (expected 4, got 3)
Description
Serving the intern-s1 model with vLLM fails with a ValueError in interns1_vit.py. At line 221, the forward method unpacks q.shape into four values (B_, N_, H_, D_), but q.shape contains only three.
This suggests that q reaches this line as a 3-D tensor where a 4-D tensor is expected, likely a shape mismatch in the model's vision transformer implementation. The error occurs during the forward pass of model execution.
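For illustration, the same failure can be reproduced in plain Python without vLLM. This is only a minimal sketch: the helper `unpack_q_shape` and the concrete shape values are hypothetical stand-ins, since the report does not give the actual dimensions of q.

```python
# Minimal sketch of the failure mode. The helper name and the shape values
# are hypothetical; only the unpacking statement mirrors the traceback.

def unpack_q_shape(shape):
    # Same pattern as the failing line: expects exactly four dimensions.
    B_, N_, H_, D_ = shape
    return B_, N_, H_, D_

# A 4-D shape (batch, tokens, heads, head_dim) unpacks without error.
four_d_shape = (1, 1025, 16, 64)
unpacked = unpack_q_shape(four_d_shape)

# A 3-D shape (batch, tokens, hidden) raises the error from the report.
three_d_shape = (1, 1025, 1024)
try:
    unpack_q_shape(three_d_shape)
    message = None
except ValueError as exc:
    message = str(exc)

print(unpacked)
print(message)
```

If the cause is indeed a 3-D q, logging q.shape just before line 221 should confirm which dimension is missing.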
Expected Behavior
The model should load and serve successfully without shape unpacking errors.