[Bug]: nightly version: EngineCore encountered a fatal error. #17276

@Zhiyuan-Fan

Your current environment

The output of `python collect_env.py`
Your output of `python collect_env.py` here

🐛 Describe the bug

vllm serve Qwen/Qwen2.5-VL-3B-Instruct

INFO 04-28 02:37:03 [async_llm.py:252] Added request 13_chatcmpl-c78d39a6d1d8469f90f3bda9bd41ca6a.                                                                                                                                           
INFO 04-28 02:37:03 [async_llm.py:252] Added request 14_chatcmpl-c78d39a6d1d8469f90f3bda9bd41ca6a.                                                                                                                                           
INFO 04-28 02:37:03 [async_llm.py:252] Added request 15_chatcmpl-c78d39a6d1d8469f90f3bda9bd41ca6a.                                                                                                                                           
ERROR 04-28 02:37:03 [core.py:398] EngineCore encountered a fatal error.                                                                                                                                                                     
ERROR 04-28 02:37:03 [core.py:398] Traceback (most recent call last):                                                                                                                                                                        
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 389, in run_engine_core                                                                              
ERROR 04-28 02:37:03 [core.py:398]     engine_core.run_busy_loop()                                                                                                                                                                           
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 413, in run_busy_loop                                                                                
ERROR 04-28 02:37:03 [core.py:398]     self._process_engine_step()                                                                                                                                                                           
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 438, in _process_engine_step                                                                         
ERROR 04-28 02:37:03 [core.py:398]     outputs = self.step_fn()                                                                                                                                                                              
ERROR 04-28 02:37:03 [core.py:398]               ^^^^^^^^^^^^^^                                                                                                                                                                              
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 203, in step                                                                                         
ERROR 04-28 02:37:03 [core.py:398]     output = self.model_executor.execute_model(scheduler_output)                                                                                                                                          
ERROR 04-28 02:37:03 [core.py:398]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                          
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 86, in execute_model                                                                           
ERROR 04-28 02:37:03 [core.py:398]     output = self.collective_rpc("execute_model",                                                                                                                                                         
ERROR 04-28 02:37:03 [core.py:398]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                         
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc                                                                     
ERROR 04-28 02:37:03 [core.py:398]     answer = run_method(self.driver_worker, method, args, kwargs)                                                                                                                                         
ERROR 04-28 02:37:03 [core.py:398]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                         
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/utils.py", line 2456, in run_method                                                                                           
ERROR 04-28 02:37:03 [core.py:398]     return func(*args, **kwargs)                                                                                                                                                                          
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                          
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context                                                                         
ERROR 04-28 02:37:03 [core.py:398]     return func(*args, **kwargs)                                                                                                                                                                          
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                          
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 268, in execute_model                                                                          
ERROR 04-28 02:37:03 [core.py:398]     output = self.model_runner.execute_model(scheduler_output)                                                                                                                                            
ERROR 04-28 02:37:03 [core.py:398]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^               
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 04-28 02:37:03 [core.py:398]     return func(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1092, in execute_model
ERROR 04-28 02:37:03 [core.py:398]     output = self.model( 
ERROR 04-28 02:37:03 [core.py:398]              ^^^^^^^^^^^ 
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 04-28 02:37:03 [core.py:398]     return self._call_impl(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 04-28 02:37:03 [core.py:398]     return forward_call(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 1106, in forward
ERROR 04-28 02:37:03 [core.py:398]     hidden_states = self.language_model.model(
ERROR 04-28 02:37:03 [core.py:398]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 245, in __call__
ERROR 04-28 02:37:03 [core.py:398]     model_output = self.forward(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/model_executor/models/qwen2.py", line 325, in forward
ERROR 04-28 02:37:03 [core.py:398]     def forward(
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 04-28 02:37:03 [core.py:398]     return self._call_impl(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 04-28 02:37:03 [core.py:398]     return forward_call(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
ERROR 04-28 02:37:03 [core.py:398]     return fn(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/fx/graph_module.py", line 822, in call_wrapped
ERROR 04-28 02:37:03 [core.py:398]     return self._wrapped_call(self, *args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/fx/graph_module.py", line 400, in __call__
ERROR 04-28 02:37:03 [core.py:398]     raise e
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/fx/graph_module.py", line 387, in __call__
ERROR 04-28 02:37:03 [core.py:398]     return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 04-28 02:37:03 [core.py:398]     return self._call_impl(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 04-28 02:37:03 [core.py:398]     return forward_call(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "<eval_with_key>.74", line 270, in forward
ERROR 04-28 02:37:03 [core.py:398]     submod_1 = self.submod_1(getitem, s0, getitem_1, getitem_2, getitem_3);  getitem = getitem_1 = getitem_2 = submod_1 = None
ERROR 04-28 02:37:03 [core.py:398]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/fx/graph_module.py", line 822, in call_wrapped
ERROR 04-28 02:37:03 [core.py:398]     return self._wrapped_call(self, *args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/fx/graph_module.py", line 400, in __call__
ERROR 04-28 02:37:03 [core.py:398]     raise e
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/fx/graph_module.py", line 387, in __call__
ERROR 04-28 02:37:03 [core.py:398]     return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 04-28 02:37:03 [core.py:398]     return self._call_impl(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 04-28 02:37:03 [core.py:398]     return forward_call(*args, **kwargs)
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "<eval_with_key>.2", line 5, in forward
ERROR 04-28 02:37:03 [core.py:398]     unified_attention_with_output = torch.ops.vllm.unified_attention_with_output(query_2, key_2, value, output_3, 'language_model.model.layers.0.self_attn.attn');  query_2 = key_2 = value = output_3 = unified_attention_with_output = None
ERROR 04-28 02:37:03 [core.py:398]                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/_ops.py", line 1123, in __call__
ERROR 04-28 02:37:03 [core.py:398]     return self._op(*args, **(kwargs or {}))
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/attention/layer.py", line 416, in unified_attention_with_output
ERROR 04-28 02:37:03 [core.py:398]     self.impl.forward(self,
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/attention/backends/flash_attn.py", line 598, in forward
ERROR 04-28 02:37:03 [core.py:398]     cascade_attention(
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/attention/backends/flash_attn.py", line 730, in cascade_attention
ERROR 04-28 02:37:03 [core.py:398]     prefix_output, prefix_lse = flash_attn_varlen_func(
ERROR 04-28 02:37:03 [core.py:398]                                 ^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/vllm_flash_attn/flash_attn_interface.py", line 252, in flash_attn_varlen_func
ERROR 04-28 02:37:03 [core.py:398]     out, softmax_lse, _, _ = torch.ops._vllm_fa3_C.fwd(
ERROR 04-28 02:37:03 [core.py:398]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/_ops.py", line 1123, in __call__
ERROR 04-28 02:37:03 [core.py:398]     return self._op(*args, **(kwargs or {}))
ERROR 04-28 02:37:03 [core.py:398]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [core.py:398] RuntimeError: scheduler_metadata must have shape (metadata_size)
Process EngineCore_0:
ERROR 04-28 02:37:03 [async_llm.py:399] AsyncLLM output_handler failed.
ERROR 04-28 02:37:03 [async_llm.py:399] Traceback (most recent call last):
ERROR 04-28 02:37:03 [async_llm.py:399]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 357, in output_handler
ERROR 04-28 02:37:03 [async_llm.py:399]     outputs = await engine_core.get_output_async()
ERROR 04-28 02:37:03 [async_llm.py:399]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-28 02:37:03 [async_llm.py:399]   File "/home/zhiyuan/anaconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 716, in get_output_async
ERROR 04-28 02:37:03 [async_llm.py:399]     raise self._format_exception(outputs) from None
ERROR 04-28 02:37:03 [async_llm.py:399] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
INFO 04-28 02:37:03 [async_llm.py:324] Request chatcmpl-931e30a747354f9eb969d74e0917a5b8 failed (engine dead).
INFO 04-28 02:37:03 [async_llm.py:324] Request chatcmpl-bb885cde9cc64d099805f73700577ffc failed (engine dead).
INFO 04-28 02:37:03 [async_llm.py:324] Request chatcmpl-e3401e02484243b48e9e73750055b16a failed (engine dead).
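
For triage: the root cause in the log above is the last `core.py` line of the EngineCore traceback (the `RuntimeError`), not the later `EngineDeadError` reported by `async_llm.py`. Below is a minimal sketch for pulling the final exception and the deepest vLLM frame out of a log like this one. It assumes the log prefix format shown above (`LEVEL MM-DD HH:MM:SS [file:line] message`). `summarize_traceback` is a hypothetical helper written for illustration and is not part of vLLM.

```python
import re

# Log prefix as seen above, e.g. "ERROR 04-28 02:37:03 [core.py:398] ".
PREFIX = re.compile(
    r"^(?:ERROR|INFO|WARNING|DEBUG) \d{2}-\d{2} \d{2}:\d{2}:\d{2} \[([\w.]+):\d+\] ?"
)
# A traceback frame line: File "<path>", line <n>, in <function>
FRAME = re.compile(r'^\s*File "([^"]+)", line (\d+), in (\S+)')
# An unindented "SomeError: message" line that ends a traceback.
EXCEPTION = re.compile(r"^([A-Za-z_][\w.]*(?:Error|Exception)): (.*)$")


def summarize_traceback(log_text, source="core.py"):
    """Return (exception, last_vllm_frame) for lines logged from `source`.

    Hypothetical helper: `source` is the file name inside the square
    brackets of the log prefix, used to skip lines from other loggers.
    """
    exception = None
    last_frame = None
    for raw in log_text.splitlines():
        prefix = PREFIX.match(raw)
        if not prefix or prefix.group(1) != source:
            continue
        body = raw[prefix.end():].rstrip()
        frame = FRAME.match(body)
        # Match on "site-packages/vllm/" rather than "/vllm/", because the
        # conda environment in this log is itself named "vllm".
        if frame and "site-packages/vllm/" in frame.group(1):
            last_frame = (frame.group(1), int(frame.group(2)), frame.group(3))
            continue
        exc = EXCEPTION.match(body)
        if exc:
            exception = f"{exc.group(1)}: {exc.group(2)}"
    return exception, last_frame
```

Calling `summarize_traceback(open("server.log").read())` on a saved copy of the log (the file name here is only an example) returns the exception text together with a `(path, line, function)` tuple for the last frame inside the installed `vllm` package.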

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
