[Bug]: [v1/core/block_pool.py] Assertion Failure: prev_block.block_hash is not None #21992

@QierLi

Description

Environment

vLLM V1
T16 single host
LLAMA4 Maverick TP8
The engine crashes every few hours because of this failure.

Code Pointer

In cache_full_blocks()

if num_cached_blocks == num_full_blocks:
    return
new_full_blocks = blocks[num_cached_blocks:num_full_blocks]
assert len(block_hashes) >= num_cached_blocks
new_block_hashes = block_hashes[num_cached_blocks:]
# Update the new blocks with the block hashes through the chain.
if num_cached_blocks == 0:
    prev_block_hash_value = None
else:
    prev_block = blocks[num_cached_blocks - 1]
    assert prev_block.block_hash is not None
    prev_block_hash_value = prev_block.block_hash.get_hash_value()
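
To make the failing invariant concrete, here is a minimal standalone sketch. It is not vLLM code: `ToyBlock` and `cache_full_blocks_toy` are hypothetical names, and hashes are plain integers. It models only the check from the snippet above: when the caller says some blocks are already cached, the last of those blocks must still carry a hash. The scenario that clears the hash is an assumption chosen for illustration, not a diagnosis of this crash.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyBlock:
    """Hypothetical stand-in for a KV cache block (not vLLM's class)."""
    block_id: int
    block_hash: Optional[int] = None


def cache_full_blocks_toy(
    blocks: List[ToyBlock],
    block_hashes: List[int],
    num_cached_blocks: int,
    num_full_blocks: int,
) -> None:
    """Assign hashes to blocks[num_cached_blocks:num_full_blocks]."""
    if num_cached_blocks == num_full_blocks:
        return
    assert len(block_hashes) >= num_cached_blocks
    if num_cached_blocks > 0:
        # Same invariant as in the issue: the last block counted as cached
        # must still have a hash to chain the new blocks from.
        prev_block = blocks[num_cached_blocks - 1]
        assert prev_block.block_hash is not None
    new_blocks = blocks[num_cached_blocks:num_full_blocks]
    new_hashes = block_hashes[num_cached_blocks:num_full_blocks]
    for block, block_hash in zip(new_blocks, new_hashes):
        block.block_hash = block_hash


def reproduce_assertion() -> bool:
    """Return True if the invariant check fires in the toy model."""
    blocks = [ToyBlock(i) for i in range(3)]
    hashes = [11, 22, 33]
    cache_full_blocks_toy(blocks, hashes, 0, 2)  # first two blocks cached
    # Assumed scenario: something resets the hash of an already-cached
    # block while the caller's cached-block count stays at 2.
    blocks[1].block_hash = None
    try:
        cache_full_blocks_toy(blocks, hashes, 2, 3)
    except AssertionError:
        return True
    return False
```

In this toy model the assertion fires whenever the caller's count of cached blocks and the per-block `block_hash` state disagree, which is the kind of inconsistency the real assertion guards against.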

Stack Trace

Line numbers may be inaccurate

ERROR EngineCore encountered a fatal error.
ERROR Traceback (most recent call last):
ERROR   File "engine/core.py", line 640, in run_engine_core
ERROR     engine_core.run_busy_loop()
ERROR   File "engine/core.py", line 667, in run_busy_loop
ERROR     self._process_engine_step()
ERROR   File "engine/core.py", line 692, in _process_engine_step
ERROR     outputs, model_executed = self.step_fn()
ERROR   File "engine/core.py", line 280, in step
ERROR     scheduler_output = self.scheduler.schedule()
ERROR   File "core/sched/scheduler.py", line 440, in schedule
ERROR     new_blocks = self.kv_cache_manager.allocate_slots(
ERROR   File "core/kv_cache_manager.py", line 302, in allocate_slots
ERROR     self.coordinator.cache_blocks(
ERROR   File "core/kv_cache_coordinator.py", line 113, in cache_blocks
ERROR     manager.cache_blocks(request, block_hashes, num_computed_tokens)
ERROR   File "core/single_type_kv_cache_manager.py", line 146, in cache_blocks
ERROR     self.block_pool.cache_full_blocks(
ERROR   File "core/block_pool.py", line 138, in cache_full_blocks
ERROR     assert prev_block.block_hash is not None
ERROR AssertionError
ERROR AsyncLLM output_handler failed.
ERROR Traceback (most recent call last):
ERROR   File "engine/async_llm.py", line 379, in output_handler
ERROR     outputs = await engine_core.get_output_async()
ERROR   File "engine/core_client.py", line 764, in get_output_async
ERROR     raise self._format_exception(outputs) from None
ERROR EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root 
