[Bug]: V0 Scheduler is incompatible with the newest KVCacheManager interface in vllm main branch code #861

@gawainx

Your current environment

The output of `python collect_env.py`
vllm-ascend: main branch
vllm: main branch

🐛 Describe the bug

First, apologies for omitting some environment details; we found this bug on a private server.
In short, vllm-ascend uses an additional config to enable the V0-style scheduler for better performance. However, after installing the newest vllm code (d066e52013be278c7a3bc54ec9799d8457895f4d) together with vllm-ascend 218f21d..68fb634, requests fail with errors such as:

Runtime Error: object of type KVCacheBlocks has no len()  

What happened?

The root cause is that the vllm project recently changed the KVCacheManager interface (details can be found at this PR), while the V0-style scheduler in vllm-ascend still calls it the old way:

  • A new `KVCacheBlocks` type was introduced
  • `get_computed_blocks` now returns `tuple[KVCacheBlocks, int]` instead of `Tuple[List[BlockHashType], int]`
  • `allocate_slots` takes one extra argument, `num_new_computed_tokens`, and returns `Optional[KVCacheBlocks]`
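
To illustrate why the scheduler breaks, here is a minimal sketch that does not import vllm. `FakeKVCacheBlocks` is a hypothetical stand-in for `KVCacheBlocks`, and its `blocks` field is an assumption about how the wrapper exposes the underlying list. `unwrap_blocks` is likewise a hypothetical helper, not part of either project. The idea is that scheduler code which calls `len()` on the first element of the `get_computed_blocks` result would need to unwrap it first.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple, Union


@dataclass
class FakeKVCacheBlocks:
    """Hypothetical stand-in for vllm's KVCacheBlocks.

    The `blocks` field is an assumption made for this sketch.
    The class deliberately defines no __len__, like the reported type.
    """
    blocks: List[Any] = field(default_factory=list)


def unwrap_blocks(computed: Union[FakeKVCacheBlocks, List[Any]]) -> List[Any]:
    """Return a plain list for both the old (list) and new (wrapper) return styles."""
    if isinstance(computed, list):
        return computed
    return list(computed.blocks)


def demo() -> Tuple[Optional[str], int]:
    """Reproduce the failure on the wrapper, then count blocks via the helper."""
    wrapped = FakeKVCacheBlocks(blocks=["b0", "b1", "b2"])
    failure: Optional[str] = None
    try:
        len(wrapped)  # what old-style scheduler code effectively does
    except TypeError as exc:
        failure = str(exc)
    return failure, len(unwrap_blocks(wrapped))


if __name__ == "__main__":
    print(demo())
```

A real fix in vllm-ascend would also have to pass `num_new_computed_tokens` to `allocate_slots` and handle its `Optional[KVCacheBlocks]` return value; the sketch above covers only the `len()` failure.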
