We have observed a rare scenario with AsyncLLM where a client disconnect
triggers an abort request after the request has finished, but before
AsyncLLM has processed the request output.
See vllm-project#26012, vllm-project#25067, vllm-project#25844, and llm-d/llm-d#187.
Without the fix, the unit test fails with:
```
logger.warning(
"Releasing expired KV blocks for request %s which were "
"retrieved by %d decode worker(s) within %d seconds.",
req_id,
count,
envs.VLLM_NIXL_ABORT_REQUEST_TIMEOUT,
)
> self._reqs_to_process.remove(req_id)
E KeyError: '0'
vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py:1238: KeyError
```
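A `KeyError` raised by `.remove()` is consistent with `_reqs_to_process` being a Python `set`, whose `remove()` raises when the element is absent. The sketch below illustrates only that failure mode, using a hypothetical stand-in set and hypothetical helper names (`release_strict`, `release_tolerant`). It is not the connector code, and whether the actual fix takes the tolerant path shown here is an assumption.

```python
# Hypothetical stand-in for the connector's bookkeeping; not the vLLM code.
reqs_to_process: set[str] = set()


def release_strict(req_id: str) -> None:
    # set.remove() raises KeyError if the element is absent.
    reqs_to_process.remove(req_id)


def release_tolerant(req_id: str) -> None:
    # set.discard() is a no-op if the element is absent.
    reqs_to_process.discard(req_id)


reqs_to_process.add("0")
release_strict("0")      # request finishes: entry removed normally
release_tolerant("0")    # late abort, tolerant path: no error

try:
    release_strict("0")  # late abort, strict path: raises KeyError
except KeyError as exc:
    late_abort_error = exc
```

The strict call on an already-removed ID reproduces the same exception type and key as the traceback above.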
Signed-off-by: Mark McLoughlin <markmc@redhat.com>