On the compact branch, `PreemptionMode` is undefined on line 270 of `lmcache_vllm/scheduler_adapter.py`. This raises a `NameError` whenever a sequence group is preempted, e.g. because there is not enough GPU memory for the KV cache. To reproduce, run any offline batch inference workload with a large batch size.
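For illustration, here is a minimal, self-contained sketch of why the error only shows up under preemption: Python resolves a name when the line executes, so a missing import on a rarely taken branch goes unnoticed until that branch runs. Everything in it is hypothetical. The `PreemptionMode` enum is a stand-in, and the two functions are not the adapter's code. In the real adapter the fix is presumably an import of `PreemptionMode` from wherever vLLM defines it (I assume `vllm.core.scheduler`, but I have not checked that against the compact branch).

```python
import enum


class PreemptionMode(enum.Enum):
    """Hypothetical stand-in for the enum the adapter expects to be in scope."""
    SWAP = enum.auto()
    RECOMPUTE = enum.auto()


def schedule_missing_name(needs_preemption: bool) -> str:
    # Mimics the reported bug: the undefined name is only looked up
    # when the preemption branch actually runs.
    if needs_preemption:
        return MissingPreemptionMode.RECOMPUTE.name  # noqa: F821 (deliberately undefined)
    return "scheduled"


def schedule_with_name(needs_preemption: bool) -> str:
    # Same logic, but the name is defined (imported, in the real module).
    if needs_preemption:
        return PreemptionMode.RECOMPUTE.name
    return "scheduled"
```

With a small batch, `schedule_missing_name(False)` returns normally, which matches the observation that the problem only appears once the batch is large enough to force preemption.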
Alex-q-z changed the title from "[Bug in compaction] Undefined variable on line 270" to "[Bug in compaction] Undefined variable in scheduler_adapter" on Dec 6, 2024