
Commit aa5a77c

LucasWilkinson, gemini-code-assist[bot]
authored and committed
[BugFix] Disable fp8 kv-cache by default for DeepSeek V3.2 (vllm-project#27121)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
1 parent de01143 commit aa5a77c


1 file changed, +2 -5 lines changed


vllm/model_executor/models/config.py

Lines changed: 2 additions & 5 deletions
@@ -481,12 +481,9 @@ def verify_and_update_config(cls, vllm_config: "VllmConfig") -> None:
         is_v32 = hasattr(hf_config, "index_topk")
         assert is_v32
 
-        # For DeepSeekV3.2, we use a custom fp8 format as default (i.e.
-        #   "auto")
+        # For DeepSeekV3.2, a custom fp8 format is used when fp8 kv-cache is enabled.
         cache_config = vllm_config.cache_config
-        if cache_config.cache_dtype == "auto" or cache_config.cache_dtype.startswith(
-            "fp8"
-        ):
+        if cache_config.cache_dtype.startswith("fp8"):
             cache_config.cache_dtype = "fp8_ds_mla"
             logger.info("Using custom fp8 kv-cache format for DeepSeekV3.2")
         if cache_config.cache_dtype == "bfloat16":
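
The behavioural effect of this hunk can be sketched in isolation. The snippet below is a minimal illustration, not vLLM code: `resolve_before` and `resolve_after` are hypothetical helper names that model only the fp8 branch shown in the diff, and the input strings are example values. The `"bfloat16"` branch is left out because its body is not part of this hunk.

```python
def resolve_before(cache_dtype: str) -> str:
    """Hypothetical model of the old check: "auto" or any "fp8*" value
    is rewritten to the custom DeepSeek V3.2 format."""
    if cache_dtype == "auto" or cache_dtype.startswith("fp8"):
        return "fp8_ds_mla"
    return cache_dtype


def resolve_after(cache_dtype: str) -> str:
    """Hypothetical model of the new check: only an explicit "fp8*" value
    is rewritten; "auto" is left untouched by this branch."""
    if cache_dtype.startswith("fp8"):
        return "fp8_ds_mla"
    return cache_dtype


if __name__ == "__main__":
    # Example inputs only; the fp8 branch is the sole logic modelled here.
    for dtype in ("auto", "fp8", "fp8_e4m3"):
        print(f"{dtype!r}: before={resolve_before(dtype)!r}, "
              f"after={resolve_after(dtype)!r}")
```

Under this reading, a user who leaves the kv-cache dtype at `"auto"` no longer gets the custom fp8 format implicitly. It has to be requested with an explicit fp8 value, for example through vLLM's `--kv-cache-dtype` option.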
