Commit b91d8db

[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (#26574)
Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
1 parent 045b396 commit b91d8db

1 file changed (+9, -0)

vllm/config/vllm.py

Lines changed: 9 additions & 0 deletions
@@ -350,6 +350,15 @@ def __post_init__(self):
         or self.model_config.is_encoder_decoder
     ):
         self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE
+
+    # decode context parallel do not support full cudagraphs now.
+    if self.parallel_config.decode_context_parallel_size > 1:
+        logger.warning(
+            "Decode context parallel (DCP) is enabled, which is "
+            "incompatible with full CUDA graphs. Set "
+            "cudagraph_mode to PIECEWISE."
+        )
+        self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE
 else:
     self.compilation_config.cudagraph_mode = CUDAGraphMode.NONE
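The effect of the added branch can be illustrated with a small standalone sketch. It does not import vLLM. `CUDAGraphMode` here is a stand-in enum with only the members needed, and `default_cudagraph_mode`, `graphs_enabled`, `needs_piecewise_model`, and `initial` are hypothetical names introduced for illustration. The outer condition that the `else:` pairs with and the starting default are not visible in this hunk, so both are assumptions.

```python
import logging
from enum import Enum

logger = logging.getLogger("dcp_sketch")


class CUDAGraphMode(Enum):
    # Stand-in for the enum named in the diff, not vLLM's real class.
    NONE = 0
    PIECEWISE = 1
    FULL_AND_PIECEWISE = 2  # assumed starting default; not shown in the hunk


def default_cudagraph_mode(
    graphs_enabled: bool,
    needs_piecewise_model: bool,
    dcp_size: int,
    initial: CUDAGraphMode = CUDAGraphMode.FULL_AND_PIECEWISE,
) -> CUDAGraphMode:
    """Pick a default mode the way the patched branch appears to.

    graphs_enabled stands in for the outer condition hidden by the hunk;
    needs_piecewise_model stands in for the pooling / encoder-decoder check.
    """
    if not graphs_enabled:
        # Mirrors the `else:` branch of the diff.
        return CUDAGraphMode.NONE

    mode = initial
    if needs_piecewise_model:
        mode = CUDAGraphMode.PIECEWISE

    # The branch added by this commit: DCP cannot use full CUDA graphs,
    # so any decode_context_parallel_size above 1 falls back to PIECEWISE.
    if dcp_size > 1:
        logger.warning(
            "Decode context parallel (DCP) is enabled, which is "
            "incompatible with full CUDA graphs. Set "
            "cudagraph_mode to PIECEWISE."
        )
        mode = CUDAGraphMode.PIECEWISE
    return mode


if __name__ == "__main__":
    for size in (1, 2):
        print(size, default_cudagraph_mode(True, False, size).name)
```

Because the DCP check runs after the model-type check and is independent of it, the fallback and its warning apply to any model once `decode_context_parallel_size` exceeds 1, while the `NONE` path is left unchanged.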
