Commit dab72bf

russellb authored and xuebwang-amd committed
[Core] Force PIECEWISE CUDAGraph mode for encoder-decoder (vllm-project#25701)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
1 parent 04bcde1 commit dab72bf

1 file changed: +4 -2 lines changed

vllm/config/__init__.py

Lines changed: 4 additions & 2 deletions
@@ -364,9 +364,11 @@ def __post_init__(self):
                 self.compilation_config.cudagraph_mode = \
                     CUDAGraphMode.FULL_AND_PIECEWISE
 
-                # pooling model does not support full cudagraphs
+                # pooling models and encoder-decoder models
+                # do not support full cudagraphs
                 if self.model_config is not None and \
-                    self.model_config.pooler_config is not None:
+                    (self.model_config.pooler_config is not None
+                     or self.model_config.is_encoder_decoder):
                     self.compilation_config.cudagraph_mode = \
                         CUDAGraphMode.PIECEWISE
             else:
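
The effect of the changed condition can be sketched in isolation. The snippet below is a minimal, standalone illustration and not vLLM code: `ModelConfigStub` and `pick_default_cudagraph_mode` are hypothetical names, and the local `CUDAGraphMode` enum is a stand-in that carries only the two members named in the diff. It assumes the surrounding branch has already chosen `FULL_AND_PIECEWISE` as the default.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CUDAGraphMode(Enum):
    # Stand-in enum (assumption): only the two members that appear in the diff.
    PIECEWISE = "piecewise"
    FULL_AND_PIECEWISE = "full_and_piecewise"


@dataclass
class ModelConfigStub:
    # Hypothetical stand-in exposing only the two attributes the diff reads.
    pooler_config: Optional[object] = None
    is_encoder_decoder: bool = False


def pick_default_cudagraph_mode(
        model_config: Optional[ModelConfigStub]) -> CUDAGraphMode:
    """Mirror the patched condition: start from FULL_AND_PIECEWISE and fall
    back to PIECEWISE for pooling models and encoder-decoder models."""
    mode = CUDAGraphMode.FULL_AND_PIECEWISE
    if model_config is not None and (
            model_config.pooler_config is not None
            or model_config.is_encoder_decoder):
        mode = CUDAGraphMode.PIECEWISE
    return mode


if __name__ == "__main__":
    # Plain decoder-only model: keeps the default.
    print(pick_default_cudagraph_mode(ModelConfigStub()))
    # Encoder-decoder model: forced to PIECEWISE after this commit.
    print(pick_default_cudagraph_mode(ModelConfigStub(is_encoder_decoder=True)))
```

Before this commit only the `pooler_config` check was present, so an encoder-decoder model without a pooler would have kept `FULL_AND_PIECEWISE`.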
