[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP #26574

FENP · 2025-10-10T08:23:14Z

Purpose

#25444 changed the default CUDAGraphMode from PIECEWISE to FULL_AND_PIECEWISE. However, DCP does not currently support full CUDA graphs (#26022 (comment)). This PR changes the default CUDAGraphMode to PIECEWISE when DCP is enabled.

cc @youzhedian @youkaichao @LucasWilkinson

Test Plan

Test Result

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

gemini-code-assist

Code Review

This pull request addresses an issue where enabling Decode Context Parallelism (DCP) was incompatible with full CUDA graph modes. The proposed fix forces the cudagraph_mode to PIECEWISE when DCP is active. While the intent is correct, the implementation is overly aggressive and will override a user's explicit choice to disable CUDA graphs entirely (cudagraph_mode=NONE). My review includes a critical comment to refine this logic, ensuring it only downgrades from FULL modes to PIECEWISE and warns the user, without affecting NONE mode.

gemini-code-assist · 2025-10-10T08:24:50Z

vllm/config/vllm.py

+                    if self.parallel_config.decode_context_parallel_size > 1:
+                        self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE


This implementation unconditionally sets cudagraph_mode to PIECEWISE if decode context parallelism (DCP) is enabled. This is too aggressive as it will override a user's explicit choice to disable CUDA graphs (e.g., cudagraph_mode=NONE), which might be done for debugging purposes.

A better approach is to only downgrade the mode to PIECEWISE if a FULL CUDA graph mode was requested, as those are the ones incompatible with DCP. This change also adds a warning to inform the user about the automatic adjustment.

if self.parallel_config.decode_context_parallel_size > 1 and \
        self.compilation_config.cudagraph_mode.has_full_cudagraphs():
    logger.warning(
        "Decode context parallel (DCP) is enabled, which is "
        "incompatible with full CUDA graphs. Downgrading "
        "cudagraph_mode from %s to PIECEWISE.",
        self.compilation_config.cudagraph_mode.name)
    self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE

This code only executes when cudagraph_mode has not been explicitly set by the user.
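The two positions in this thread can be contrasted in a small standalone sketch. This is not the vLLM code: the `CUDAGraphMode` enum below is a simplified stand-in, and `resolve_default_mode` and `downgrade_if_full` are hypothetical helper names introduced only for illustration. The first helper models the PR's approach (choose a default only when the user left the mode unset), and the second models the reviewer's suggestion (downgrade only full modes, with a warning).

```python
import enum
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class CUDAGraphMode(enum.Enum):
    """Simplified stand-in for illustration; not the real vLLM enum."""
    NONE = "none"
    PIECEWISE = "piecewise"
    FULL = "full"
    FULL_AND_PIECEWISE = "full_and_piecewise"

    def has_full_cudagraphs(self) -> bool:
        return self in (CUDAGraphMode.FULL, CUDAGraphMode.FULL_AND_PIECEWISE)


def resolve_default_mode(user_mode: Optional[CUDAGraphMode],
                         dcp_size: int) -> CUDAGraphMode:
    """PR approach (hypothetical helper): only pick a default when unset."""
    if user_mode is not None:
        # An explicit user choice (including NONE) is left untouched here.
        return user_mode
    if dcp_size > 1:
        logger.warning("DCP is enabled; defaulting cudagraph_mode to PIECEWISE.")
        return CUDAGraphMode.PIECEWISE
    return CUDAGraphMode.FULL_AND_PIECEWISE


def downgrade_if_full(mode: CUDAGraphMode, dcp_size: int) -> CUDAGraphMode:
    """Reviewer's suggestion (hypothetical helper): downgrade full modes only."""
    if dcp_size > 1 and mode.has_full_cudagraphs():
        logger.warning(
            "DCP is incompatible with full CUDA graphs. Downgrading "
            "cudagraph_mode from %s to PIECEWISE.", mode.name)
        return CUDAGraphMode.PIECEWISE
    return mode
```

In both variants an explicit NONE survives when DCP is enabled, which is the concern the review raised. How an explicitly requested full mode should be handled together with DCP is outside this sketch.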

youkaichao

cc @LucasWilkinson @youzhedian it should be possible to make dcp compatible with full cudagraph.

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: 1994 <1994@users.noreply.github.com>

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: bbartels <benjamin@bartels.dev>

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

Set default cudagraph mode to piecewise for DCP

fd6b08b

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

FENP requested review from ProExpertProg, WoosukKwon, hmellor, houseroad, mgoin, robertgshaw2-redhat, simon-mo, tlrmchlsmth, yewentao256 and youkaichao as code owners October 10, 2025 08:23

gemini-code-assist bot reviewed Oct 10, 2025

FENP added 2 commits October 10, 2025 16:38

add warning logs

d97095f

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

pre-commit check fix

dace6f0

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

youkaichao approved these changes Oct 10, 2025

hmellor reviewed Oct 10, 2025

vllm/config/vllm.py (outdated, resolved)

remove redundant check

a022317

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

youkaichao enabled auto-merge (squash) October 11, 2025 03:13

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Oct 11, 2025

youkaichao merged commit b91d8db into vllm-project:main Oct 12, 2025
46 checks passed

1994 pushed a commit to 1994/vllm that referenced this pull request Oct 14, 2025

[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (vllm-pr…

b2e80a2

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: 1994 <1994@users.noreply.github.com>

FENP deleted the bugfix/dcp_graph_mode branch October 15, 2025 09:00

bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025

[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (vllm-pr…

bb095f4

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com> Signed-off-by: bbartels <benjamin@bartels.dev>

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (vllm-pr…

65bf7dd

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>

alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025

[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (vllm-pr…

9213c31

…oject#26574) Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
