Skip to content

Conversation

@FENP
Copy link
Contributor

@FENP FENP commented Oct 10, 2025

Purpose

#25444 change default CUDAGraphMode from PIECEWISE to FULL_AND_PIECEWISE. However, DCP do not support full cuda graphs now (#26022 (comment)). This PR change default CUDAGraphMode to PIECEWISE when enable DCP.

cc @youzhedian @youkaichao @LucasWilkinson

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses an issue where enabling Decode Context Parallelism (DCP) was incompatible with full CUDA graph modes. The proposed fix forces the cudagraph_mode to PIECEWISE when DCP is active. While the intent is correct, the implementation is overly aggressive and will override a user's explicit choice to disable CUDA graphs entirely (cudagraph_mode=NONE). My review includes a critical comment to refine this logic, ensuring it only downgrades from FULL modes to PIECEWISE and warns the user, without affecting NONE mode.

Comment on lines 351 to 352
if self.parallel_config.decode_context_parallel_size > 1:
self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This implementation unconditionally sets cudagraph_mode to PIECEWISE if decode context parallelism (DCP) is enabled. This is too aggressive as it will override a user's explicit choice to disable CUDA graphs (e.g., cudagraph_mode=NONE), which might be done for debugging purposes.

A better approach is to only downgrade the mode to PIECEWISE if a FULL CUDA graph mode was requested, as those are the ones incompatible with DCP. This change also adds a warning to inform the user about the automatic adjustment.

if self.parallel_config.decode_context_parallel_size > 1 and \
                    self.compilation_config.cudagraph_mode.has_full_cudagraphs():
                    logger.warning(
                        "Decode context parallel (DCP) is enabled, which is "
                        "incompatible with full CUDA graphs. Downgrading "
                        "cudagraph_mode from %s to PIECEWISE.",
                        self.compilation_config.cudagraph_mode.name)
                    self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These code snippets will only execute when cudagraph_mode is not explicitly set by users.

FENP added 2 commits October 10, 2025 16:38
Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Copy link
Member

@youkaichao youkaichao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @LucasWilkinson @youzhedian it should be possible to make dcp compatible with full cudagraph.

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
@youkaichao youkaichao enabled auto-merge (squash) October 11, 2025 03:13
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 11, 2025
@youkaichao youkaichao merged commit b91d8db into vllm-project:main Oct 12, 2025
46 checks passed
1994 pushed a commit to 1994/vllm that referenced this pull request Oct 14, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: 1994 <1994@users.noreply.github.com>
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
@FENP FENP deleted the bugfix/dcp_graph_mode branch October 15, 2025 09:00
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: bbartels <benjamin@bartels.dev>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…oject#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants