[Kernel] Support DCP for Triton backend #25132
Conversation
Code Review
This pull request extends Distributed Context Parallelism (DCP) support to the Triton backend by enabling the return of Log-Sum-Exp (LSE) values from attention kernels. The changes are generally well-implemented, but I have identified a critical issue where the Multi-Head Attention (MHA) path appears to be broken due to an incomplete function signature update. Additionally, a temporary test script with user-specific configurations seems to have been included by mistake and should be removed.
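For context on what "returning the LSE" means here, below is a minimal PyTorch sketch, illustrative only and not the Triton kernel itself (`attention_with_lse` and the tensor shapes are assumptions): in addition to the attention output, the kernel also exposes the per-query log-sum-exp of the scaled scores, which downstream context-parallel logic can consume.

```python
# Minimal PyTorch reference (not the actual Triton kernel) showing what it
# means for an attention call to also return the LSE. Names are illustrative.
import torch

def attention_with_lse(q, k, v, scale):
    # q: [num_heads, q_len, head_dim], k/v: [num_heads, kv_len, head_dim]
    scores = torch.matmul(q, k.transpose(-1, -2)) * scale   # [H, q_len, kv_len]
    lse = torch.logsumexp(scores, dim=-1)                    # [H, q_len]
    out = torch.matmul(torch.softmax(scores, dim=-1), v)     # [H, q_len, head_dim]
    return out, lse

q = torch.randn(8, 4, 64)
k = torch.randn(8, 16, 64)
v = torch.randn(8, 16, 64)
out, lse = attention_with_lse(q, k, v, scale=64 ** -0.5)
print(out.shape, lse.shape)  # torch.Size([8, 4, 64]) torch.Size([8, 4])
```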
Do we have a perf comparison?
I do not have a perf comparison. This is mainly functional support, in particular adding the LSE return to the Triton kernel; it won't impact existing Triton kernel performance. With this, I think vLLM has completed FlashMLA, FA, and Triton backend support for CP.
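For readers wondering why CP needs the LSE at all: when each context-parallel rank attends over only its shard of the KV cache, the per-rank partial outputs can be combined exactly using their LSEs. Below is a hedged sketch of that standard log-sum-exp merge; `merge_partials` is a hypothetical name, not vLLM's actual combine code.

```python
# Sketch of the standard log-sum-exp merge of two partial attention results,
# e.g. from two context-parallel ranks that each saw part of the KV cache.
# This mirrors the flash-attention-style combine; it is not vLLM's code.
import torch

def merge_partials(out_a, lse_a, out_b, lse_b):
    # out_*: [H, q_len, head_dim], lse_*: [H, q_len]
    lse = torch.logaddexp(lse_a, lse_b)          # combined normalizer
    w_a = torch.exp(lse_a - lse).unsqueeze(-1)   # rescale factor for partial A
    w_b = torch.exp(lse_b - lse).unsqueeze(-1)   # rescale factor for partial B
    return w_a * out_a + w_b * out_b, lse
```

The same formula extends to more than two partials by folding them in pairwise.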
Thanks! This will help with running MLA on Ampere or lower-end GPUs.
Please fix the pre-commit errors.
This pull request has merge conflicts that must be resolved before it can be merged.
Purpose
As a follow-up to #23734, this PR makes changes to support the Triton backend for DCP. Specifically, it 1) returns the LSE from the Triton kernel and 2) fixes a bug in DeepSeekV2 that could potentially modify the `residual` variable (a sketch of that kind of hazard is shown below).
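One common way a residual can be unintentionally modified is through tensor aliasing combined with in-place ops; the minimal PyTorch sketch below illustrates that general pattern under that assumption and is not the actual DeepSeekV2 code.

```python
# Hypothetical sketch of the aliasing hazard: if a later op writes into the
# same storage that `residual` refers to, the saved residual is silently
# corrupted. Not the actual DeepSeekV2 code.
import torch

hidden_states = torch.randn(4, 8)
residual = hidden_states                     # alias, not a copy

hidden_states.add_(1.0)                      # in-place update also changes `residual`
print(torch.equal(residual, hidden_states))  # True -- residual was modified

# Avoiding the hazard: keep residual as a separate tensor (or use out-of-place ops).
hidden_states = torch.randn(4, 8)
residual = hidden_states.clone()             # independent copy
hidden_states.add_(1.0)
print(torch.equal(residual, hidden_states))  # False -- residual preserved
```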
Test Plan
```bash
export CUDA_VISIBLE_DEVICES=4,5,6,7
export VLLM_USE_V1=1
export VLLM_ATTENTION_BACKEND=TRITON_MLA
export VLLM_LOG_LEVEL=DEBUG
pytest tests/distributed/test_context_parallel.py -s
```
Test Result
```text
=============================== warnings summary ===============================
:488
  :488: DeprecationWarning: builtin type SwigPyPacked has no module attribute
:488
  :488: DeprecationWarning: builtin type SwigPyObject has no module attribute

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================== 2 passed, 2 warnings in 212.42s (0:03:32) ===================
```
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.