fix crash issue on triton paged_pa_mqa#1853

Merged
valarLip merged 2 commits into main from
ganyi/fix_triton_version_paged_pa_mqa_logits
Jan 19, 2026
Conversation

@ganyi1996ppo
Contributor

Motivation

Since vLLM's nightly Docker image is still on Triton 3.4.0, we may need to keep using this kernel until a newer Triton version is available. This PR fixes the overflow issue in deepgemm_fp8_paged_mqa_logits_stage1. Once Triton is upgraded to 3.5.0, vLLM will fully adopt the Gluon implementation.

Technical Details

Test Plan

Test Result

Submission Checklist

Signed-off-by: ganyi <ygan@amd.com>
@ganyi1996ppo ganyi1996ppo requested review from a team and Copilot January 16, 2026 05:51
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes an accuracy issue in the Triton paged_pa_mqa kernel by addressing an overflow problem in the deepgemm_fp8_paged_mqa_logits_stage1 function. The fix converts stride_out_heads to a 64-bit integer to prevent overflow in pointer arithmetic calculations.

Changes:

  • Fixed overflow issue in pointer arithmetic by converting stride_out_heads to tl.int64
  • Updated copyright year to 2026
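The overflow described above can be illustrated outside of Triton. The following is a minimal sketch, not the code from this PR: it uses NumPy fixed-width integers to mimic what happens when an index is multiplied by a stride in 32-bit arithmetic versus 64-bit arithmetic. The helper `flat_offset` and the index and stride values are hypothetical and chosen only so that the product exceeds the int32 range.

```python
import numpy as np


def flat_offset(index, stride, dtype):
    """Return index * stride computed in fixed-width integers of `dtype`.

    Hypothetical helper for illustration only. One-element arrays are used
    so that NumPy wraps on overflow the way 32-bit hardware arithmetic does.
    """
    a = np.array([index], dtype=dtype)
    b = np.array([stride], dtype=dtype)
    return int((a * b)[0])


# Hypothetical values: the true product is 2_800_000_000, which is larger
# than the int32 maximum of 2_147_483_647.
index, stride = 70_000, 40_000

wrapped = flat_offset(index, stride, np.int32)  # wraps to a negative offset
widened = flat_offset(index, stride, np.int64)  # keeps the true product

print("int32 offset:", wrapped)
print("int64 offset:", widened)
```

In 32-bit arithmetic the product wraps around to a negative number, which as a pointer offset would address memory outside the output buffer. Widening the stride to 64 bits before the multiplication, which is what converting stride_out_heads to tl.int64 does in the kernel, keeps the offset in range.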

@ganyi1996ppo ganyi1996ppo changed the title fix accuracy issue on triton paged_pa_mqa fix crash issue on triton paged_pa_mqa Jan 16, 2026
Collaborator

stride_out_heads : tl.int64,

Signed-off-by: ganyi <ygan@amd.com>
@valarLip valarLip merged commit 371a22f into main Jan 19, 2026
16 of 19 checks passed
@valarLip valarLip deleted the ganyi/fix_triton_version_paged_pa_mqa_logits branch January 19, 2026 06:29
yzhou103 pushed a commit that referenced this pull request Jan 28, 2026
* fix accuracy issue on triton paged_pa_mqa

Signed-off-by: ganyi <ygan@amd.com>

* add int64 annotation for input stride

Signed-off-by: ganyi <ygan@amd.com>

---------

Signed-off-by: ganyi <ygan@amd.com>
2 participants