@LucasWilkinson LucasWilkinson commented Oct 14, 2025

Preparatory refactor PR for: #24002 (#23789)

Separate the attention metadata building logic out of prepare inputs so that it can be reused for dummy_runs, and so that padding for CUDA graphs can be set up before the attention metadata is built (see #23789 for motivation).

Move _compute_cascade_attn_prefix_lens out of the metadata building loops: whether cascade attention is in use is needed by the cudagraph dispatcher, which will run before attention metadata building in #24002.

NOTE: renamed kv_cache_group_id to kv_cache_gid ("gid" is a common abbreviation for group ID in Linux) so that more lines fit within the formatter's max line width.
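To make the shape of the refactor concrete, here is a minimal, self-contained sketch of the separation described above. It is not the vLLM implementation: every name in it (`ToyAttnMetadata`, `compute_use_cascade`, `pad_num_tokens`, `build_attn_metadata`, `prepare_inputs`, `dummy_run`) and every parameter is a hypothetical stand-in chosen for illustration. It only shows the ordering the description argues for: decide on cascade attention first, pad the token count next, and build metadata last through one function shared by the real path and the dummy run.

```python
from dataclasses import dataclass


@dataclass
class ToyAttnMetadata:
    """Hypothetical stand-in for the metadata of one KV cache group."""
    kv_cache_gid: int
    num_reqs: int
    num_tokens: int  # token count after any padding
    use_cascade_attn: bool


def compute_use_cascade(common_prefix_lens: list[int]) -> bool:
    """Decide on cascade attention once, outside the metadata loop."""
    return any(n > 0 for n in common_prefix_lens)


def pad_num_tokens(num_tokens: int, capture_sizes: list[int]) -> int:
    """Round up to the smallest captured size that fits, if there is one."""
    for size in sorted(capture_sizes):
        if size >= num_tokens:
            return size
    return num_tokens  # nothing fits, so run unpadded


def build_attn_metadata(num_reqs: int, num_tokens: int,
                        num_kv_cache_groups: int,
                        use_cascade_attn: bool) -> list[ToyAttnMetadata]:
    """Shared builder: one metadata object per KV cache group."""
    return [
        ToyAttnMetadata(gid, num_reqs, num_tokens, use_cascade_attn)
        for gid in range(num_kv_cache_groups)
    ]


def prepare_inputs(tokens_per_req: list[int], common_prefix_lens: list[int],
                   capture_sizes: list[int],
                   num_kv_cache_groups: int) -> list[ToyAttnMetadata]:
    """Real path: cascade decision and padding happen before the build."""
    num_tokens = sum(tokens_per_req)
    use_cascade = compute_use_cascade(common_prefix_lens)
    padded = pad_num_tokens(num_tokens, capture_sizes)
    return build_attn_metadata(len(tokens_per_req), padded,
                               num_kv_cache_groups, use_cascade)


def dummy_run(num_tokens: int,
              num_kv_cache_groups: int) -> list[ToyAttnMetadata]:
    """Dummy path: reuses the same builder with synthetic inputs."""
    return build_attn_metadata(1, num_tokens, num_kv_cache_groups, False)
```

Because both entry points go through the same builder, any later change to how metadata is constructed applies to real and dummy runs alike, which is the reuse this PR is preparing for.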

@mergify mergify bot added the v1 label Oct 14, 2025
@LucasWilkinson LucasWilkinson added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 14, 2025
@LucasWilkinson LucasWilkinson marked this pull request as ready for review October 14, 2025 16:49
@LucasWilkinson LucasWilkinson force-pushed the lwilkinson/seperate-build-attn-metadata branch from 3bdfe10 to e48ab54 Compare October 14, 2025 21:21

@fhl2000 fhl2000 left a comment

Hi @LucasWilkinson, thank you for refactoring this! Just left one question, otherwise looking great.

@LucasWilkinson LucasWilkinson force-pushed the lwilkinson/seperate-build-attn-metadata branch from db8f24c to e9387ea Compare October 25, 2025 08:51

@fhl2000 fhl2000 left a comment

Overall looks good! Any chance of also separating the attn_metadata building logic into a function for the Eagle drafter? Then we could later reuse it for both the drafter execution and its dummy run (in preparation for full CUDA graph support).

mergify bot commented Nov 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @LucasWilkinson.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 8, 2025
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
@LucasWilkinson LucasWilkinson force-pushed the lwilkinson/seperate-build-attn-metadata branch from 2814f5f to 651b9de Compare November 8, 2025 20:54
@mergify mergify bot removed the needs-rebase label Nov 8, 2025
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>

@mgoin mgoin left a comment

LGTM, nothing clearly stood out as missed

@LucasWilkinson LucasWilkinson merged commit 636efd1 into vllm-project:main Nov 9, 2025
46 checks passed
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025
…puts (vllm-project#26764)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1

3 participants