[V1][TPU] Integrate the new ragged paged attention kernel with vLLM v1 on TPU #13379
Merged
mgoin merged 21 commits into vllm-project:main from vanbasten23:xiowei/tpu_v1_kernel_integration_take2 on Feb 28, 2025
8e5bbb3
merge prompt and decode
vanbasten23 d0eac0f
add more comments
vanbasten23 2830ed4
cleaned up a bit
vanbasten23 2316f14
disable print, enable torch.compile
vanbasten23 f5d5429
pad block_table 2nd dim to a multiple of 128 to accommodate the kernel.
vanbasten23 08cda8f
Updated the torch_xla pin again: the smem oom is gone. Also use the r…
vanbasten23 89ea8f1
remove total_num_scheduled_tokens from attn_metadata. But it didn't h…
vanbasten23 6520319
pull total_num_scheduled_tokens and logits_indices out of ModelWrappe…
vanbasten23 e272741
change position_ids to use int32 instead of int64
vanbasten23 8f28a14
fix merge conflict
vanbasten23 ea03585
removed my comments
vanbasten23 84ec082
remove some comments. Fix mypy annotation.
vanbasten23 c9096e3
correctly initiate the query_start_loc in dummy_run
vanbasten23 bb08ce9
resolve merge conflict
vanbasten23 1984125
after rebase, it couldn't run. I fixed some issues so it runs to compl…
vanbasten23 e8a7f9b
clean up
vanbasten23 1a942d5
run linter
vanbasten23 0592890
Bump torch_xla version again
vanbasten23 b9bac9a
Fix lint issues
mgoin 6bf9e68
Revert basic
mgoin 175224a
Keep import torch_xla.experimental.custom_kernel
mgoin