feat: Add mask to merge_state_in_place #372

Merged
merged 1 commit into from
Jul 13, 2024

Yard1
Contributor

@Yard1 Yard1 commented Jul 13, 2024

This pushes the conditional logic down into the kernel, allowing better CUDA graph support with variable sequence lengths. I didn't see much purpose in adding the mask parameter to the out-of-place merge state kernels.
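
For readers unfamiliar with the motivation: a captured CUDA graph replays a fixed sequence of kernel launches, so a host-side `if` that decides whether to merge a given sequence cannot change between replays. A mask held in device memory lets that decision vary per replay. The sketch below is a plain-Python reference of what a masked in-place merge could look like, not the kernel from this PR. The function name `merge_state_in_place_ref`, the list-based layout (one vector per sequence position, no head dimension), and the use of natural-log log-sum-exp values are all assumptions made for illustration.

```python
import math


def merge_state_in_place_ref(v, s, v_other, s_other, mask=None):
    """Hypothetical reference: merge (v_other, s_other) into (v, s) in place.

    v, v_other: list of per-position vectors (lists of floats).
    s, s_other: list of per-position log-sum-exp values (natural log assumed).
    mask: optional list of bools; positions where it is False are left untouched.
    """
    for i in range(len(v)):
        # The per-position check lives inside the "kernel" loop,
        # so the caller never branches on sequence contents.
        if mask is not None and not mask[i]:
            continue
        # Subtract the max before exponentiating for numerical stability.
        m = max(s[i], s_other[i])
        w_a = math.exp(s[i] - m)
        w_b = math.exp(s_other[i] - m)
        denom = w_a + w_b
        v[i] = [(w_a * a + w_b * b) / denom for a, b in zip(v[i], v_other[i])]
        s[i] = m + math.log(denom)


# Usage: position 0 is merged, position 1 is masked out and stays unchanged.
v = [[1.0, 2.0], [3.0, 4.0]]
s = [0.0, 0.0]
v_other = [[3.0, 4.0], [5.0, 6.0]]
s_other = [0.0, 0.0]
merge_state_in_place_ref(v, s, v_other, s_other, mask=[True, False])
```

With equal log-sum-exp values the merged vector is the plain average of the two inputs, which makes the masked and unmasked positions easy to tell apart.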

Collaborator

@yzh119 yzh119 left a comment

LGTM, the mask looks like a good feature to have. Thank you, @Yard1!

@yzh119 yzh119 merged commit e14fa81 into flashinfer-ai:main Jul 13, 2024
@Yard1 Yard1 deleted the cascade_with_mask branch July 13, 2024 02:53
yzh119 pushed a commit that referenced this pull request Jul 17, 2024
🤖 I have created a release *beep* *boop*
---


## [0.1.0](v0.0.9...v0.1.0) (2024-07-17)


### Features

* Add mask to `merge_state_in_place` ([#372](#372)) ([e14fa81](e14fa81))
* expose pytorch api for block sparse attention ([#375](#375)) ([4bba6fa](4bba6fa))
* Fused GPU sampling kernel for joint top-k & top-p sampling ([#374](#374)) ([6e028eb](6e028eb))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>