[Runtime] Support PagedKVCache with tree attention (#17049)
* [Runtime] Support PagedKVCache with tree attention

  This PR introduces tree attention to PagedKVCache. With this feature, the KV cache is ready for tree attention use cases such as speculative decoding trees.

  This PR also adds tree attention tests to verify correctness.

  The changes in this PR to the KVState interface are backward compatible.

* Update kv_state.cc

* Update kv_state.cc

---------

Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com>
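For readers unfamiliar with tree attention, here is a minimal sketch of the general idea, not the code in this PR. In a speculative decoding tree, each draft token should attend only to itself and its ancestors in the tree (in addition to the already-committed prefix in the cache, which this sketch leaves out). The function name `tree_attention_mask` and the parent-index encoding of the tree are assumptions made for illustration; the actual PagedKVCache interface may represent the tree differently.

```python
from typing import List


def tree_attention_mask(parents: List[int]) -> List[List[bool]]:
    """Return mask[i][j] == True when tree node i may attend to node j.

    Hypothetical encoding: parents[i] is the index of node i's parent,
    or -1 if node i is a root. Nodes are assumed to be listed so that
    every parent appears before its children.
    """
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        if not -1 <= parents[i] < i:
            raise ValueError(f"node {i} has invalid parent {parents[i]}")
        # Walk from node i up to its root, marking each node on the path.
        j = i
        while j != -1:
            mask[i][j] = True
            j = parents[j]
    return mask


# A small tree: node 0 is the root, nodes 1 and 2 are its children,
# and node 3 is a child of node 1.
example_mask = tree_attention_mask([-1, 0, 0, 1])
```

In this example node 3 attends to nodes 0, 1, and 3 but not to its sibling branch (node 2). A plain causal mask is the special case where the tree is a single chain, that is, `parents = [-1, 0, 1, ...]`.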
1 parent 515c079, commit 31f4721
Showing 5 changed files with 1,149 additions and 115 deletions.