[KVCache] Attention func accepting over-padded qkv and output NDArray #17401

MasterJH5574 · 2024-09-22T05:27:23Z

This PR enhances the `AttentionWithFusedQKV` function of `PagedKVCache` so that it accepts input `qkv_data` and `o_data` that are padded along the sequence dimension.

This gives callers of `PagedKVCache` the flexibility to decide whether or not to pad the input qkv/o NDArrays.
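To make the idea of over-padded buffers concrete, here is a minimal pure-Python sketch. It is not the TVM implementation and does not use any TVM API. The function name `attention_padded`, its parameters, the toy single-head causal attention, and the assumption that padding rows are trailing rows that are neither read nor written are all hypothetical, chosen only for illustration.

```python
import math


def attention_padded(qkv, out, num_valid, head_dim):
    """Toy single-head causal attention over the first `num_valid` rows.

    qkv: list of rows, each of length 3 * head_dim (q | k | v fused).
    out: list of rows, each of length head_dim; may have more rows than num_valid.
    Assumption for this sketch: rows at index >= num_valid are padding and are
    never read from `qkv` and never written in `out`.
    """
    if num_valid > len(qkv) or num_valid > len(out):
        raise ValueError("buffers are shorter than the number of valid tokens")

    # Split the fused rows into q, k, v, looking only at the valid prefix.
    q = [row[:head_dim] for row in qkv[:num_valid]]
    k = [row[head_dim:2 * head_dim] for row in qkv[:num_valid]]
    v = [row[2 * head_dim:] for row in qkv[:num_valid]]

    scale = 1.0 / math.sqrt(head_dim)
    for i in range(num_valid):
        # Causal mask: token i attends to tokens 0..i.
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j]))
                  for j in range(i + 1)]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out[i] = [sum(w * v[j][c] for j, w in enumerate(weights)) / total
                  for c in range(head_dim)]
    return out


head_dim = 2
valid = [[1.0, 0.0, 1.0, 0.0, 1.0, 2.0],
         [0.0, 1.0, 0.0, 1.0, 3.0, 4.0]]
padding = [[99.0] * 6, [99.0] * 6]  # arbitrary values in the padded rows
sentinel = -1.0

# Exactly-sized buffers versus over-padded buffers.
exact_out = attention_padded(valid, [[sentinel] * 2 for _ in range(2)], 2, head_dim)
padded_out = attention_padded(valid + padding,
                              [[sentinel] * 2 for _ in range(4)], 2, head_dim)
```

In this sketch the valid rows of `padded_out` match `exact_out`, and the trailing rows of the output buffer keep their sentinel values, which is the behavior a caller would rely on when choosing to pad.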

tqchen approved these changes Sep 22, 2024

tqchen merged commit ce46185 into apache:main Sep 22, 2024

ysh329 mentioned this pull request Oct 16, 2024

[Release] v0.18.0 Release Candidate Notes #17468

Closed

kurisu6912 mentioned this pull request Sep 5, 2025

kurisu add assume attr patch 1 tile-ai/tvm#8

Closed
