
Conversation

@MasterJH5574
Contributor

This PR introduces the initial KV cache interface setup for multi-head latent attention (MLA) in DeepSeek models.

Some interface implementations are marked as TODO and will be implemented in the near future.
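For readers unfamiliar with MLA, the sketch below illustrates the general idea behind such a cache: instead of storing full per-head key/value tensors, MLA caches a low-rank compressed KV latent plus a small decoupled RoPE key per token. This is a hypothetical, simplified Python sketch for illustration only; the class name, fields, and use of NumPy are assumptions and do not reflect the actual TVM KV cache interface added by this PR.

```python
# Hypothetical sketch (NOT the TVM API): the core idea of an MLA-style KV cache.
from dataclasses import dataclass, field
from typing import List, Tuple

import numpy as np


@dataclass
class MLAKVCacheSketch:
    """Caches the compressed KV latent and the decoupled RoPE key per layer."""

    num_layers: int
    kv_lora_rank: int      # dimension of the compressed KV latent (e.g. 512 in DeepSeek-V2)
    qk_rope_head_dim: int  # dimension of the decoupled RoPE key (e.g. 64)
    # One growing buffer per layer; a production cache would use paged, preallocated storage.
    latent: List[np.ndarray] = field(init=False)
    rope_key: List[np.ndarray] = field(init=False)

    def __post_init__(self) -> None:
        self.latent = [np.empty((0, self.kv_lora_rank), dtype=np.float16)
                       for _ in range(self.num_layers)]
        self.rope_key = [np.empty((0, self.qk_rope_head_dim), dtype=np.float16)
                         for _ in range(self.num_layers)]

    def append(self, layer: int, kv_latent: np.ndarray, k_rope: np.ndarray) -> None:
        """Append the latent and RoPE key for newly processed tokens of one layer."""
        self.latent[layer] = np.concatenate([self.latent[layer], kv_latent], axis=0)
        self.rope_key[layer] = np.concatenate([self.rope_key[layer], k_rope], axis=0)

    def get(self, layer: int) -> Tuple[np.ndarray, np.ndarray]:
        """Return everything cached so far for this layer, for use in attention."""
        return self.latent[layer], self.rope_key[layer]
```

Because only the latent and a single small RoPE key are cached per token (rather than per-head K and V), the cache footprint is substantially smaller than a conventional multi-head KV cache.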
@MasterJH5574
Contributor Author

@tvm-bot rerun

@jinhongyii jinhongyii merged commit 8b4df72 into apache:main Jan 31, 2025
19 checks passed
ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025
