[Question] How to store/pass past-key-values? #681
Comments
Common prefix prompts in a batch?
Hi @YijiaZhao, thanks for your reply! Yes, indeed. I saw that you also worked on this (in Issue 680). Did you find a solution?
I haven't finished yet. I think it's easy to store the past KV cache, but hard to modify the attention for the postfix calculation.
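To illustrate the two halves of that remark (storing the prefix KV versus attending from the postfix), here is a minimal sketch in plain Python. It is a toy under stated assumptions: one attention head, one layer, no positional encoding, random weights. It does not use vLLM's API, and every name in it (`causal_attention`, `cached_k`, `offset`, and so on) is hypothetical. The part that needs care is the mask: each postfix query sits at an absolute position shifted by the prefix length, so it must see all cached keys plus the postfix keys up to itself.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v, offset=0):
    """Query row i sits at absolute position offset + i and may attend
    to keys at positions 0 .. offset + i (inclusive)."""
    out = []
    for i, q_row in enumerate(q):
        visible = offset + i + 1  # number of keys this query may see
        scale = math.sqrt(len(q_row))
        scores = [sum(a * b for a, b in zip(q_row, k[j])) / scale
                  for j in range(visible)]
        weights = softmax(scores)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights))
                    for c in range(len(v[0]))])
    return out


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
dim, prefix_len, suffix_len = 4, 5, 3
w_q, w_k, w_v = rand_matrix(dim, dim), rand_matrix(dim, dim), rand_matrix(dim, dim)
prefix = rand_matrix(prefix_len, dim)  # stand-in for prefix token embeddings
suffix = rand_matrix(suffix_len, dim)  # stand-in for postfix token embeddings

# Step 1: process the prefix once and keep its keys and values (the "past KV").
cached_k, cached_v = matmul(prefix, w_k), matmul(prefix, w_v)

# Step 2: for the postfix, compute only its own Q/K/V and append K/V to the cache.
q_suffix = matmul(suffix, w_q)
k_all = cached_k + matmul(suffix, w_k)
v_all = cached_v + matmul(suffix, w_v)
out_cached = causal_attention(q_suffix, k_all, v_all, offset=prefix_len)

# Reference: recompute everything from scratch over prefix + postfix.
tokens = prefix + suffix
out_full = causal_attention(matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v))

# The cached path should reproduce the last suffix_len rows of the full pass.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(out_cached, out_full[prefix_len:])
               for a, b in zip(row_a, row_b))
print(max_diff < 1e-9)
```

In a real multi-layer model the same idea would apply per layer (one K/V pair cached for each), and positional information for the postfix would also have to start at the prefix length. The sketch leaves both out for brevity.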
@YijiaZhao I am looking for this too. Were you able to crack it? Thanks!
Also looking for this.
Hi all,
I was wondering whether it is possible to store and pass on past key values from long prefix prompts. Has anyone tried this with vLLM before?
Many thanks!