
Fix kv cache data pointers #1104

Merged (1 commit) — Apr 21, 2023
Conversation

@xaedes (Collaborator) commented Apr 21, 2023

Currently, the functions that set the kv_cache overwrite the data pointers of the k and v tensors, because the pointer addresses are stored in the memory block (kv_self.buf) itself and are then overwritten by the memcpy.

Restoring the cache only works correctly within the same runtime session, since the data pointers will not have changed in the meantime.
I have seen people test the kv_cache get and set by freeing the kv_cache ggml context, creating a new context, and restoring into that. Most likely the same memory block was allocated in the second context, which is why it did not segfault.

When storing the cache to a file, restarting the program, and then loading the cache, the pointers will be wrong and llama_eval will segfault.

To fix the problem, I save the data pointers before memcpy overwrites kv_self.buf, and then simply restore them afterwards.

Commit message: "because their value is stored in buf and overwritten by memcpy"
@ggerganov (Member) left a comment:


Smart solution!

@ggerganov ggerganov merged commit 8687c1f into ggml-org:master Apr 21, 2023
jeroen-mostert added a commit to jeroen-mostert/llama.cpp that referenced this pull request Aug 30, 2024
…fixes ggml-org#1104)

Signed-off-by: Jeroen Mostert <jeroen.mostert@cm.com>