
Allocate all temporary ggml_tensor_extra_gpu from a fixed-size buffer #2220

Merged (1 commit, Jul 14, 2023)

Conversation

bullno1 (Contributor) commented on Jul 14, 2023

Fix #2145.

Use a fixed buffer as suggested here: #2195 (comment)

ggerganov (Owner) left a comment

Ah yes - very clever, like a ring buffer 🦙

Would be nice to free the buffer upon exit, but this is good enough for now

@ggerganov ggerganov merged commit 7cdd30b into ggerganov:master Jul 14, 2023
24 checks passed

Successfully merging this pull request may close these issues.

[User] ggml_tensor->extra(s) are not freed at the end of llama_eval, causing a memory leak