
Fix memory management bug in llava and server code #5491

Merged (2 commits) on Feb 15, 2024

Conversation

@Elbios (Contributor) commented on Feb 14, 2024

Fixes this error:

llama_new_context_with_model: graph splits (measure): 3
Available slots:
 -> Slot 0 - max context: 6000
{"timestamp":1707926446,"level":"INFO","function":"main","line":2623,"message":"model loaded"}
all slots are idle and system prompt is empty, clear the KV cache
slot 0 - loaded image
slot 0 is processing [task id: 0]
slot 0 : kv cache rm - [0, end)
slot 0 - encoding image [id: 1]
munmap_chunk(): invalid pointer
Aborted

when running the server binary like this:

./bin/server -m ../models/mistral-7b-q_5_k.gguf --mmproj ../models/mmproj-mistral7b-f16-q6_k.gguf -ngl 50 -c 6000 --host 0.0.0.0 --port 8007 --no-mmap

Tested on:
Linux, WSL (Debian)
GPU: 4090

@Elbios mentioned this pull request on Feb 14, 2024
Comment on lines 63 to 64
CLIP_API void clip_image_u8_batch_free (struct clip_image_u8 * data);
CLIP_API void clip_image_f32_batch_free(struct clip_image_f32 * data);
ggerganov (Owner) commented:

Wouldn't it be better to change these to:

CLIP_API void clip_image_u8_batch_free(struct clip_image_u8_batch * batch) {
    if (batch->size > 0) {
        delete[] batch->data;
    }
    batch->size = 0;
}

@Elbios (Author) replied:

Agreed, I changed it and retested.
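
For reference, a minimal sketch of what the revised pair of wrappers could look like after this change. The data/size field names and the symmetric clip_image_f32_batch variant are assumptions extrapolated from the snippet above, not the exact merged code:

// Sketch only: free the array owned by a batch and reset its size,
// so a second call on the same batch becomes a harmless no-op.
CLIP_API void clip_image_u8_batch_free(struct clip_image_u8_batch * batch) {
    if (batch->size > 0) {
        delete[] batch->data;
        batch->data = nullptr; // also drop the stale pointer (assumed field)
    }
    batch->size = 0;
}

CLIP_API void clip_image_f32_batch_free(struct clip_image_f32_batch * batch) {
    if (batch->size > 0) {
        delete[] batch->data;
        batch->data = nullptr;
    }
    batch->size = 0;
}

Resetting size to 0 is what makes the wrapper safe to call twice: a later free, or a preprocess call that checks size > 0 before freeing, sees an empty batch instead of a dangling pointer.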

@@ -1494,11 +1506,8 @@ bool clip_image_preprocess(struct clip_ctx * ctx, const clip_image_u8 * img, cli
pad_to_square = false;
}
// free the previous res_imgs if any set
if (res_imgs.size > 0 && res_imgs.size < 100) {
@Elbios (Author) commented:

oh, I removed the upper bound because there didn't seem to be any justification for it, but if there is then let me know @cmp-nct and I'll restore it

cmp-nct (Contributor) replied:

> oh, I removed the upper bound because there didn't seem to be any justification for it, but if there is then let me know @cmp-nct and I'll restore it

The reason for the upper bound was a safety check for the case where the passed structure points to uninitialized memory; a garbage size value would almost certainly fall outside that range. So it's only relevant if someone uses the API wrong, and I'm fine either way.

Glad you spotted the double free; it's another remnant of the vector->pointer refactor.
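
A minimal, self-contained illustration of the double-free pattern described above (the batch struct and fill function are invented for the example; this is not the actual llava/server code):

#include <cstddef>

// Hypothetical stand-in for a batch struct after the vector->pointer refactor.
struct batch { float * data = nullptr; size_t size = 0; };

// The callee frees any previous contents before writing new ones,
// mirroring the "free the previous res_imgs if any set" check above.
void fill(batch & b) {
    if (b.size > 0) {
        delete[] b.data;   // first free
    }
    b.data = new float[16];
    b.size = 16;
}

int main() {
    batch b;
    fill(b);
    delete[] b.data;       // caller frees, but forgets to reset b.size
    fill(b);               // b.size is still 16, so the callee frees again: double free
}

Routing every free through a wrapper that also zeroes size, as this PR does, turns the second free into a no-op.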

jpohhhh added a commit to Telosnex/fllama that referenced this pull request Feb 15, 2024
@ggerganov ggerganov merged commit 0d41771 into ggerganov:master Feb 15, 2024
45 of 53 checks passed
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* Fix memory management in llava and server code
* Make it cleaner by checking size in batch free wrapper
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* Fix memory management in llava and server code
* Make it cleaner by checking size in batch free wrapper