llama : refactor kv cache guard #12695

ggerganov · 2025-04-01T17:10:47Z

Simplify the KV cache guard mechanism. Prepare for separate recurrent cache implementation.

Also, llama_decode now correctly returns 1 when the batch cannot fit in the KV cache, and the KV cache state is correctly restored when processing the batch fails.
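To illustrate the behaviour described above, here is a minimal, self-contained sketch of a scope guard that rolls back cache changes unless they are explicitly committed. It is not the code from this PR: `toy_kv_cache`, `toy_kv_guard` and `toy_decode` are hypothetical names invented for the example, and the slot search is deliberately naive. The only behaviour taken from the description is that decoding returns 1 when no slot is found and that the cache is left as it was before the failed call.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a KV cache: a cell is free (-1) or holds a position.
struct toy_kv_cache {
    std::vector<int32_t> cells;
    // Cell ranges [begin, end) claimed since the last commit/restore.
    std::vector<std::pair<uint32_t, uint32_t>> pending;

    explicit toy_kv_cache(uint32_t size) : cells(size, -1) {}

    // Claim the first run of n contiguous free cells and record it as pending.
    bool find_slot(uint32_t n, int32_t pos0) {
        if (n == 0 || n > cells.size()) {
            return false;
        }
        uint32_t run = 0;
        for (uint32_t i = 0; i < cells.size(); ++i) {
            run = (cells[i] < 0) ? run + 1 : 0;
            if (run == n) {
                const uint32_t begin = i + 1 - n;
                for (uint32_t j = 0; j < n; ++j) {
                    cells[begin + j] = pos0 + (int32_t) j;
                }
                pending.push_back({begin, i + 1});
                return true;
            }
        }
        return false;
    }

    // Undo every claim made since the last commit.
    void restore() {
        for (const auto & r : pending) {
            for (uint32_t i = r.first; i < r.second; ++i) {
                cells[i] = -1;
            }
        }
        pending.clear();
    }

    // Make the pending claims permanent.
    void commit() { pending.clear(); }

    uint32_t used() const {
        uint32_t n = 0;
        for (int32_t c : cells) {
            n += (c >= 0) ? 1 : 0;
        }
        return n;
    }
};

// Hypothetical guard: restores on scope exit; a no-op if commit() was called.
struct toy_kv_guard {
    explicit toy_kv_guard(toy_kv_cache & kv) : kv(kv) {}
    ~toy_kv_guard() { kv.restore(); }
    void commit() { kv.commit(); }
    toy_kv_cache & kv;
};

// Returns 0 on success and 1 when some micro-batch does not fit.
int toy_decode(toy_kv_cache & kv, const std::vector<uint32_t> & ubatches, int32_t pos0) {
    toy_kv_guard guard(kv);
    for (uint32_t n : ubatches) {
        if (!kv.find_slot(n, pos0)) {
            return 1; // guard destructor rolls back earlier micro-batches
        }
        pos0 += (int32_t) n;
    }
    guard.commit();
    return 0;
}
```

In this sketch a caller can treat a return value of 1 as recoverable: the cache holds exactly what it held before the call, so retrying with a smaller batch or after freeing cells is safe.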

ggml-ci

LostRuins · 2025-04-14T06:52:28Z

After this commit, RNN-based models such as RWKV no longer seem to work: they hit an assert at llama-kv-cache.cpp:594: GGML_ASSERT(empty_cell.is_empty()) failed. Reverting the early return line at

llama.cpp/src/llama-kv-cache.cpp

Line 208 in 626f822

return true;

seems to allow RWKV to work again.

cc: @MollySophia

ggerganov added 6 commits April 1, 2025 20:09

llama : refactor kv cache guard

f1d179e

ggml-ci

cont : fix comment [no ci]

4fdd6e5

llama : fix kv_cache restore logic

623954b

ggml-ci

context : simplify kv cache updates

5c84488

ggml-ci

cont : better name [no ci]

eb5518f

llama : fix llama_decode return code when could not find KV slot

2c41dff

ggml-ci

github-actions bot added the examples label Apr 2, 2025

ggerganov added 2 commits April 2, 2025 14:10

context : change log err -> warn [no ci]

8ab37b1

kv-cache : add comment + warning [no ci]

626f822

ggerganov merged commit a10b36c into master Apr 2, 2025
1 check passed

ggerganov deleted the gg/llama-kv-cache-v4 branch April 2, 2025 11:33

hnfong mentioned this pull request Apr 3, 2025

Eval bug: commit: no pending KV cache updates to commit - might indicate a bug #12730

Closed

This was referenced Apr 4, 2025

llama : add llama_batch_ext #11875

Open

kv-cache : simplify + fix warning for recurrent models #12756

Merged
