server : fix kv cache management #3588
Conversation
I checked to see if the colon thing could possibly be related to #3550, but that doesn't appear to be the case. Still easy to reproduce with this pull.
I don't see the issue anymore using this repro: #3575 (comment). Tried several hundred requests with zero appearances of the leading colons.
…example * 'master' of github.com:ggerganov/llama.cpp: (34 commits)

- examples : support LLaVA v1.5 (multimodal model) (ggerganov#3436)
- docs : fix typo GOMP_CPU_AFFINITY (ggerganov#3597)
- cmake : fix add_compile_options on macOS
- typo : it is `--n-gpu-layers` not `--gpu-layers` (ggerganov#3592)
- ci : check if there is enough VRAM (ggerganov#3596)
- server : add completion mode (no chat) (ggerganov#3582)
- prompts : add mnemonics.txt
- server : fix kv cache management (ggerganov#3588)
- main : fix session loading bug (ggerganov#3400)
- server : add parameter -tb N, --threads-batch N (ggerganov#3584)
- common : fix mirostat state when using multiple sequences (ggerganov#3543)
- batched : add bench tool (ggerganov#3545)
- examples : add batched.swift + improve CI for swift (ggerganov#3562)
- Add MPT model to supported models in README.md (ggerganov#3574)
- Minor improvements in GPT2 tokenizer (ggerganov#3567)
- readme : add bloom (ggerganov#3570)
- llm : add bloom models (ggerganov#3553)
- swift : improvements and fixes (ggerganov#3564)
- llm : add MPT support (ggerganov#3417)
- infill. : fix tokenization (ggerganov#3508)
- ...
I see the issue as of commit 1e0e873. I've seen it more often when regenerating outputs. See if you can reproduce it more frequently by submitting the same input over and over again. My general observation is that once you've triggered an appearance of a colon, further submissions of the same input without advancing the conversation will cause an additional colon to prefix the output. (not always, but often enough)
I'm also noticing that roleplaying models that are aware of emoji syntax like […] It's like the model is internally perceiving a `:` character, which is in turn predisposing the output toward that emoji syntax.
You should open a new issue so this gets more attention.
I'll wait and see if d9cbf44 has improved the situation. If it remains unfixed, I'll open a new ticket. |
That's a cherry-pick of this PR - are you testing on 1e0e873 (which includes it) or not? |
Edit: Issue was upstream. I can no longer reproduce any of the bugs that I've mentioned. If you are still seeing the issues and are using a third-party UI for llama.cpp, please bring the issue up with the developer and make sure they are aware of the need to tweak their kv management code.
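For context, the "kv management" a client or server needs to get right here boils down to: when a prompt is resubmitted, only the longest common token prefix between the cached sequence and the new request can be reused; everything after that point must be discarded and re-evaluated, or stale cache entries can bleed into the output (e.g. the spurious leading colons). A minimal conceptual sketch of that prefix check, using a hypothetical helper `common_prefix_len` (this is not llama.cpp's actual code):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (for illustration only): returns the length of the
// longest common prefix between the tokens already in the KV cache and the
// tokens of the incoming request. Cache entries at positions >= this value
// must be removed before evaluating the rest of the new prompt.
size_t common_prefix_len(const std::vector<int> &cached,
                         const std::vector<int> &incoming) {
    size_t n = 0;
    while (n < cached.size() && n < incoming.size() &&
           cached[n] == incoming[n]) {
        ++n;
    }
    return n;
}
```

The returned value plays the role of `n_past`: tokens before it are kept, and the cache beyond it is cleared so that re-evaluation starts from a consistent state.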
Unfortunately, I'm starting to see the bad outputs (colons) again on master.
ref #3575
Attempts to fix the reported issue. Not tested.