[enhancement] reseed random number when loading from cache and --seed provided #1439

Closed
d-takemori opened this issue May 13, 2023 · 1 comment · Fixed by #1550

Current Behavior

When loading from a saved --prompt-cache file, llama.cpp appears to initialize response generation with the same state that was saved with the cache file. This yields the same output on every run, even when a different seed is used.

Initial run

% ./main --model ../../Models/LLaMA/7B/ggml-model-q5_1.bin --prompt-cache test-cachefile -n 36 --color -p 'What happens with a test prompt?'
main: build = 537 (6456a4e)
main: seed = 1684011989
[snip]
main: attempting to load saved session from 'test-cachefile'
main: session file does not exist, will create
[snip]
What happens with a test prompt?
A test prompt is an assignment that helps you find out whether something works.
This can be anything from finding out if your code does what it should do or if the data

Next run

% ./main --model ../../Models/LLaMA/7B/ggml-model-q5_1.bin --prompt-cache test-cachefile -n 36 --color -p 'What happens with a test prompt?'
main: build = 537 (6456a4e)
main: seed = 1684012061
[snip]
main: attempting to load saved session from 'test-cachefile'
main: loaded a session with prompt size of 8 tokens
main: session file has exact match for prompt!
[snip]
What happens with a test prompt?
A test prompt is an assignment that helps you find out whether something works.
This can be anything from finding out if your code does what it should do or if the data

Another run, but with explicit --seed

% ./main --model ../../Models/LLaMA/7B/ggml-model-q5_1.bin --prompt-cache test-cachefile -n 36 --color -p 'What happens with a test prompt?' --seed 1234
main: build = 537 (6456a4e)
main: seed = 1234
[snip]
main: attempting to load saved session from 'test-cachefile'
main: loaded a session with prompt size of 8 tokens
main: session file has exact match for prompt!
[snip]
What happens with a test prompt?
A test prompt is an assignment that helps you find out whether something works.
This can be anything from finding out if your code does what it should do or if the data

Desired Behavior

If loading from a cache file and an explicit "--seed #" is given, llama.cpp should reinitialize the random number generator with that seed before response generation, so that different responses are generated.

If loading from a cache file and "--seed #" is not given, preserve the current behavior.
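A minimal sketch of that check (not the actual change merged in #1550) could reseed the context right after the session file is loaded. This assumes the llama_load_session_file() / llama_set_rng_seed() / llama_n_ctx() APIs from llama.h of that era, and that a seed of -1 means --seed was not given on the command line:

#include <string>
#include <vector>
#include "llama.h"

// Load the prompt cache, then re-apply an explicit --seed so sampling
// diverges from the RNG state stored in the cache file.
static void load_prompt_cache_and_reseed(llama_context * ctx,
                                         const std::string & path_session,
                                         std::vector<llama_token> & session_tokens,
                                         int seed) {
    session_tokens.resize(llama_n_ctx(ctx));
    size_t n_token_count_out = 0;

    if (llama_load_session_file(ctx, path_session.c_str(),
                                session_tokens.data(), session_tokens.size(),
                                &n_token_count_out)) {
        session_tokens.resize(n_token_count_out);

        if (seed != -1) {
            // Explicit --seed: reinitialize the RNG so output differs per run.
            llama_set_rng_seed(ctx, seed);
        }
        // Otherwise keep the RNG state restored from the session (current behavior).
    } else {
        session_tokens.clear();  // no usable session; start fresh
    }
}

With something like this, the first two runs above would keep reproducing the cached continuation, while the run with --seed 1234 would produce a different one.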


aleksusklim commented May 24, 2023

I posted a comment about another strange behavior of prompt caching:
#1550 (comment)

Update: reposted it as a separate issue: #1585
