
add --in-prefix-bos to prefix BOS to user inputs; keep EOS #2304

Merged
merged 3 commits into master from prefix-bos on Jul 25, 2023

Conversation

jxy
Contributor

@jxy jxy commented Jul 21, 2023

The BOS precedes the string specified by --in-prefix. A model-generated EOS is now kept in the context.

This provides a way to strictly follow the prompt format used in Llama-2-chat.

The EOS handling also benefits existing finetunes that use EOS to mark the end of a turn.
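
For reference, the multi-turn Llama-2-chat format this is meant to reproduce looks roughly like the following (a sketch; the exact spacing of the official template may differ slightly):

<BOS>[INST] <<SYS>>
{system prompt}
<</SYS>>

{user message 1} [/INST] {model reply 1}<EOS><BOS>[INST] {user message 2} [/INST] ...

With --in-prefix-bos, main adds a BOS before each --in-prefix string, and the model's EOS stays in the context, so follow-up turns in interactive mode line up with this template.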

@jxy
Contributor Author

jxy commented Jul 21, 2023

For llama-2-chat, you want

$ ./main -m "$MODEL" -c 4096 -n -1 \
--in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i -p \
"[INST] <<SYS>>
$SYSTEM
<</SYS>>

$instruct [/INST]"

The spaces in in-prefix/suffix are important.

@jxy jxy mentioned this pull request Jul 21, 2023
@bullno1
Contributor

bullno1 commented Jul 21, 2023

Currently the prompt already has an implicit BOS, right?

@ggerganov
Owner

@bullno1

llama_tokenize() has a bool flag that controls whether to add a BOS or not. In main we set the flag to true when tokenizing the prompt.
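
A quick way to see this (a sketch; $MODEL is a placeholder for your model path) is to run main with --verbose-prompt, which dumps the prompt tokens before generation; for a LLaMA model the dump should begin with token id 1, the BOS that the tokenizer prepends:

./main -m "$MODEL" --verbose-prompt -n 1 -p "Hello"
# the token dump printed at startup begins with token 1 (BOS),
# followed by the tokens for the prompt text itself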

Owner

@ggerganov ggerganov left a comment


The change seems good, but it could use some more testing by more people to be sure we didn't mess up the main loop somehow.

Review comment on examples/common.h (outdated, resolved)
@ghost

ghost commented Jul 21, 2023

For llama-2-chat, you want

$ ./main -m "$MODEL" -c 4096 -n -1 \
--in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i -p \
"[INST] <<SYS>>
$SYSTEM
<</SYS>>

$instruct [/INST]"

The spaces in in-prefix/suffix are important.

I'm trying it, but my command line does not accept the format:
[Screenshot: Screenshot_20230721_111519]

I tried to clean it up, but then the format changed:
[Screenshot: Screenshot_20230721_111551]

I don't know how to force a new line. Also, am I expected to replace $SYSTEM and $instruct, or leave them alone?

Edit: Instead of -p, I used -f and loaded a .txt:

./main -m ~/llama-2-7b-chat.ggmlv3.q4_0.bin -c 2048 -n -1 -t 3 -b 7 --in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i -f ~/storage/downloads/Llama2.txt

main: build = 857 (47031e4)
main: seed  = 1689957008
llama.cpp: loading model from /data/data/com.termux/files/home/llama-2-7b-chat.ggmlv3.q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: freq_base  = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.08 MB
llama_model_load_internal: mem required  = 5287.72 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size  = 1024.00 MB

system_info: n_threads = 3 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
Input prefix with BOS
Input prefix: ' [INST] '
Input suffix: ' [/INST]'
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 7, n_predict = -1, n_keep = 0


== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

 "[INST] <<SYS>>
You're an A.I.
<</SYS>>

Please list 3 movies with Mel Gibson [/INST]"When you say 'Mel Gibson', I immediately think of his iconic roles in these three movies:
1. Mad Max (1979) - In this cult classic, Gibson plays the titular character, a rugged and violent anti-hero who must navigate a post-apocalyptic world filled with danger and mayhem.
2. Lethal Weapon (1987) - In this buddy cop comedy-action film, Gibson stars as Martin Riggs, a reckless and unpredictable detective who teams up with a straight-laced partner (Danny Glover) to take down a drug lord.
3. Braveheart (1995) - In this epic historical drama, Gibson gives a tour de force performance as William Wallace, a Scottish warrior who leads a rebellion against English rule in the late 13th century. The film's intense battle scenes and emotional dramatics have made it a fan favorite for decades."
 [INST] What's 2 fun things to do at the beach?
 [/INST]  Sure! Here are 2 fun things to do at the beach:
1. Build Sandcastles: Building sandcastles is a classic beach activity that can be enjoyed by people of all ages. You can use buckets, shovels, and other tools to create your masterpiece. Don't forget to add some decorations like seashells, rocks, or even small toys to make it more interesting.
2. Go Swimming: Swimming is another popular beach activity that provides a great way to cool off and have fun in the sun. You can swim laps, play games like "Marco Polo" or "Sharks and Minnows," or simply splash around and enjoy the water. Just remember to always swim in designated areas and follow safety guidelines to avoid any accidents.
 [INST]

@jxy
Contributor Author

jxy commented Jul 22, 2023

@JackJollimore you're missing the backslashes \ at the end of the lines in your shell (and somehow got extra newlines), and you have extra quotation marks " in your prompt file.

@arch-btw
Contributor

Unfortunately it doesn't work for me either. It starts asking itself questions and then answers them.

@ghost

ghost commented Jul 22, 2023

you're missing the backslashes \ at the end of the lines in your shell (and somehow got extra newlines), and you have extra quotation marks " in your prompt file.

I overlooked the quotation marks after converting to .txt; thanks for pointing that out. I copy/pasted the command, so if there are extra newlines then the shell added them. Termux ignored everything after the backslashes, so I deleted them to make it work.

I don't have a clear understanding of this PR, so hopefully someone more experienced can test it.

@arch-btw
Contributor

Just following up that it works for me now. I made a few user errors and also ran into the copy/paste problem, but it's all solved now. 👍

@zacps

zacps commented Jul 24, 2023

Another 👍: this works well for me too on llama2-70b-chat (fp16) after rebasing on master.

@lionelchg

lionelchg commented Jul 24, 2023

Works for me as well! I have rewritten it as a small bash script that is copy/paste-able:

#!/bin/bash

# Usage: ./chat.sh models/llama-2-13b-chat.ggmlv3.q4_0.bin system_prompts/translation.txt Hello

# Load the system prompt from the file given as the second argument
SYSTEM_PROMPT=$(cat "$2")

# Run the model with the Llama-2-chat prompt format
./main -m "$1" -c 4096 -n -1 --in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i \
    -p "[INST] <<SYS>>\n$SYSTEM_PROMPT\n<</SYS>>\n\n$3 [/INST]"

with the following content for system_prompts/translation.txt:

Translate every sentence from English into French.

@ggerganov ggerganov merged commit 0c06204 into ggerganov:master Jul 25, 2023
@ggerganov
Owner

@lionelchg Would be a nice contribution to examples - feel free to PR it

@ejones
Collaborator

ejones commented Jul 26, 2023

This is great! FYI, as an additional benefit, this unblocks using --grammar in interactive mode as an alternative to several prompt options like --in-prefix, --in-suffix and --reverse-prompt (except for the unnecessary newline):

$ ./main -m $LLAMA2_13B_Q4_0  -i  \
  --grammar 'root ::= "### RESPONSE: *" [a-z]+ (" " [a-z]+)* "* " [^\r\n]+ "\n### HUMAN: "' \
  -p "### HUMAN: Hello, how are you?
"
...

 ### HUMAN: Hello, how are you?
### RESPONSE: *thinks for a minute* I think I'm fine.
### HUMAN: 
great!
### RESPONSE: *says something else* I have this thing that annoys me.
### HUMAN: 

@ejones ejones mentioned this pull request Jul 26, 2023
lionelchg pushed a commit to lionelchg/llama.cpp that referenced this pull request Jul 26, 2023
Builds on top of PR ggerganov#2304 to create a working script for system
prompt integration with interactive mode.
lionelchg added a commit to lionelchg/llama.cpp that referenced this pull request Jul 26, 2023
Builds on top of PR ggerganov#2304 to create a working script for system
prompt integration with interactive mode.
@ghost

ghost commented Jul 27, 2023

@jxy It appears Llama2 is the only model working as expected since this commit.

Is there something I need to do to get models other than Llama2 to follow their intended prompt structure?

#2417

@pugzly

pugzly commented Jul 27, 2023

Yes, I couldn't get one of my older models (WizardLM) working with the latest llama.cpp until I manually downloaded the repo at the commit prior to this one, and then everything started working as it used to.

@jxy
Contributor Author

jxy commented Jul 28, 2023

Sorry for breaking people's command lines with reverse prompt. Previously, if you specified a reverse prompt and the model generated an EOS, the EOS was replaced by a newline and the first reverse prompt was inserted. That was a bit unintuitive and overlapped with --in-prefix.

Now with this PR, if the model generates EOS, the EOS is kept in the context, and NO reverse prompt is inserted automatically. To have some text prefixed to your input, use --in-prefix as intended. For example, with Vicuna you previously might have used

-r 'USER:' --in-prefix ' '

after this PR, you only need

--in-prefix 'USER: '

because Vicuna is capable of generating EOS. In fact, the latter worked before this PR too. The same goes for WizardLM or any other model that uses EOS to signal the end of its turn.
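
Concretely, a Vicuna-style invocation changes roughly like this (a sketch; the model filename and prompt are placeholders):

# before this PR
./main -m vicuna-13b.ggmlv3.q4_0.bin -i -r 'USER:' --in-prefix ' ' -p "$PROMPT"

# after this PR
./main -m vicuna-13b.ggmlv3.q4_0.bin -i --in-prefix 'USER: ' -p "$PROMPT"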

aragula12 pushed a commit to aragula12/llama.cpp that referenced this pull request Aug 4, 2023
@jxy jxy deleted the prefix-bos branch April 10, 2024 02:46
ajs177 pushed a commit to ajs177/LLAMA-summarizer that referenced this pull request May 10, 2024