Regression in interactive mode #2507
Comments
I don't use v2 models because llama.cpp does not work as expected with them. I reverted to an older commit with Wizard-Vicuna-7B.ggmlv3.q4_0.bin and the problems are gone. Related: #2417
@JackJollimore Thanks for pointing me to the previous comments on the change. I forked and reverted the input-bos change, which resolves the issue for me: https://github.com/aragula12/llama.cpp
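The input-bos change being reverted boils down to where BOS tokens end up in the context. A toy sketch of the difference (the tokenizer and token ids here are invented placeholders, not llama.cpp's actual vocabulary or code):

```python
BOS = 1  # placeholder beginning-of-sequence id, for illustration only

def tokenize(text):
    # Toy tokenizer: one token per word, with ids offset past BOS.
    return [2 + i for i, _ in enumerate(text.split())]

def build_context(turns, bos_every_input):
    """Concatenate tokenized turns as the model would see them.

    bos_every_input=True mimics prepending BOS to each interactive
    input; False mimics the older behavior of a single BOS at the
    very start of the session.
    """
    ctx = [BOS]
    for turn in turns:
        if bos_every_input and ctx != [BOS]:
            ctx.append(BOS)
        ctx.extend(tokenize(turn))
    return ctx

old = build_context(["Hello", "How are you?"], bos_every_input=False)
new = build_context(["Hello", "How are you?"], bos_every_input=True)
# The newer behavior injects an extra BOS mid-conversation, which some
# fine-tunes can interpret as "a fresh document starts here".
print(old.count(BOS), new.count(BOS))  # 1 2
```

Under this sketch the extra mid-conversation BOS is the only difference between the two behaviors, which would explain why reverting the commit changes interactive-mode output.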
@aragula12 You need
@aragula12 Awesome! I tried it out and it's working as expected.
More testing with, This makes a conversation impossible as
@JackJollimore Do you have an example, preferably with
@jxy Sure, it's reproducible with many models. Here are 3 examples. Here's the content of Vic.txt:
Example #1: I expect llama.cpp to stop and let me input after,
Example #2: No chance to type until Ctrl + C with
Example #3: without
llama.cpp is inconsistent.
Vicuna uses EOS to signal the end of a turn, so you should not use. Vicuna uses USER and ASSISTANT; its template is here: https://github.com/lm-sys/FastChat/blob/3dc91c522e1ed82b6f24cb9866d8d9c06ff28d7b/docs/vicuna_weights_version.md?plain=1#L25-L33
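If EOS marks the end of a turn rather than the end of the session, an interactive loop should hand control back to the user when it appears instead of terminating. A minimal sketch of that idea (the token ids and the scripted model output are invented; this is not llama.cpp's generation loop):

```python
EOS = 2  # placeholder end-of-sequence id

def chat_loop(model_tokens, user_inputs):
    """Consume scripted model output; on EOS, return control to the user.

    model_tokens: iterable of token ids the 'model' emits.
    user_inputs: scripted user replies (a stand-in for stdin).
    Returns a transcript of (who, what) events for inspection.
    """
    transcript = []
    users = iter(user_inputs)
    for tok in model_tokens:
        if tok == EOS:
            # Vicuna-style: EOS means end of *turn*, not end of session.
            reply = next(users, None)
            if reply is None:
                break  # no more user input; end the session
            transcript.append(("user", reply))
        else:
            transcript.append(("model", tok))
    return transcript

events = chat_loop([10, 11, EOS, 12, EOS], ["thanks"])
```

Here the first EOS yields to the scripted user, generation resumes, and the session only ends when there is no further input, which is roughly what one would want interactive mode to do with an EOS-terminated template.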
To clarify, it's my error because of casing, i.e. Assuming that's true, then it's still worse, because I don't refer to myself or the model as,
Edit: There's no way to use a model like Vicuna without calling myself, Llama.cpp went from "generally follow a prompt template" to "use an exact prompt template or else". How dare I change, Oh well!
@JackJollimore Use
Thank you @jxy, but that's very confusing, as 7 days ago you said the EXACT opposite: #2507 (comment). Now I'm supposed to use, Here's another example:
The assistant generated an endless stream of spaces, never stopping, so I had to Ctrl + C.
This issue was closed because it has been inactive for 14 days since being marked as stale.
I am experiencing a change in llama.cpp behavior due to 0c06204 by @jxy.
Llama stops producing output abruptly. Often it drops into prompt mode without producing any output, and sometimes it outputs only a few lines.
Prior to this change I used to get several paragraphs of output.
Command-line:
./main --top_k 0 --top_p 0.73 --color --multiline-input -i -n -1 --repeat-last-n -1 --no-penalize-nl --keep -1 --temp 1.7 --interactive-first -c 4096 -m chronos-13b-v2.ggmlv3.q8_0.bin
Sample Input Text:
Populations rarely (if ever) exist in isolation.
In reality, the growth rate of a given population depends not only on itself, but also on other populations that it interacts with either directly or indirectly.
Such interactions lead to a range of ecological relationships, including competition for resources, predation, mutualism, parasitism and more besides