How does `-ins`/Instruct/chat mode remember context? How do I feed it back properly as plain text? #1651
It seems like it inserts the beginning-of-string token in front of each instruction. It probably depends on the model, but I think it affects the outcome a lot.
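To make this concrete, here is a minimal sketch of how one could compare the two token streams, assuming the llama-cpp-python bindings (my assumption; the thread itself only uses main.exe):

```python
# Minimal sketch: compare tokenization with and without the
# beginning-of-string (BOS) token, using llama-cpp-python.
from llama_cpp import Llama

# vocab_only loads just the tokenizer, not the model weights.
llm = Llama(model_path="WizardLM-7B-uncensored.ggml.q5_1.bin", vocab_only=True)

text = b"\n\n### Instruction:\n\nWhat color is the sky?"
with_bos = llm.tokenize(text, add_bos=True)
without_bos = llm.tokenize(text, add_bos=False)

# For LLaMA-family models BOS is token id 1, so with_bos should start
# with a leading 1 that without_bos lacks.
print(with_bos[:5])
print(without_bos[:5])
```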
I've noticed the same issue, and in my experience `-ins` sets the default `n_keep` to 2 even if `--keep` is set to -1 or 0. `-p` does not have this problem.
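For context on why `n_keep` matters: once the conversation overflows the context window, main.exe drops part of the history, keeping the first `n_keep` tokens plus roughly the most recent half of the rest. A rough model of that swap as I understand it from main.cpp (the exact arithmetic is an assumption, not checked against this build):

```python
def swap_context(tokens: list[int], n_ctx: int, n_keep: int) -> list[int]:
    """Rough model of main.exe's context swap: keep the first n_keep
    tokens, then the most recent half of everything that follows."""
    if len(tokens) <= n_ctx:
        return tokens
    head = tokens[:n_keep]
    rest = tokens[n_keep:]
    return head + rest[len(rest) // 2:]

# With n_keep forced to 2, only two tokens of the original preamble
# survive a swap -- the rest of the early conversation is discarded.
```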
This issue was closed because it has been inactive for 14 days since being marked as stale.
I had a feeling that some models are good at keeping context in a chat/instruct conversation, while other models are bad at it. To be sure that this is a property of the model and not of llama.cpp's chat mode in main.exe, I tested the same instruction template in koboldcpp. To my surprise, it followed the conversation much better than my attempts in llama.cpp.

I tried to find a clear example where `-ins` breaks, but it was hard. It is not that the model has no memory at all; it looks more like it does not "respect" the history much, and is more likely to treat the next instruction as a separate, unrelated question. Then I tried to put my dialogue into a file, pass it with `-f in.txt`, and continue it from there with main.exe. It worked well! But it did not yield the same results as chat/instruct mode, even with zero temperature and the same seed.

Example command (version `llama-master-66874d4-bin-win-avx2-x64`):

```
main.exe -t 8 -c 2048 --temp 0 --top-p 0 --top-k 1 --seed 1 -m WizardLM-7B-uncensored.ggml.q5_1.bin --ins
```

(It looks like `-i`, `--keep -1`, and `-r "###"` did not improve anything.)

In the chat that shows the issue, the model seems to forget information from two prompts back (but often remembers information from the directly preceding instruction).
When I tried to run

```
main.exe -t 8 -c 2048 --temp 0 --top-p 0 --top-k 1 --seed 1 -m WizardLM-7B-uncensored.ggml.q5_1.bin -f in.txt
```

it gave a completely different (and somewhat incorrect) response. That was with newlines at the beginning of the document; without them the response differs, and it differs again with no linefeeds at all.
Anyway, if I craft my initial chat as a document, it spits back a large list. The list quickly became incorrect, but at least the model understood the question and used the conversation history in its answer, which is often not the case in `-ins` mode. However, Instruct chats clearly have memory too: if I omit "of them" from the question, the answer changes.
But so does the `-f in.txt` version. Still, chat mode has a much stronger tendency to not respect the conversation history!
I cannot prove it, especially because it differs between models. But if anybody could explain what exactly I should type in the raw prompt to get exactly the same answers as with `-ins`, I could experiment further, conclude why this happens, and determine whether chat mode is buggy or not.
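As a possible starting point for that experiment: from my reading of examples/main/main.cpp around that time (an assumption on my part, not a verified answer), `-ins` wraps each user turn in an Alpaca-style prefix/suffix pair and tokenizes the prefix with a beginning-of-string token. Here is a sketch of assembling an equivalent raw prompt for `-f in.txt` (the exact template strings and the example turns are assumptions):

```python
# Sketch of the prompt that -ins mode appears to build internally.
INSTRUCTION_PREFIX = "\n\n### Instruction:\n\n"  # reportedly tokenized with BOS
INSTRUCTION_SUFFIX = "\n\n### Response:\n\n"

def build_raw_prompt(turns: list[tuple[str, str]], next_instruction: str) -> str:
    """Assemble an in.txt-style prompt that mimics an -ins conversation."""
    prompt = ""
    for instruction, response in turns:
        prompt += INSTRUCTION_PREFIX + instruction + INSTRUCTION_SUFFIX + response
    return prompt + INSTRUCTION_PREFIX + next_instruction + INSTRUCTION_SUFFIX

# Write the result to in.txt and pass it to main.exe with -f in.txt.
history = [("Name three colors.", "Red, green, and blue.")]
print(build_raw_prompt(history, "Which of them is the color of the sky?"))
```

Note that if the prefix really is tokenized with a BOS token on every turn, that token cannot be typed into a plain-text file at all, which by itself could explain why `-f in.txt` never reproduces the `-ins` output exactly, even at zero temperature with the same seed.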