feat: '--in-prefix STRING' option #426

Merged: 1 commit into master on Mar 25, 2023

Conversation

@anzz1 (Contributor) commented Mar 23, 2023

The --in-prefix STRING command line option prefixes user inputs with STRING.

For example, when chatting with Bob:
./main -m ./models/llama-13B-ggml/ggml-model-q4_0.bin -n 256 --repeat_penalty 1.0 -f ./prompts/chat-with-bob.txt -i -r "User:" --in-prefix " "
adds a space after the reverse prompt "User:".

So instead of

Bob: How can I help you?
User:_

it's

Bob: How can I help you?
User: _

which matches the original prompt better.

It could be useful for other prompts too, e.g. alignment, or testing multiple similar questions like "What do you think about X".
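
For reference, here is a minimal standalone sketch of the idea (the names read_user_input, input_prefix and buffer are illustrative, not the actual main.cpp code): the prefix string is appended to the input buffer and echoed before the user's typed text, so the combined string is what later gets tokenized.

```cpp
// Hedged sketch only: illustrates what --in-prefix does conceptually,
// not the actual llama.cpp implementation.
#include <cstdio>
#include <iostream>
#include <string>

// Build one user turn: inject the prefix first, then append what the user types.
std::string read_user_input(const std::string & input_prefix) {
    std::string buffer;
    if (!input_prefix.empty()) {
        buffer += input_prefix;             // prepend the --in-prefix string
        std::printf("%s", buffer.c_str());  // echo it so the console matches the context
        std::fflush(stdout);
    }
    std::string line;
    std::getline(std::cin, line);
    buffer += line;
    return buffer;                          // e.g. " " + "Hey" -> " Hey"
}

int main() {
    // --in-prefix " " as in the example above
    std::string turn = read_user_input(" ");
    std::printf("\ntext appended to the context: \"%s\"\n", turn.c_str());
    return 0;
}
```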

Prefix user inputs with a string
@Green-Sky (Collaborator) left a comment

Personally I don't see much value in this change.
For your case specifically you could just add the space to the reverse prompt: "Bob:" -> "Bob: "

@x02Sylvie

Personally I don't see much value in this change. For your case specifically you could just add the space to the reverse prompt: "Bob:" -> "Bob: "

Reverse prompt with extra space seems to not work for me at least; llama.cpp goes on as if there was no reverse prompt in that case.

@Green-Sky (Collaborator)

Reverse prompt with extra space seems to not work for me at least; llama.cpp goes on as if there was no reverse prompt in that case.

that sounds like a bug

@anzz1 (Contributor, Author) commented Mar 23, 2023

Reverse prompt with extra space seems to not work for me at least; llama.cpp goes on as if there was no reverse prompt in that case.

that sounds like a bug

It's because the reverse prompt check only tests the last output. "User:" and " " are two different tokens, so it doesn't work. Not sure if it should be changed, though.

In any case there is separate value in this: you would not want to change -r "User:" to -r "User: " even if it worked, because then you would be inserting two tokens, "User:" and " ", as the reverse prompt. You do not want that; you want the space to be part of the user input, because then if the user types "Hey" the tokens can be "User:" and " Hey", or "User:" and " " and "Hey". Forcing a space token there would remove one of those options. This is an important distinction, and that is why you want -r "User:" and --in-prefix " " as separate parameters.

@Green-Sky (Collaborator)

It's because the reverse prompt check only tests the last output. "User:" and " " are two different tokens, so it doesn't work. Not sure if it should be changed, though.

Actually I think it is because the space is part of the next token, so there is no trailing space to catch...

@anzz1 (Contributor, Author) commented Mar 24, 2023

Actually I think it is because the space is part of the next token, so there is no trailing space to catch...

True, the space after "User:" can either be a token of its own or part of the next token. The reverse prompt code should be fixed to check more than just the last output, so that it can match even when the reverse prompt spans multiple tokens.

Also noticed another issue with it at main.cpp#L435: the antiprompt check should be wrapped in an antiprompt.empty() check, as it currently runs even when no reverse prompt is used.
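
To make that concrete, here is a minimal standalone sketch of both points, assuming the recently generated tokens have already been detokenized into a string; the function and variable names are hypothetical, not the actual main.cpp symbols:

```cpp
// Hedged sketch, not llama.cpp code: a string-level reverse prompt check
// over the recent output, skipped entirely when no reverse prompt is set.
#include <cstdio>
#include <string>
#include <vector>

bool hit_antiprompt(const std::string & recent_output,
                    const std::vector<std::string> & antiprompts) {
    // Guard mentioned above: do nothing if no reverse prompt was configured.
    if (antiprompts.empty()) {
        return false;
    }
    for (const std::string & ap : antiprompts) {
        // Does the accumulated text end with this reverse prompt?
        // Working on the string means token boundaries no longer matter.
        if (recent_output.size() >= ap.size() &&
            recent_output.compare(recent_output.size() - ap.size(), ap.size(), ap) == 0) {
            return true;
        }
    }
    return false;
}

int main() {
    std::vector<std::string> antiprompts = { "User:" };
    // "User" and ":" could have arrived as separate tokens; the check still matches.
    std::printf("%d\n", hit_antiprompt("Bob: Hi there.\nUser:", antiprompts)); // 1
    std::printf("%d\n", hit_antiprompt("Bob: Hi there.",        antiprompts)); // 0
    return 0;
}
```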

Anyway, we are getting derailed here; the point I'm trying to make is that this functionality is not tied to the reverse prompt, that was just a usage example.

This simply pre-injects arbitrary text into each user input, which can be used to build various new interactions. It can be used with or without reverse prompts.

@blackhole89 (Contributor)

@anzz1 The reverse prompt can span multiple tokens. However, there is no way for it to interrupt generation mid-token. (That's why I opted to use token vectors rather than strings for reverse prompts when I first wrote interactive mode) Therefore, the best thing you could hope for is that if, say, the generation outputs tokens amounting to "User:", " Hello", control is passed to the user after that because "User: " was in the output. You will not, however, be able to get rid of the "Hello". This would only be possible if we had some means to roll back generation (which is bound to be computationally expensive either way).

The PR's idea seems like the best we can do to me if we want to simultaneously (1) always correctly detect when reverse prompts of the form "Name:" are emitted, (2) not force the user to have to enter the space after that manually, (3) not have the reverse prompt be followed by one model-imposed word like the "Hello" in the above example, and (4) not implement generation rollback.

@anzz1 (Contributor, Author) commented Mar 25, 2023

The reverse prompt can span multiple tokens. However, there is no way for it to interrupt generation mid-token. (That's why I opted to use token vectors rather than strings for reverse prompts when I first wrote interactive mode) Therefore, the best thing you could hope for is that if, say, the generation outputs tokens amounting to "User:", " Hello", control is passed to the user after that because "User: " was in the output. You will not, however, be able to get rid of the "Hello". This would only be possible if we had some means to roll back generation (which is bound to be computationally expensive either way).

Thanks for putting how it works into words better than I could. Yes, implementing rollback doesn't pass a cost-benefit analysis.

However, it might be a good idea to put on the backlog scanning the text output between the last interaction and now (not only since the last token) after each generated token, to see whether the reverse prompt has appeared; the computation required is insignificant. So, like you said, -r "User: " would stop generation after "User:", " whatever" instead of going on like it does now. This would need an additional text buffer added to the state, and I'm not sure it's worth adding such a thing, at least for now. After all, we're trying to keep things lean here, right? :)

The PR's idea seems like the best we can do to me if we want to simultaneously (1) always correctly detect when reverse prompts of the form "Name:" are emitted, (2) not force the user to have to enter the space after that manually, (3) not have the reverse prompt be followed by one model-imposed word like the "Hello" in the above example, and (4) not implement generation rollback.

My communication on what this aims to achieve was less than stellar. This is exactly what I was going for; I just couldn't put it into words properly. You can add whatever you want to the output at basically zero cost.

In the future, with a sliding context window (and the infinite generation that can come with it), it could be great for testing things like "Please continue" by just mashing enter, without having to type it out.

anzz1 merged commit fbd4d38 into master on Mar 25, 2023
anzz1 deleted the feat-input-prefix branch on March 25, 2023 at 12:03
@Green-Sky (Collaborator) commented Mar 25, 2023

Yes, implementing rollback doesn't pass cost-benefit analysis.

should be "pretty cheap". you just need to track the tokenindex for each char.
