-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternative for instruct mode (Vicuna support, etc.) #863
Conversation
set_console_color(con_st, CONSOLE_COLOR_USER_INPUT); | ||
fflush(stdout); | ||
break; | ||
} | ||
} | ||
if (!antiprompt.any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary code duplication.
I like this approach! One idea, I think it would be nice if configs had only one option per line. This would make future changes more easy to review. Since you already allow empty lines in the config files, we can group options by separating them with empty lines. No need to specify multiple options on one line. Also it would be nice for review purposes if we used only long options in the config files - i.e. Last point: it's fine that |
While good as an idea, input prefix/suffix and prompt/reverse prompt are not the same thing and should not be put together or it messes with the tokenization.
More info about this in the original --in-prefix PR #426 Basically this means that having prefix/suffix tokens and prefix/suffix strings means that you'd need 4 argument options. e: the functionality you're looking for could be easily implemented by simply replacing the hardcoded to be honest the current implementation of instruct mode ain't perfect and could definitely use a rework but the relation between tokens and strings need to be understood when implementing the change. |
@anzz1 I have tried to update a bit my implementation to revert changes to input-prefix. |
I have added also "multiline" mode, that does not send input, until EOF is sent (Ctrl+D on Linux/Mac, Ctrl+Z on Windows). I was also trying to look into something like Shift+Enter and Ctrl+V to keep multi-lines (it somehow worked for Windows, but was a bit complicated for other plaftorms). In the end, I have dropped it - it is just not worth hacking CLI terminal to support fancy features. It it would be probably best, if llama.cpp would work as server with http and rest api later (possibly with multiple agents running in parallel as it is possible with mmap approach now). |
@@ -25,7 +25,8 @@ static bool is_interacting = false; | |||
#if defined (__unix__) || (defined (__APPLE__) && defined (__MACH__)) || defined (_WIN32) | |||
void sigint_handler(int signo) { | |||
set_console_color(con_st, CONSOLE_COLOR_DEFAULT); | |||
printf("\n"); // this also force flush stdout. | |||
fflush(stdout); | |||
fflush(stderr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not needed, but it seemed better. Before I had sometimes colored terminal after killing llama.cpp. (Not sure if that is fixed by this though)
@@ -147,22 +150,20 @@ int main(int argc, char ** argv) { | |||
} | |||
|
|||
// number of tokens to keep when resetting context | |||
if (params.n_keep < 0 || params.n_keep > (int)embd_inp.size() || params.instruct) { | |||
if (params.n_keep < 0 || params.n_keep > (int)embd_inp.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, user now should explicitly call --keep -1
. This is added to configs, so it should not obstacle for new users.
Changed multiline mode to be toggled by sending |
…ate instruct mode, add stop prompt
Should we split main.cpp into two examples, one basic generation example that only takes a prompt and lets the LLM generate uninterrupted, and one chat example with all of this? I am afraid that main.cpp is becoming too complex and it is not really a good example for getting started with the library anymore. |
I think this is a great idea, but I'd probably save it after this PR is merged. No strong opinions here, though. |
It is really making the tests more easy with the new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
main.cpp
could use some refactoring like moving small parts of the code in utility functions.
Could also make a separate minimal
example just for text generation based on given prompt. But this can be done later
Can I merge it now? Or is there anything other I should address? @ggerganov |
I'm not sure what is going on but I can't seem to use basic interactive mode with a reverse prompt anymore. It refuses to give any output and continuously hands back control to me. Disabling multiline mode doesn't seem to change this. I'm thoroughly confused. This is me repeatedly pressing enter when it keeps giving me back input and I don't know why.
And even without the As a sanity test I compiled with cmake over make and still the same behavior and compile warnings. It seems to give one token of output, then gives me back control. Almost as if it thinks an antiprompt is being detected every token.
|
That's weird, we could probably revert this MR until it's solved. |
If needed, it's OK to revert. I won't be able to look at the examples code soon, so fully counting on the collaborators to assist with this |
Could we break this MR into smaller ones for each feature? A bug with one feature is preventing all the other features from landing. |
That is interesting, the following command worked well for me from the
I did modify
Perhaps a session with |
All of them. Vicuna, Vicuna 1.1, Alpaca, gpt4all, gpt4-x-alpaca, etc. They all dole out a token per line and give back control. However I think I'm on to something. When using the So two issues are here + a unforeseen one.
|
It looks like just a few really wrong shell scripts, in which the author falsely think a shell string
|
--config
(loading arguments from .txt file).--disable-multiline
(allows to input newlines - mode is toggled by sendingEOF
Ctrl+D on Linux/Mac, Ctrl+Z and Return on Windows)-ins
mode as deprecated. Now it could be replaced with user defined--ins-prefix
,--ins-suffix
.--stop-prompt
alternative to--reverse-prompt
- works same, but without adding instruct prefixes and suffixes (useful, when you're talking to agent and you want to force its response to something)--reverse-prompt
and--stop-prompt
, to stop, even when trailing space character is missing (because tokens usually starts with space character, and stop/reverse prompt won't trigger otherwise), enabled with--rm-trailing-space-workaround
There are some code duplicities, some todos and It's overall a bit messy.
But it should be working and could be merged, if you don't mind.