Update main's interactive mode to use the chat handshake templates support already available in llama.cpp (and currently only used by server,...) #6795
Conversation
The user needs to pass --chaton TEMPLATE_ID. TEMPLATE_ID will be one of the predefined chat templates already in llama.cpp's llama_chat_apply_template_internal and related code, like chatml, llama2, llama3, ...
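A minimal sketch of how this flag might be wired into common's argument parsing; the gpt_params fields used below (chaton, chaton_template_id) are illustrative names, not necessarily those used by the PR:

```cpp
// Hypothetical addition to gpt_params_parse() in common/common.cpp.
if (arg == "--chaton") {
    if (++i >= argc) {
        invalid_param = true;
        break;
    }
    params.chaton             = true;     // enable chat-handshake-template driven interactive mode
    params.chaton_template_id = argv[i];  // e.g. "chatml", "llama2", "llama3"
}
```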
A helper to return the reverse prompts needed for a given chat template. A wrapper that allows wrapping a given message within a tagged chat template, based on the role and chat template specified.
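Based on that description, the two helpers in chaton.hpp might be declared roughly as follows. The signatures are assumptions (only the names ChatApplyTemplateSimple and llama_chat_reverse_prompt appear in this PR's commit messages), so treat this as a sketch rather than the PR's actual API:

```cpp
// chaton.hpp (sketch) - signatures are approximations of what is described above.
#pragma once

#include <string>
#include <vector>

// Append the reverse prompt(s) needed to detect the end of an assistant turn
// for the given chat template id (e.g. "chatml", "llama2") to the caller's
// vector. Returns false for an unknown template id.
bool llama_chat_reverse_prompt(const std::string & template_id,
                               std::vector<std::string> & reverse_prompts);

// Wrap a single message with the tags of the specified chat template, based on
// the given role ("system", "user", "assistant"). Returns false if the
// template id is not known to llama_chat_apply_template_internal.
bool ChatApplyTemplateSimple(const std::string & template_id,
                             const std::string & role,
                             const std::string & content,
                             std::string & tagged,
                             bool add_assistant_prefix);
```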
Glanced through the existing interactive and chatml flow to incorporate this flow. Need to look deeper later. NOTE: up to this point this is a reapplication of my initial go at chaton, simplifying the amount of change made to the existing code a bit more.
This is a commit with debug messages. ChatApplyTemplateSimple: it wasn't handling unknown template ids properly; this is identified now and a warning is logged, rather than trying to work with a length of -1 (need to change this to quit later). Also avoid wrapping in a vector, as only a single message can be tagged with respect to the chat handshake template. ReversePrompt: add support for llama2.
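A hedged sketch of the behaviour this commit describes, written as a possible body for the ChatApplyTemplateSimple helper sketched above: a single llama_chat_message is passed directly (no std::vector wrapper), and a negative length from the public llama_chat_apply_template API is treated as an unknown template id and reported instead of being used. Names and details are illustrative, not the PR's exact code:

```cpp
// (assumes llama.h, <cstdio>, <string>, <vector> are included)
bool ChatApplyTemplateSimple(const std::string & template_id,
                             const std::string & role,
                             const std::string & content,
                             std::string & tagged,
                             bool add_assistant_prefix) {
    // Only one message gets tagged per call, so pass it directly.
    llama_chat_message msg = { role.c_str(), content.c_str() };

    std::vector<char> buf(content.size() * 2 + 256);
    int32_t len = llama_chat_apply_template(nullptr, template_id.c_str(), &msg, 1,
                                            add_assistant_prefix,
                                            buf.data(), (int32_t) buf.size());
    if (len < 0) {
        // Unknown/unsupported template id: warn and fail instead of
        // continuing with a length of -1.
        fprintf(stderr, "%s: unknown chat template id '%s'\n", __func__, template_id.c_str());
        return false;
    }
    if (len > (int32_t) buf.size()) {
        // Buffer was too small; resize to the reported length and re-apply.
        buf.resize(len);
        len = llama_chat_apply_template(nullptr, template_id.c_str(), &msg, 1,
                                        add_assistant_prefix,
                                        buf.data(), (int32_t) buf.size());
    }
    tagged.assign(buf.data(), len);
    return true;
}
```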
Clean up the associated log messages. Don't overload the return value for status as well as data: any data returned is now kept independent of the status of the operation. On failure, log a message and exit.
Avoid the use of a separate vector which is then copied into the main vector on return. Now the main reverse prompt vector is passed directly and entries are added to it in place. Also keep data and return status separate, explicitly identify an unknown template_id situation, and return a failure status.
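Along those lines, a minimal sketch of the reverse-prompt helper that appends directly into the caller's vector and keeps the status separate from the data; the actual reverse-prompt strings used by the PR are not shown in this conversation, so the ones below are plausible guesses:

```cpp
// Sketch: add reverse prompt(s) for the given template id directly into the
// vector passed by the caller; return a separate success/failure status.
// (assumes <cstdio>, <string>, <vector> are included)
bool llama_chat_reverse_prompt(const std::string & template_id,
                               std::vector<std::string> & reverse_prompts) {
    if (template_id == "chatml") {
        reverse_prompts.push_back("<|im_start|>user\n");  // guessed marker
        return true;
    }
    if (template_id == "llama2") {
        reverse_prompts.push_back("[INST] ");             // guessed marker
        return true;
    }
    // Explicitly identify an unknown template id and fail, instead of
    // overloading the returned data with the status.
    fprintf(stderr, "%s: unknown chat template id '%s'\n", __func__, template_id.c_str());
    return false;
}
```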
Adding the attached patch to this PR allows me to also chat with llama3, using main -i --chaton llama3.
This sounds like an excellent and much-needed addition to main.
I've done detailed research on the same subject, so I strongly recommend referring to this issue: #6391. Also, a new function named
In interactive mode (i.e. -i), any prompt file (-f) or prompt (-p) passed using the command line arguments is treated as a system prompt, and in turn this PR formats it to match the expected system prompt template.
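A hedged sketch of what that could look like in examples/main/main.cpp, reusing the wrapping helper sketched earlier; params.chaton and params.chaton_template_id are illustrative field names, not necessarily what the PR uses:

```cpp
// Sketch: in interactive chaton mode, treat the -p/-f text as the system
// message and wrap it in the template's system tags before it is tokenized.
std::string prompt = params.prompt;  // contents of -p or -f
if (params.chaton) {
    std::string tagged;
    if (!ChatApplyTemplateSimple(params.chaton_template_id, "system", prompt,
                                 tagged, /*add_assistant_prefix=*/false)) {
        fprintf(stderr, "error: unsupported chat template id '%s'\n",
                params.chaton_template_id.c_str());
        exit(1);  // on failure log a message and exit, as described above
    }
    prompt = tagged;  // the tagged text becomes the initial context
}
```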
Here's the patch running llama3 with --verbose-prompt. I think there might be too many new lines?
Without --verbose-prompt:
There is a new PR, which is again an experiment, that tries to use a simple-minded JSON file to drive the logic, so that many aspects can be controlled by editing the JSON file rather than needing to update the code.
Currently, the interactive mode of main doesn't add any tags identifying system or user messages to the model by default.
One will have to either
This PR tries to add a generic chat mode to main, which can make use of any of the chat templates already added to llama_chat_apply_template_internal, which is currently used by the server logic but not by the main logic.
To help with this, a new chaton.hpp file is added to common, which contains
To add new chat handshake templates, remember to add the needed logic to
To use this support, pass -i and --chaton TEMPLATE_ID to main.
Currently the supported templates are chatml and llama2; for other chat handshake template standards already supported by llama_chat_apply_template_internal, suitable reverse prompts need to be added to llama_chat_reverse_prompt.
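For instance, extending llama_chat_reverse_prompt to cover llama3 (roughly what the patch attached above enables for main -i --chaton llama3) might look like the following; the choice of <|eot_id|> as the marker is an assumption based on llama3's end-of-turn token, not taken from the patch itself:

```cpp
// Sketch: one more branch in llama_chat_reverse_prompt for llama3.
if (template_id == "llama3") {
    // llama3 terminates each turn with <|eot_id|>, so that text can serve as
    // the reverse prompt that hands control back to the user.
    reverse_prompts.push_back("<|eot_id|>");
    return true;
}
```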