Server: use llama_chat_apply_template #5593
Conversation
examples/server/server.cpp
Outdated
@@ -2390,12 +2391,13 @@ static void server_params_parse(int argc, char **argv, server_params &sparams,
        break;
    }
    std::string value(argv[i]);
`value` seems unused now?
Yeah, it's unused; I forgot to remove it. It's now removed.
examples/server/utils.hpp
Outdated
std::ostringstream output;
bool is_inside_turn = false;
// Check if the template supplied via "--chat-template" is supported or not. Returns true if it's valid
inline bool verify_custom_template(std::string tmpl) {
Suggested change:
- inline bool verify_custom_template(std::string tmpl) {
+ inline bool verify_custom_template(const std::string & tmpl) {
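For context, a minimal sketch of what this check can look like, assuming the PR validates templates by probing `llama_chat_apply_template` with a dummy message (a sketch, not necessarily the exact implementation in `examples/server/utils.hpp`):

```cpp
#include <string>
#include <vector>
#include "llama.h"

// Sketch only: try to apply the template to a single dummy message.
// Passing a null model means only the supplied template string is used;
// a negative return value indicates the template is not supported.
inline bool verify_custom_template(const std::string & tmpl) {
    llama_chat_message chat[] = {{"user", "test"}};
    std::vector<char> buf(64);
    int32_t res = llama_chat_apply_template(
        nullptr, tmpl.c_str(), chat, 1, true, buf.data(), buf.size());
    return res >= 0;
}
```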
examples/server/utils.hpp
Outdated
for (size_t i = 0; i < messages.size(); ++i) {
    auto & curr_msg = messages[i];
    str[i] = json_value(curr_msg, "role", std::string(""));
    str[i + 1] = json_value(curr_msg, "content", std::string(""));
    alloc_size += str[i + 1].length();
    chat[i].role = str[i].c_str();
    chat[i].content = str[i + 1].c_str();
There seems to be a bug here. Maybe change to `str[2*i + 0] = ...` and `str[2*i + 1] = ...`
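Applied to the snippet above, the suggested indexing would look roughly like this (a sketch of the suggestion, not the exact commit):

```cpp
// Each message owns two slots in `str`: role at 2*i + 0, content at 2*i + 1.
// Keeping the strings in `str` alive matters because `chat` stores raw
// pointers into them via c_str().
for (size_t i = 0; i < messages.size(); ++i) {
    auto & curr_msg = messages[i];
    str[2*i + 0] = json_value(curr_msg, "role",    std::string(""));
    str[2*i + 1] = json_value(curr_msg, "content", std::string(""));
    alloc_size  += str[2*i + 1].length();
    chat[i].role    = str[2*i + 0].c_str();
    chat[i].content = str[2*i + 1].c_str();
}
```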
Thanks for noticing that. That's why the bot's response was quite weird when I tested this PR yesterday.
Fixed in c53b34d
Looking at the debug log, I can confirm that the formatted chat is correct:
{"timestamp":1708423580,"level":"VERBOSE","function":"format_chat","line":208,"message":"formatted_chat","text":"<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nhi, how are you<|im_end|>\n<|im_start|>assistant\n"}
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
I still spot a weird bug where the chat is formatted correctly, but then …

Edit: found it! I forgot to … Works fine now (tested with both the chatml and llama2 templates).
Since this is a breaking change, it'd be good to update the server README to mention that the `--chat-template` parameter behaves differently now. Also, I found the following message vague. What are the "common" templates?
@ibehnam Yeah, I forgot about the doc. You're right; in fact, I was thinking about how to make it clear which templates we support when showing this help, but the problem is that it depends on `llama_chat_apply_template`. My idea is that we can add a section to the server's doc that shows how to use the `--chat-template` argument.
* server: use llama_chat_apply_template
* server: remove trailing space
* server: fix format_chat
* server: fix help message

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server: fix formatted_chat

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Closes #5575
This PR changes the `--chat-template` parameter introduced in #5425: it now accepts a Jinja template instead of a type name. If `--chat-template` is not specified, the default template (taken from the model metadata) is used instead. This PR also fixes an issue where `llama_chat_apply_template` does not read the metadata correctly.

CC @ggerganov and @cebtenzzre for review. Thank you!
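For reference, a rough sketch of the call pattern the server can use (names like `model`, `tmpl`, `chat`, and `alloc_size` are placeholders; passing `tmpl == nullptr` selects the template stored in the model metadata):

```cpp
// Sketch: format a conversation with llama_chat_apply_template.
// The function returns the number of bytes required for the formatted
// prompt; if that exceeds the buffer size, grow the buffer and retry.
std::vector<char> buf(alloc_size);
int32_t res = llama_chat_apply_template(model, tmpl,
                                        chat.data(), chat.size(),
                                        /*add_ass=*/true,
                                        buf.data(), buf.size());
if (res > (int32_t) buf.size()) {
    buf.resize(res);
    res = llama_chat_apply_template(model, tmpl,
                                    chat.data(), chat.size(),
                                    /*add_ass=*/true,
                                    buf.data(), buf.size());
}
std::string formatted_chat(buf.data(), res);
```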