Feature Description

Use `llama_chat_apply_template` on `/completion` or a new endpoint (e.g. `/chat`), in addition to the current OpenAI compatibility endpoints. Update the WebUI to reflect the change.

Motivation
The OpenAI compatibility endpoints are nice and all, but the native endpoints offer functionality specific to llama.cpp (e.g. mirostat, slot management, etc.). One exception is automatically applying chat templates, which was introduced to the OpenAI compatibility endpoints in #5593, while the native endpoint (`/completion`) still uses the old prompt/antiprompt formatting method and requires the user to provide correctly formatted prompts. This is especially a problem for WebUI users, and there have been many issue and discussion threads about worse-than-expected chat results due to incorrect templates. It would thus be great to introduce server-side support for chat templates on the native endpoints.

Related: #5447
Refs on antiprompts being old and obsolete: #6378 (review), #6391 (comment)
Possible Implementation
`/completion` works well as a text completion endpoint (?), so in order not to break things too much, maybe we can consider adding a new endpoint (`/chat`) with the changes. The WebUI chat page should then use the new endpoint instead. The "Prompt template" and "Chat history template" options would thus be obsolete, and could be removed or moved under "More options".
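For illustration only (this is not llama.cpp code): a minimal Python sketch of what "applying a chat template server-side" means, using the well-known ChatML format as the example template. The hypothetical `/chat` endpoint would accept an OpenAI-style `messages` array and build the formatted prompt internally, much as `llama_chat_apply_template` already does for the OpenAI compatibility endpoints.

```python
# Illustrative sketch only. llama.cpp performs this step in C via
# llama_chat_apply_template, selecting the template embedded in the
# model's metadata; ChatML is hard-coded here just to show the idea.

def apply_chatml_template(messages, add_assistant_prompt=True):
    """Format OpenAI-style messages into a single ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_assistant_prompt:
        # Open the assistant turn so generation continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(apply_chatml_template(messages))
```

With server-side templating, a WebUI user would only ever send the `messages` structure; picking and applying the correct template (ChatML, Llama-2, etc.) becomes the server's job instead of the client's.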