
server: Use llama_chat_apply_template on /completion endpoint #6624

Closed
EZForever opened this issue Apr 12, 2024 · 3 comments
EZForever commented Apr 12, 2024

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

Use llama_chat_apply_template on /completion or a new endpoint (e.g. /chat), in addition to the current OpenAI compatibility endpoints. Update WebUI to reflect the change.

Motivation

The OpenAI compatibility endpoints are nice, but the native endpoints offer functionality specific to llama.cpp (e.g. mirostat, slot management, etc.). One exception is the automatic application of chat templates, which was introduced for the OpenAI compatibility endpoints in #5593, while the native endpoint (/completion) still uses the old prompt/antiprompt formatting method and requires the user to provide a correctly formatted prompt. This is especially a problem for WebUI users, and there have been many issue and discussion threads about worse-than-expected chat results caused by incorrect templates. It would thus be great to introduce server-side support for chat templates on the native endpoints.
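To illustrate the burden on clients, here is a minimal sketch of the formatting a /completion caller must do by hand today, assuming a ChatML-style template (the template shown is just one example; the correct template varies per model, which is exactly why getting it wrong is so common):

```python
# Sketch: what a /completion client must do today, assuming the model
# expects a ChatML-style template (templates differ between models).
def format_chatml(messages):
    """Render a list of {role, content} messages into a ChatML prompt."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model continues from here.
    prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = format_chatml(messages)
# This string is sent as the "prompt" field of a /completion request.
# Using the wrong template silently degrades chat quality.
```

With server-side template support, the server would produce this string itself via llama_chat_apply_template, using the template stored in the model's metadata.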

Related: #5447

Refs on antiprompts being old and obsolete: #6378 (review), #6391 (comment)

Possible Implementation

/completion works well as a text completion endpoint, so to avoid breaking existing clients, we could consider adding a new endpoint (e.g. /chat) with these changes. The WebUI chat page should then use the new endpoint instead. The "Prompt template" and "Chat history template" options would thus become obsolete, and could be removed or moved under "More options".
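The proposal above can be sketched as a pair of request bodies. The field names below are assumptions for illustration, not an agreed-upon API; the point is only that /chat would accept structured messages (so the server can apply llama_chat_apply_template) while keeping the native-only options that /completion already offers:

```python
import json

# Today: /completion requires a pre-formatted prompt string,
# which the client must template correctly itself.
completion_request = {
    "prompt": "<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n",
    "mirostat": 2,   # native-only sampling option stays available
}

# Proposed (hypothetical): /chat takes structured messages and the
# server applies the model's chat template before generation.
chat_request = {
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
    "mirostat": 2,   # same native-only options as /completion
}

print(json.dumps(chat_request, indent=2))
```

The WebUI chat page would build the second shape directly from its message history, with no client-side template options needed.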

@EZForever EZForever added the `enhancement` (New feature or request) label on Apr 12, 2024
@github-actions github-actions bot added the `stale` label on May 13, 2024
@github-actions (bot)
This issue was closed because it has been inactive for 14 days since being marked as stale.

okigan commented Jul 31, 2024

@EZForever is this still an issue, or was it resolved?

@EZForever (Contributor, Author)
Yes, this issue is still relevant.
