Add completion server parameters to v1/chat/completions #4429

Closed
4 tasks done
LiamNiisan opened this issue Dec 12, 2023 · 5 comments
Labels
enhancement New feature or request stale

Comments

@LiamNiisan

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

The same set of parameters should be available whether you call the completion endpoint or the v1/chat/completions endpoint. Most notably, min_p and grammar are useful to have.

A call like this should be possible for example:

curl http://localhost:3077/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key" \
-d '{
"temperature": 1.0,
"min_p": 0.01,
"top_k": 0,
"top_p": 1,
"repeat_penalty": 1,
"grammar":  "root ::= (\"Hello!\" | \"Hi!\")",
"messages": [
{
    "role": "system",
    "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."
},
{
    "role": "user",
    "content": "Hi"
}
]
}'

Motivation

To make full use of the llama.cpp backend, for example when replacing another LLM call that uses the OpenAI SDK, it's useful to have access to the full set of parameters to tune the output for the task. It's possible to pass those parameters as a dictionary via the extra_body input parameter when making a call with the Python openai library.

If the parameters aren't available when making the switch, the dev will have to consider changing the code to use the completion endpoint instead, or even have separate versions of the same code to be able to compare different LLMs.
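The extra_body approach mentioned above can be sketched roughly as follows. This is a minimal sketch, not a confirmed recipe: the port and API key are copied from the curl example, the model name "local-model" is a placeholder, and the helper names build_chat_kwargs and send_chat_request are made up for illustration. Whether the server honors the extra parameters depends on the chat endpoint supporting them, which is what this issue asks for.

```python
# Server address and key mirror the curl example in this issue (assumptions).
BASE_URL = "http://localhost:3077/v1"
API_KEY = "no-key"

MESSAGES = [{"role": "user", "content": "Hi"}]

# llama.cpp-specific sampling parameters; the OpenAI SDK has no named
# arguments for these, so they travel in extra_body.
LLAMA_CPP_PARAMS = {
    "min_p": 0.01,
    "top_k": 0,
    "repeat_penalty": 1,
    "grammar": 'root ::= ("Hello!" | "Hi!")',
}


def build_chat_kwargs(messages, extra_params, model="local-model"):
    """Build keyword arguments for client.chat.completions.create().

    Standard OpenAI parameters are passed by name; everything else is
    placed in extra_body, which the SDK merges into the JSON request body.
    """
    return {
        "model": model,  # placeholder name, not taken from the issue
        "messages": messages,
        "temperature": 1.0,
        "top_p": 1,
        "extra_body": dict(extra_params),
    }


def send_chat_request(kwargs):
    """Send the request; needs the openai package (v1+) and a running server."""
    from openai import OpenAI  # imported lazily so the helper above works without it

    client = OpenAI(base_url=BASE_URL, api_key=API_KEY)
    response = client.chat.completions.create(**kwargs)
    return response.choices[0].message.content
```

With a server running, send_chat_request(build_chat_kwargs(MESSAGES, LLAMA_CPP_PARAMS)) would issue roughly the same request as the curl call, so the same client code could be pointed at either backend.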

Possible Implementation

I'm guessing the oaicompat_completion_params_parse function in examples/server/server.cpp can be extended to accept more parameters.
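To illustrate the idea only, here is a small Python sketch of the kind of pass-through such a parsing step might perform. It is not the actual C++ logic in server.cpp; the function name extract_sampling_params and the key list are hypothetical, with the keys taken from the example request in this issue.

```python
# Hypothetical list of llama.cpp-specific keys to forward, taken from the
# example request in this issue. The real server may use a different set.
PASSTHROUGH_KEYS = ("min_p", "top_k", "repeat_penalty", "grammar")


def extract_sampling_params(body):
    """Copy recognized sampling keys from a chat request body.

    Keys that are absent are simply skipped so the backend can apply its
    own defaults; unrecognized keys are ignored.
    """
    return {key: body[key] for key in PASSTHROUGH_KEYS if key in body}
```

For a body containing "min_p", "messages" and an unknown key, only "min_p" would be forwarded.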

@LiamNiisan LiamNiisan added the enhancement New feature or request label Dec 12, 2023
@Aspie96

Aspie96 commented Dec 24, 2023

I second this.

I think grammar can be especially useful to force the model's answer to begin in a certain way. This can guide the model's answer toward what the user desires.

Alternatively, if there is some reason for not providing grammar for v1/chat/completions, I think a new optional parameter start containing the beginning of the completion would be helpful (although that'd be a new issue).
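As a rough sketch of using grammar to force the start of an answer, a helper could build a GBNF rule that begins with a fixed string. The helper name prefix_grammar is made up, and the rule shown is an assumption: it only allows the rest of the answer to stay on a single line, so a real grammar would likely need a more permissive tail.

```python
def prefix_grammar(prefix):
    """Return a GBNF grammar whose output must start with `prefix`.

    Backslashes and double quotes are escaped so the prefix is a valid
    GBNF string literal. The tail [^\\n]* permits any characters except a
    newline, which keeps this sketch to single-line answers.
    """
    escaped = prefix.replace("\\", "\\\\").replace('"', '\\"')
    return f'root ::= "{escaped}" [^\\n]*'
```

The resulting string would be sent as the grammar parameter, in the same way as the "Hello!" / "Hi!" grammar in the request at the top of this issue.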

@peturparkur

grammar should already be supported since #4198 (approx 1 month ago)

@peturparkur

preliminary PR to add parameters: #4694

Contributor

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Mar 18, 2024
Contributor

github-actions bot commented Apr 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 3, 2024

3 participants