I reviewed the Discussions, and have a new bug or useful enhancement to share.
Feature Description
The same set of parameters should be available when calling either the `completion` or the `v1/chat/completions` endpoint. Most notably, `min_p` and `grammar` would be useful to have.
For example, a call like this should be possible:

```shell
curl http://localhost:3077/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
    "temperature": 1.0,
    "min_p": 0.01,
    "top_k": 0,
    "top_p": 1,
    "repeat_penalty": 1,
    "grammar": "root ::= (\"Hello!\" | \"Hi!\")",
    "messages": [
      { "role": "system", "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests." },
      { "role": "user", "content": "Hi" }
    ]
  }'
```
Motivation
To make full use of the llama.cpp backend when replacing another LLM call (one that uses the OpenAI SDK, for example), it's useful to have access to the full set of parameters to tune the output for the task. Those parameters can already be passed as a dictionary via the `extra_body` input parameter when making a call with the Python `openai` library.
If the parameters aren't available when making the switch, the developer will have to consider changing the code to use the `completion` endpoint instead, or even maintain separate versions of the same code just to compare different LLMs.
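For context, the `extra_body` escape hatch works because the OpenAI Python SDK merges its keys into the top level of the request JSON, so llama.cpp-only parameters reach the server unchanged. A minimal sketch of that merge (the model name is a placeholder, and the real SDK call is shown only as a comment):

```python
import json

# Standard chat-completion parameters as the SDK would send them.
base_params = {
    "model": "local-model",  # placeholder; llama.cpp's server serves one model
    "temperature": 1.0,
    "messages": [{"role": "user", "content": "Hi"}],
}

# llama.cpp-specific sampling parameters with no first-class SDK argument.
extra_body = {
    "min_p": 0.01,
    "grammar": 'root ::= ("Hello!" | "Hi!")',
}

# extra_body keys end up merged into the request payload, roughly like this:
payload = {**base_params, **extra_body}
print(json.dumps(payload, indent=2))

# With the real client, the equivalent call would be:
# client.chat.completions.create(model="local-model",
#                                messages=base_params["messages"],
#                                temperature=1.0,
#                                extra_body=extra_body)
```

This is exactly why the feature request matters: the client can already send these fields today, but the server's OpenAI-compatible endpoint has to accept them for the round trip to work.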
Possible Implementation
I'm guessing the `oaicompat_completion_params_parse` function in `examples/server/server.cpp` could be extended to accept the additional parameters.
I think `grammar` can be especially useful to force the model's answer to begin in a certain way. This can guide the model's answer toward what the user desires.
Alternatively, if there is some reason not to provide `grammar` for `v1/chat/completions`, I think a new optional `start` parameter containing the required beginning of the completion would be helpful (although that would be a separate issue).
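As a small illustration of constraining answers this way, here is a hypothetical helper (the function name is mine, not part of llama.cpp) that builds the same kind of GBNF rule used in the curl example above, restricting the answer to a fixed set of strings:

```python
def choice_grammar(options):
    """Build a GBNF root rule that restricts the model's answer
    to exactly one of the given strings."""
    # Escape backslashes and quotes so each option is a valid GBNF literal.
    alts = " | ".join(
        '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"' for o in options
    )
    return f"root ::= ({alts})"

print(choice_grammar(["Hello!", "Hi!"]))  # root ::= ("Hello!" | "Hi!")
```

The resulting string would be passed as the `grammar` field of the request body, the same as in the curl example.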