
Bug: embedding endpoint and openai compatibility with input as list #8887

Closed
gelim opened this issue Aug 6, 2024 · 3 comments
Labels
bug-unconfirmed · low severity · stale

Comments

@gelim
Contributor

gelim commented Aug 6, 2024

Hello,

When sending the following request (which works against OpenAI endpoints), llama.cpp spits out an error.

Run the server:

$ llama-server -m bge-large-en-v1.5-334M-Q8_0.gguf  --host 127.0.0.1 --port 8080 --api-key xxx --embedding --embd-output-format json

Query:

POST /v1/embeddings HTTP/1.1
Host: 127.0.0.1:8080
Accept-Encoding: gzip, deflate
Connection: close
Accept: application/json
Content-Type: application/json
Authorization: Bearer xxxxx
X-Stainless-Async: false
Content-Length: 85

{"input": [[15339, 1917]], "model": "bge-large-en-v1.5", "encoding_format": "base64"}

Answer:

HTTP/1.1 400 Bad Request
Access-Control-Allow-Origin: 
Connection: close
Content-Length: 117
Content-Type: application/json; charset=utf-8
Server: llama.cpp

{"error":{"code":400,"message":"\"prompt\" must be a string or an array of integers","type":"invalid_request_error"}}

Issue: llama.cpp strictly accepts a list of integers and rejects a list of lists, which is what OpenAI-compatible clients send.
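
For reference, the same request can be reproduced with curl against the server started above (the nested array is a single pre-tokenized prompt, which is what OpenAI clients send when they tokenize on the client side):

$ curl http://127.0.0.1:8080/v1/embeddings \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer xxx" \
    -d '{"input": [[15339, 1917]], "model": "bge-large-en-v1.5", "encoding_format": "base64"}'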

Name and Version

$ llama-server --version
version: 3486 (0832de7)
built with cc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

@gelim added the bug-unconfirmed and low severity labels on Aug 6, 2024
@gelim
Contributor Author

gelim commented Aug 7, 2024

A naive attempt to fix the issue:

diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index 7813a295..e9889594 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -969,6 +969,8 @@ struct server_context {
                 (prompt->is_array() &&  prompt->size() == 1 && prompt->at(0).is_string()) ||
                 (prompt->is_array() && !prompt->empty()     && prompt->at(0).is_number_integer())) {
                 slot.prompt = *prompt;
+            } else if (prompt->is_array() && prompt->size() == 1 && prompt->at(0).is_array()) {
+                slot.prompt = prompt->at(0);
             } else {
                 send_error(task, "\"prompt\" must be a string or an array of integers", ERROR_TYPE_INVALID_REQUEST);
                 return false;
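
Note this only unwraps the single-array case; a batched request such as {"input": [[1, 2], [3, 4]]} would still be rejected. A minimal standalone sketch of the more general check, assuming nlohmann::json (which server.cpp already uses); the is_token_array helper is hypothetical, not part of the codebase:

#include <nlohmann/json.hpp>
#include <iostream>

using json = nlohmann::json;

// Hypothetical helper: true for a non-empty JSON array of integers,
// i.e. one pre-tokenized prompt such as [15339, 1917].
static bool is_token_array(const json & j) {
    return j.is_array() && !j.empty() && j.at(0).is_number_integer();
}

int main() {
    // The request body from the report above.
    const json body   = json::parse(R"({"input": [[15339, 1917]]})");
    const json prompt = body.at("input");

    if (prompt.is_string() || is_token_array(prompt)) {
        // A plain string or a single token array: one prompt.
        std::cout << "single prompt: " << prompt << "\n";
    } else if (prompt.is_array() && !prompt.empty() && prompt.at(0).is_array()) {
        // OpenAI-style batch: an array of token arrays, one prompt each.
        for (const json & p : prompt) {
            std::cout << "batched prompt: " << p << "\n";
        }
    } else {
        std::cout << "error: \"prompt\" must be a string or an array of integers\n";
    }
    return 0;
}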

@ggerganov
Owner

PR welcome

This issue was closed because it has been inactive for 14 days since being marked as stale.
