
server: Completion of pre-tokenized prompt is broken #4476

Closed
shibe2 opened this issue Dec 14, 2023 · 5 comments
Labels: bug (Something isn't working), stale

Comments

shibe2 (Contributor) commented Dec 14, 2023

Expected Behavior

According to the documentation:

`prompt`: Provide the prompt for this completion as a string or as an array of strings or numbers representing tokens. Internally, the prompt is compared to the previous completion and only the "unseen" suffix is evaluated. If the prompt is a string or an array with the first element given as a string, a `bos` token is inserted in the front like `main` does.

In particular: "or as an array of strings or numbers representing tokens".
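
For concreteness, a minimal sketch of what the documented behavior permits, assuming a llama.cpp server at the default 127.0.0.1:8080 and illustrative token IDs:

# Prompt as a string: per the documentation, a bos token is inserted automatically, as `main` does.
curl -s http://127.0.0.1:8080/completion -d '{"prompt": "Hello, world!"}'

# Prompt as an array of token IDs: per the quoted documentation, bos is only auto-inserted for strings, so prepend it yourself (token IDs vary by model).
curl -s http://127.0.0.1:8080/completion -d '{"prompt": [1, 15043, 29892, 3186, 29991]}'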

Current Behavior

When the prompt is supplied as an array of token identifiers, the server instead calls split_multiprompt_task and the request hangs.

Steps to Reproduce

  1. Call /tokenize with a text prompt in content.
  2. Add BOS if needed.
  3. Call /completion with the resulting array in prompt (sketched after this list).
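
A minimal repro sketch, assuming the server's default address 127.0.0.1:8080; the prompt text and token IDs are illustrative:

# Step 1: tokenize the text prompt.
curl -s http://127.0.0.1:8080/tokenize -d '{"content": "Hello, world!"}'
# -> {"tokens": [15043, 29892, 3186, 29991]}  (token IDs vary by model)

# Steps 2 and 3: prepend bos (token 1 for LLaMA-family models) and request a completion with the token array. This is the request that hangs.
curl -s http://127.0.0.1:8080/completion -d '{"prompt": [1, 15043, 29892, 3186, 29991], "n_predict": 16}'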

Failure Logs

all slots are idle and system prompt is empty, clear the KV cache

slot 0 is processing [task id: 2]
slot unavailable

print_timings: prompt eval time = 0.00 ms / 0 tokens ( -nan ms per token, -nan tokens per second)
print_timings: eval time = -94366367288.92 ms / 0 runs ( -inf ms per token, -0.00 tokens per second)
print_timings: total time = -94366367288.92 ms
slot unavailable

jxy (Contributor) commented Dec 21, 2023

It took me some time to realize what was wrong with my server.

I originally added the prompt-array support for testing prompts with specifically selected tokens, which has been quite useful, since such prompts are not subject to the constraints of any tokenizer.

In order to support both usages, how about allowing 2-level nested prompts? For example:

{
    "prompt": [
        "First prompt",
        [1, "second prompt provided in an array.", 2, 1, "What do you think?"],
        "Third prompt"
    ]
}

This would be compatible with the multi-prompt change already introduced, and allow for array prompts.
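
For illustration, a client request exercising this proposed nested format might look as follows; the nested form is hypothetical (a suggestion, not implemented behavior), and the server address is assumed:

# Hypothetical request under the 2-level nested-prompt proposal.
curl -s http://127.0.0.1:8080/completion \
    -d '{"prompt": ["First prompt", [1, "second prompt provided in an array.", 2, 1, "What do you think?"], "Third prompt"]}'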

jxy (Contributor) commented Dec 22, 2023

I'm baffled. I can't get the current multi-prompt to work (#4583). I'll wait for that to be fixed before introducing new behaviors. For now, I'm using #4232 (comment) to get back the previous behavior.

shibe2 (Contributor, Author) commented Dec 22, 2023

I would like it better if the multi-prompt field were called "prompts". It could then have sub-arrays as in your example, @jxy. The format of the "prompt" field could then be made to match the current documentation, i.e. a single prompt.
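
Under that proposal, the two request shapes might look as follows; the "prompts" field is hypothetical (it exists only in this suggestion), and the address and token IDs are illustrative:

# Single prompt, string or pre-tokenized, matching the current documentation.
curl -s http://127.0.0.1:8080/completion -d '{"prompt": [1, 15043, 29892, 3186, 29991]}'

# Multiple prompts via the hypothetical "prompts" field; each element may be a string or a nested string/token array.
curl -s http://127.0.0.1:8080/completion -d '{"prompts": ["First prompt", [1, "second prompt", 2], "Third prompt"]}'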

github-actions bot commented Mar 18, 2024

This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale label Mar 18, 2024

github-actions bot commented Apr 2, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions bot closed this as completed Apr 2, 2024