Server completion streaming returns special tokens as empty strings in chunks #7106

Closed
@Inego

Description

Version: b2794.
Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf (updated)
Prompt: "<|start_header_id|>user<|end_header_id|>How much is 12 plus 19?<|eot_id|>"

When I run the server and send a completion request with streaming enabled, the verbose logs show that the server generates "<|start_header_id|>", "assistant", and "<|end_header_id|>", followed by "\n\n12 + 19 = 31".

However, the streaming chunks the server sends for <|start_header_id|> and <|end_header_id|> carry an empty string as their content in data.

I couldn't find a config parameter, either for the server or in the request, that changes this behavior.
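
A minimal sketch of how the chunks can be inspected, in case it helps with reproducing this. It assumes the server is listening on `http://localhost:8080` and that each streamed line has the form `data: {json}` with a `content` field. The helper names (`parse_sse_line`, `collect_contents`, `stream_completion`) are my own and not part of the server.

```python
import json
import urllib.request

# Assumption: a llama.cpp server running locally on the default port.
SERVER_URL = "http://localhost:8080/completion"


def parse_sse_line(raw):
    """Return the decoded JSON payload of one 'data: ...' line, or None."""
    line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank separator lines and anything else are skipped
    return json.loads(line[len("data:"):].strip())


def collect_contents(lines):
    """Collect the 'content' field of every streamed chunk, in order."""
    contents = []
    for raw in lines:
        payload = parse_sse_line(raw)
        if payload is not None:
            contents.append(payload.get("content", ""))
    return contents


def stream_completion(prompt, url=SERVER_URL, n_predict=32):
    """POST a streaming completion request and return the chunk contents."""
    body = json.dumps({"prompt": prompt, "stream": True, "n_predict": n_predict})
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response object can be iterated line by line.
        return collect_contents(resp)
```

Calling `stream_completion(...)` with the prompt above against a running server and printing each entry with `repr()` makes the empty-string chunks visible; the chunks that correspond to the special tokens should show up as `''`.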
