server: receiving <|im_end|> in all responses of llama 3 #6873
I have been experiencing this problem with several different Llama 3 models, for example:
https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF
https://huggingface.co/QuantFactory/dolphin-2.9-llama3-8b-GGUF
Every response from the "/chat/completions" endpoint ends with '<|im_end|>'.
I'm using the latest CUDA Docker image: 'ghcr.io/ggerganov/llama.cpp:server-cuda'.
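
For reference, here is a minimal sketch of the kind of request I'm making, together with a possible client-side workaround: passing the marker as an explicit `stop` string so the server trims it from the output. This assumes the OpenAI-compatible route at `/v1/chat/completions` on the default port 8080; the `stop` list and the extra `<|eot_id|>` entry (Llama 3's end-of-turn token) are my additions, not something confirmed to fix the underlying issue.

```python
import requests

# Minimal sketch, assuming the llama.cpp server is reachable locally on
# port 8080 (the default in the Docker examples) and exposes the
# OpenAI-compatible route /v1/chat/completions.
url = "http://localhost:8080/v1/chat/completions"

payload = {
    "messages": [
        {"role": "user", "content": "Say hello."}
    ],
    # Speculative workaround: declare the markers as stop strings so the
    # server cuts generation before they are echoed back to the client.
    "stop": ["<|im_end|>", "<|eot_id|>"],
}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()

content = response.json()["choices"][0]["message"]["content"]
# Without the "stop" list above, content ends with '<|im_end|>'.
print(content)
```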
Thanks in advance.