
Misc. bug: Tokens in top_probs / top_logprobs are missing whitespace #11728

Closed

Description

@CyberShadow

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090 Ti, compute capability 8.6, VMM: yes
version: 0 (unknown)
built with gcc (GCC) 13.3.0 for x86_64-unknown-linux-gnu

(actually version 4552, built with Nix)

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

$ bin/llama-server -m ../../../wizardcoder-python-34b-v1.0.Q5_K_M.gguf -ngl 9999
...

$ curl -fsS \
    --url http://127.0.0.1:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Hello","n_predict": 1, "n_probs": 10, "temperature":0}' | jq .
{
  ...
  "completion_probabilities": [
    {
      "id": 2897,
      "token": " os",                      <---------- whitespace OK
      "bytes": [
        32,                                <---------- whitespace OK
        111,
        115
      ],
      "logprob": -2.0750603675842285,
      "top_logprobs": [
        {
          "id": 2897,
          "token": "os",                   <---------- whitespace missing
          "bytes": [
            111,                           <---------- whitespace missing
            115
          ],
          "logprob": -2.0750603675842285
        },
        ...

Problem description & steps to reproduce

As shown above: the top-level token and bytes fields include the leading whitespace, but the corresponding entries under top_logprobs do not. This does not seem to depend on the model; a quick jq check is sketched below.
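
The following is a minimal sketch (assuming the same llama-server instance on 127.0.0.1:8080 as in the command line above) that prints the selected token next to its corresponding top_logprobs entry, so the mismatch is visible without scanning the full response:

$ curl -fsS \
    --url http://127.0.0.1:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Hello","n_predict": 1, "n_probs": 10, "temperature":0}' \
  | jq -c '.completion_probabilities[0]
           | {token, bytes, top_token: .top_logprobs[0].token, top_bytes: .top_logprobs[0].bytes}'

With the response shown above this prints something like:

{"token":" os","bytes":[32,111,115],"top_token":"os","top_bytes":[111,115]}

Both entries refer to the same token (id 2897), so token and top_token should be identical; the leading space (byte 32) is dropped only in the top_logprobs entry.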

First Bad Commit

57bb2c4

Relevant log output
