
n_probs does not return completion_probabilities #4088

Closed
maxjust opened this issue Nov 15, 2023 · 6 comments · Fixed by #4714

Comments

@maxjust

maxjust commented Nov 15, 2023

Hi, I'm trying to get the token probabilities with the latest code from the main branch, compiled with CMake under Linux. Compilation produced some warnings (not important), but after running the server binary and sending an inference request, I got an empty completion_probabilities field.

Request:

curl --request POST \
    --url http://192.168.41.197:8081/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Some stories about railway station","n_predict": 256, "n_probs" : 3}'

Response:

{"completion_probabilities":[],"content":" Rail statins ........ some content here","generation_settings":{"frequency_penalty":0.0,"grammar":"","ignore_eos":false,"logit_bias":[],"min_p":0.05000000074505806,"mirostat":0,"mirostat_eta":0.10000000149011612,"mirostat_tau":5.0,"model":"/home/max/DISK2/llama13b_200K/ggml-model-f16.gguf","n_ctx":512,"n_keep":0,"n_predict":256,"n_probs":3,"penalize_nl":true,"presence_penalty":0.0,"repeat_last_n":64,"repeat_penalty":1.100000023841858,"seed":4294967295,"stop":[],"stream":false,"temp":0.800000011920929,"tfs_z":1.0,"top_k":40,"top_p":0.949999988079071,"typical_p":1.0},"model":"/home/max/DISK2/llama13b_200K/ggml-model-f16.gguf","prompt":"Some stories about railway station","slot_id":0,"stop":true,"stopped_eos":true,"stopped_limit":false,"stopped_word":false,"stopping_word":"","timings":{"predicted_ms":3258.927,"predicted_n":55,"predicted_per_second":16.87672046658302,"predicted_per_token_ms":59.253218181818184,"prompt_ms":429.54,"prompt_n":20,"prompt_per_second":46.561437817199796,"prompt_per_token_ms":21.477},"tokens_cached":75,"tokens_evaluated":20,"tokens_predicted":55,"truncated":false}

Where is my mistake? Or is it a bug?

@m-a-sch

m-a-sch commented Nov 23, 2023

I noticed the same when stream is set to false.
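
For comparison, the streaming variant of the same request would look like this (a sketch: host, port, and prompt are placeholders taken from the original report, and whether per-token probs appear in the streamed chunks depends on the build):

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Some stories about railway station", "n_predict": 256, "n_probs": 3, "stream": true}'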

@ibehnam
Contributor

ibehnam commented Dec 30, 2023

Can we please ask the devs to take a look at this? @ggerganov

@ggerganov
Owner

Can you test #4714?

@ibehnam
Contributor

ibehnam commented Jan 4, 2024

Can you test #4714?

I tested it right now (after updating llama.cpp to the latest commit) and it still doesn't show the log probabilities. I simply get:

{
  "completion_probabilities": [],
  "content": " to find fulfillment and happiness.\nWhat do you think the meaning of life",
  "generation_settings": {
    "frequency_penalty": 0.0,
    "grammar": "",
    "ignore_eos": false,
    "logit_bias": [],
    "min_p": 0.05000000074505806,
    "mirostat": 0,
    "mirostat_eta": 0.10000000149011612,
    "mirostat_tau": 5.0,
    "model": "/Users/behnam/Downloads/LLM/models/llama-2-7b-chat.Q8_0.gguf",
    "n_ctx": 2048,
    "n_keep": 0,
    "n_predict": 16,
    "n_probs": 3,
    "penalize_nl": true,
    "penalty_prompt_tokens": [],
    "presence_penalty": 0.0,
    "repeat_last_n": 64,
    "repeat_penalty": 1.100000023841858,
    "seed": 4294967295,
    "stop": [],
    "stream": false,
    "temperature": 0.800000011920929,
    "tfs_z": 1.0,
    "top_k": 40,
    "top_p": 0.949999988079071,
    "typical_p": 1.0,
    "use_penalty_prompt_tokens": false
  },
  "model": "/Users/behnam/Downloads/LLM/models/llama-2-7b-chat.Q8_0.gguf",
  "prompt": "I believe the meaning of life is",
  "slot_id": 0,
  "stop": true,
  "stopped_eos": false,
  "stopped_limit": true,
  "stopped_word": false,
  "stopping_word": "",
  "timings": {
    "predicted_ms": 762.49,
    "predicted_n": 16,
    "predicted_per_second": 20.983881755826307,
    "predicted_per_token_ms": 47.655625,
    "prompt_ms": 422.051,
    "prompt_n": 8,
    "prompt_per_second": 18.955055194751345,
    "prompt_per_token_ms": 52.756375
  },
  "tokens_cached": 24,
  "tokens_evaluated": 8,
  "tokens_predicted": 16,
  "truncated": false
}

@ggerganov
Owner

You need to check out #4714 - it's not merged yet
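
One way to test an unmerged PR (a sketch, assuming origin points at the upstream llama.cpp repo; the local branch name pr-4714 is arbitrary) is to fetch the head ref that GitHub exposes for every pull request:

# fetch the PR head into a local branch, switch to it, and rebuild
git fetch origin pull/4714/head:pr-4714
git checkout pr-4714
make server    # or rebuild with CMake, as in the original report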

@ibehnam
Contributor

ibehnam commented Jan 4, 2024

@ggerganov I just tested llama.cpp with that commit and can confirm that log probabilities show up now:

  "completion_probabilities": [
    {
      "content": " to",
      "probs": [
        {
          "prob": 0.9832262396812439,
          "tok_str": " to"
        },
        {
          "prob": 0.01389230228960514,
          "tok_str": " different"
        },
        {
          "prob": 0.00288143171928823,
          "tok_str": " something"
        }
      ]
    },
    {
      "content": " find",
      "probs": [
        {
          "prob": 0.9647254347801208,
          "tok_str": " find"
        },
        {
          "prob": 0.02691579796373844,
          "tok_str": " be"
        },
        {
          "prob": 0.008358730003237724,
          "tok_str": " seek"
        }
      ]
    },
    {
      "content": " purpose",
      "probs": [
        {
          "prob": 0.45737704634666443,
          "tok_str": " purpose"
        },
        {
          "prob": 0.2791227698326111,
          "tok_str": " happiness"
        },
        {
          "prob": 0.1786845624446869,
          "tok_str": " ful"
        }
      ]
    },
    {
      "content": ",",
      "probs": [
        {
          "prob": 0.5004114508628845,
          "tok_str": " and"
        },
        {
          "prob": 0.4949476420879364,
          "tok_str": ","
        },
        {
          "prob": 0.0046408711932599545,
          "tok_str": "."
        }
      ]
    },
    {
      "content": " happiness",
      "probs": [
        {
          "prob": 0.8342463374137878,
          "tok_str": " happiness"
        },
        {
          "prob": 0.14888985455036163,
          "tok_str": " ful"
        },
        {
          "prob": 0.01686379872262478,
          "tok_str": " to"
        }
      ]
    },
    {
      "content": " and",
      "probs": [
        {
          "prob": 0.6213716864585876,
          "tok_str": " and"
        },
        {
          "prob": 0.3780105412006378,
          "tok_str": ","
        },
        {
          "prob": 0.0006177971372380853,
          "tok_str": " &"
        }
      ]
    },
    {
      "content": " ful",
      "probs": [
        {
          "prob": 0.9998679161071777,
          "tok_str": " ful"
        },
        {
          "prob": 0.00007403634663205594,
          "tok_str": " content"
        },
        {
          "prob": 0.00005809162757941522,
          "tok_str": " to"
        }
      ]
    },
    {
      "content": "fill",
      "probs": [
        {
          "prob": 0.9876359105110168,
          "tok_str": "fill"
        },
        {
          "prob": 0.012267238460481167,
          "tok_str": "fil"
        },
        {
          "prob": 0.00009692557796370238,
          "tok_str": "filled"
        }
      ]
    },
    {
      "content": "ment",
      "probs": [
        {
          "prob": 0.9999960660934448,
          "tok_str": "ment"
        },
        {
          "prob": 0.0000025447493499086704,
          "tok_str": "ement"
        },
        {
          "prob": 0.0000013955876738691586,
          "tok_str": "l"
        }
      ]
    },
    {
      "content": ".",
      "probs": [
        {
          "prob": 0.9495798945426941,
          "tok_str": "."
        },
        {
          "prob": 0.03004053235054016,
          "tok_str": " in"
        },
        {
          "prob": 0.02037946879863739,
          "tok_str": " through"
        }
      ]
    },
    {
      "content": " Here",
      "probs": [
        {
          "prob": 0.6204178333282471,
          "tok_str": " Here"
        },
        {
          "prob": 0.2490677386522293,
          "tok_str": " It"
        },
        {
          "prob": 0.06325455754995346,
          "tok_str": "\n"
        }
      ]
    },
    {
      "content": " are",
      "probs": [
        {
          "prob": 0.9843857884407043,
          "tok_str": " are"
        },
        {
          "prob": 0.00995764322578907,
          "tok_str": " is"
        },
        {
          "prob": 0.00565652409568429,
          "tok_str": "'"
        }
      ]
    },
    {
      "content": " some",
      "probs": [
        {
          "prob": 0.9433659315109253,
          "tok_str": " some"
        },
        {
          "prob": 0.02906605787575245,
          "tok_str": " a"
        },
        {
          "prob": 0.027568047866225243,
          "tok_str": " "
        }
      ]
    },
    {
      "content": " of",
      "probs": [
        {
          "prob": 0.46796277165412903,
          "tok_str": " of"
        },
        {
          "prob": 0.42744186520576477,
          "tok_str": " reasons"
        },
        {
          "prob": 0.043556634336709976,
          "tok_str": " ways"
        }
      ]
    },
    {
      "content": " the",
      "probs": [
        {
          "prob": 0.9840854406356812,
          "tok_str": " the"
        },
        {
          "prob": 0.015881765633821487,
          "tok_str": " my"
        },
        {
          "prob": 0.00003282620309619233,
          "tok_str": " reasons"
        }
      ]
    },
    {
      "content": " reasons",
      "probs": [
        {
          "prob": 0.6720941662788391,
          "tok_str": " reasons"
        },
        {
          "prob": 0.15267716348171234,
          "tok_str": " ways"
        },
        {
          "prob": 0.12647970020771027,
          "tok_str": " things"
        }
      ]
    },
    {
      "content": " why",
      "probs": [
        {
          "prob": 0.9955596923828125,
          "tok_str": " why"
        },
        {
          "prob": 0.0036945336032658815,
          "tok_str": " I"
        },
        {
          "prob": 0.0007458277978003025,
          "tok_str": ":"
        }
      ]
    }
  ],
  "content": " to find purpose, happiness and fulfillment. Here are some of the reasons why",
  "generation_settings": {
    "frequency_penalty": 0.0,
    "grammar": "",
    "ignore_eos": false,
    "logit_bias": [],
    "min_p": 0.05000000074505806,
    "mirostat": 0,
    "mirostat_eta": 0.10000000149011612,
    "mirostat_tau": 5.0,
    "model": "/Users/behnam/Downloads/LLM/models/llama-2-7b-chat.Q8_0.gguf",
    "n_ctx": 2048,
    "n_keep": 0,
    "n_predict": 16,
    "n_probs": 3,
    "penalize_nl": true,
    "penalty_prompt_tokens": [],
    "presence_penalty": 0.0,
    "repeat_last_n": 64,
    "repeat_penalty": 1.100000023841858,
    "seed": 4294967295,
    "stop": [],
    "stream": false,
    "temperature": 0.800000011920929,
    "tfs_z": 1.0,
    "top_k": 40,
    "top_p": 0.949999988079071,
    "typical_p": 1.0,
    "use_penalty_prompt_tokens": false
  },
  "model": "/Users/behnam/Downloads/LLM/models/llama-2-7b-chat.Q8_0.gguf",
  "prompt": "I believe the meaning of life is",
  "slot_id": 0,
  "stop": true,
  "stopped_eos": false,
  "stopped_limit": true,
  "stopped_word": false,
  "stopping_word": "",
  "timings": {
    "predicted_ms": 764.836,
    "predicted_n": 16,
    "predicted_per_second": 20.919517386733887,
    "predicted_per_token_ms": 47.80225,
    "prompt_ms": 436.262,
    "prompt_n": 8,
    "prompt_per_second": 18.33760446704045,
    "prompt_per_token_ms": 54.53275
  },
  "tokens_cached": 24,
  "tokens_evaluated": 8,
  "tokens_predicted": 16,
  "truncated": false
}
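
For reference, each element of completion_probabilities pairs the sampled token (content) with the top n_probs candidates and their probabilities. One quick way to inspect just those fields is to pipe the response through jq (a sketch, assuming jq is installed; host, port, and prompt are placeholders):

curl -s --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "I believe the meaning of life is", "n_predict": 16, "n_probs": 3}' \
    | jq '.completion_probabilities[] | {token: .content, candidates: .probs}'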
