First of all: thank you very much for the continuing work on llama.cpp - I'm using it every day with various models.
For proper context management, however, I often need to know how many tokens prompts and responses contain. There is an "embedding" example, but none for "tokenization".
That's why I made my own (see my fork of llama.cpp).
It seems to work, but since I am not a C++ programmer and not really an AI expert either, I hesitate to create a pull request.
Perhaps somebody else could have a look at it or create a better example for the public...
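For reference, the example boils down to something like the sketch below. It assumes the llama.h API as I found it (`llama_init_from_file`, `llama_tokenize`, `llama_token_to_str`); names and signatures may differ in newer revisions, so please treat it as an illustration rather than the exact code in my fork:

```cpp
// tokenize.cpp - print the tokens of a prompt and their total count
// (sketch only; assumes the llama_init_from_file / llama_tokenize /
//  llama_token_to_str API - these names may have changed since)
#include "llama.h"

#include <cstdio>
#include <string>
#include <vector>

int main(int argc, char ** argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <model-file> <prompt>\n", argv[0]);
        return 1;
    }

    const std::string prompt = argv[2];

    llama_context_params params = llama_context_default_params();
    llama_context * ctx = llama_init_from_file(argv[1], params);
    if (ctx == nullptr) {
        fprintf(stderr, "error: failed to load model '%s'\n", argv[1]);
        return 1;
    }

    // worst case is one token per byte of the prompt, plus one for BOS
    std::vector<llama_token> tokens(prompt.size() + 1);
    const int n_tokens = llama_tokenize(ctx, prompt.c_str(), tokens.data(),
                                        (int) tokens.size(), /*add_bos=*/ true);
    if (n_tokens < 0) {
        fprintf(stderr, "error: tokenization failed\n");
        llama_free(ctx);
        return 1;
    }
    tokens.resize(n_tokens);

    // print every token id together with its text representation
    for (const llama_token token : tokens) {
        printf("%6d -> '%s'\n", token, llama_token_to_str(ctx, token));
    }
    printf("total: %d tokens\n", n_tokens);

    llama_free(ctx);
    return 0;
}
```

The key call is `llama_tokenize`, which writes the tokens into the provided buffer and returns how many it produced (or a negative value if the buffer was too small); since a token always consumes at least one byte of input, a buffer of prompt length plus one for BOS should suffice.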
Thanks for all your effort!