remove token functions with `context` args in favor of `model` #3720
Conversation
I have some recollection that we had these functions before, but I decided to remove them for some reason. However, now I don't see what could have been the reason.
Will leave this open for a day or two and if we don't see a problem, we can merge and potentially deprecate the `context` alternatives.
In #3301 I moved the tokenization functions to take a `llama_model` (a usage sketch follows the list below):
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
- `llama_token_get_text`
- `llama_token_get_score`
- `llama_token_get_type`
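A minimal usage sketch of these accessors keyed off the model; the prototypes are paraphrased from `llama.h` of this period and the helper name `inspect_token` is ours:

```c
#include "llama.h"

// Token attribute accessors keyed off the model (paraphrased from llama.h
// after this change; exact prototypes may differ between versions).
void inspect_token(const struct llama_model * model, llama_token token) {
    const char *          text  = llama_token_get_text (model, token); // token text
    float                 score = llama_token_get_score(model, token); // vocab score
    enum llama_token_type type  = llama_token_get_type (model, token); // e.g. normal/control
    (void)text; (void)score; (void)type; // silence unused-variable warnings
}
```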
Add `llama_model_token_*` variants to all the `llama_token_*` functions, taking in `model` instead of `context`.
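A sketch of what such parallel variants could have looked like; these names are hypothetical, following the suggestion only, since the PR ultimately changed the existing functions in place:

```c
// Hypothetical parallel declarations following the suggested
// llama_model_token_* naming; NOT the final API -- the PR instead changed
// the existing llama_token_* functions to take the model directly.
llama_token llama_model_token_bos(const struct llama_model * model);
llama_token llama_model_token_eos(const struct llama_model * model);
llama_token llama_model_token_nl (const struct llama_model * model);
```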
After @slaren's suggestions there are small changes to nearly all the examples and this becomes an API-breaking change - is there any extra work that should be done to accommodate that? (doc updates? / release notes? / hot topics?)
* master: (350 commits)
  speculative : ensure draft and target model vocab matches (ggerganov#3812)
  llama : correctly report GGUFv3 format (ggerganov#3818)
  simple : fix batch handling (ggerganov#3803)
  cuda : improve text-generation and batched decoding performance (ggerganov#3776)
  server : do not release slot on image input (ggerganov#3798)
  batched-bench : print params at start
  log : disable pid in log filenames
  server : add parameter -tb N, --threads-batch N (ggerganov#3584) (ggerganov#3768)
  server : do not block system prompt update (ggerganov#3767)
  sync : ggml (conv ops + cuda MSVC fixes) (ggerganov#3765)
  cmake : add missed dependencies (ggerganov#3763)
  cuda : add batched cuBLAS GEMM for faster attention (ggerganov#3749)
  Add more tokenizer tests (ggerganov#3742)
  metal : handle ggml_scale for n%4 != 0 (close ggerganov#3754)
  Revert "make : add optional CUDA_NATIVE_ARCH (ggerganov#2482)"
  issues : separate bug and enhancement template + no default title (ggerganov#3748)
  Update special token handling in conversion scripts for gpt2 derived tokenizers (ggerganov#3746)
  llama : remove token functions with `context` args in favor of `model` (ggerganov#3720)
  Fix baichuan convert script not detecing model (ggerganov#3739)
  make : add optional CUDA_NATIVE_ARCH (ggerganov#2482)
  ...
Changed the following to take in `llama_model` instead of `llama_context` (paraphrased prototypes follow the list):

- `llama_token_get_text`
- `llama_token_get_score`
- `llama_token_get_type`
- `llama_token_bos`
- `llama_token_eos`
- `llama_token_nl`
- `llama_token_prefix`
- `llama_token_middle`
- `llama_token_suffix`
- `llama_token_eot`
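A sketch of the resulting special-token prototypes, paraphrased from `llama.h` of this period (the exact declarations and comments may differ slightly; the attribute accessors shown earlier follow the same pattern):

```c
// Paraphrased prototypes after the change: every accessor now takes the
// model rather than the context (exact llama.h declarations may differ).
llama_token llama_token_bos(const struct llama_model * model); // beginning-of-sentence
llama_token llama_token_eos(const struct llama_model * model); // end-of-sentence
llama_token llama_token_nl (const struct llama_model * model); // newline

// infill (fill-in-the-middle) tokens
llama_token llama_token_prefix(const struct llama_model * model);
llama_token llama_token_middle(const struct llama_model * model);
llama_token llama_token_suffix(const struct llama_model * model);
llama_token llama_token_eot   (const struct llama_model * model);
```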
Special tokens are a property of the model, not the context (which is how it is currently expressed in `llama_token_bos` and co.). As a model is always obtainable when one has a context (via `llama_get_model`), these new variants supersede the old ones.
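For callers that only hold a context, a minimal migration sketch using `llama_get_model` as described above; the helper name `print_special_tokens` is ours:

```c
#include <stdio.h>
#include "llama.h"

// Call-site migration sketch: where a context used to be passed to the
// special-token accessors, fetch the model via llama_get_model() first.
static void print_special_tokens(struct llama_context * ctx) {
    const struct llama_model * model = llama_get_model(ctx);

    // old (removed by this PR): llama_token_bos(ctx);
    // new:
    const llama_token bos = llama_token_bos(model);
    const llama_token eos = llama_token_eos(model);

    printf("bos = %d, eos = %d\n", bos, eos);
}
```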