Init prompts are truncated to --batch-size (max 512 tokens) #1403
Prompt should not have a max size of `--batch-size`. The check

```cpp
if ((int) embd.size() >= params.n_batch) {
    break;
}
```

seems to have no particular usage and should be removed to allow prompting with more than 512 tokens. A side effect is that, with the current version, initial prompts are truncated to 512 tokens no matter the `--ctx_size` or `--batch-size` parameter.

Comments
This seems to be a misunderstanding of what the batch size means. The prompt isn't truncated; it is just processed in chunks of batch-size tokens. This is meant to improve performance, but it requires more memory, which is why it is limited to 512.
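For context, a minimal sketch of the chunked processing being described, modeled on the main example's evaluation loop of that era; `llama_eval`, `embd` (the pending tokens), `n_past`, and `params` are assumed from the surrounding program, and the exact names are illustrative:

```cpp
// Evaluate the pending token buffer in chunks of at most n_batch.
// A larger n_batch means fewer llama_eval calls but a bigger compute
// buffer, which is why common.cpp caps it at 512.
for (int i = 0; i < (int) embd.size(); i += params.n_batch) {
    int n_eval = (int) embd.size() - i;
    if (n_eval > params.n_batch) {
        n_eval = params.n_batch;  // clamp the final, partial chunk
    }
    if (llama_eval(ctx, &embd[i], n_eval, n_past, params.n_threads)) {
        fprintf(stderr, "failed to eval\n");
        return 1;
    }
    n_past += n_eval;  // every token still lands in the context
}
```

So with `--batch-size 100`, a 600-token prompt is evaluated as six chunks of 100 tokens; nothing is dropped, but each chunk costs its own forward pass.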
Removing that break does not interfere with llama_eval processing in batches of --batch-size tokens. Currently, an initial prompt of more than --batch-size tokens (maxed out at 512 in common.cpp, by the way...) gives control back to the user before the initial prompt is fully processed. You can reproduce the issue by increasing the init prompt size of the Miku.sh example above 512 tokens or by lowering its batch-size parameter below the actual init prompt token count.
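For reference, a sketch of the loop the break in question lives in, assuming `embd_inp` holds the tokenized initial prompt and `n_consumed` tracks how far into it we are (illustrative names, again modeled on the main example of that era):

```cpp
// Stage prompt tokens into the pending buffer, at most n_batch per
// pass of the outer generation loop. n_consumed persists across
// passes, so the next pass resumes where this one broke off; the
// prompt is consumed incrementally rather than truncated.
while ((int) embd_inp.size() > n_consumed) {
    embd.push_back(embd_inp[n_consumed]);
    ++n_consumed;
    if ((int) embd.size() >= params.n_batch) {
        break;  // buffer full: evaluate this chunk before staging more
    }
}
```

The staged chunk is then evaluated by the batched loop shown above before the next refill begins.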
I am not really sure if I understand what issue you are seeing. If you have a reliable way of reproducing it, please post clear step-by-step instructions.
Yes: set the batch-size of the Miku.sh example to 100. You'll get control back before the end of the initial prompt.
I am not able to reproduce this issue.
Ok, I tested it again and I get it now; I just didn't have enough patience: I forgot that each batch takes time to process... Thanks for your patience. I'll close the issue, which is not one :-)