[User] interactive mode with --multiline-input cannot accept more than around 13000 bytes of input at a time #2259
Comments
The thing is, the model evals the first ~13000 bytes automatically even though I did not return control to it. Then the model starts extending those ~13000 bytes. While it does, I press Ctrl+C to interrupt the processing, and then the next part of the pasted text (another chunk of around 13000 bytes) is automatically sent to the model and evaluated, so another Ctrl+C is needed, and so on until the whole pasted text has been processed.
Could you explain that again? When the input is longer than 13000 bytes there's an issue? What exactly does it do? Are you on Windows or a POSIX system such as Linux or macOS? There are no fixed-length buffers in the input code. 13000 bytes is, however, around 3.2k tokens; does your model support that context size?
I don't want to use a 20k context length just for chatting with my model 😂
I guess we were typing at the same time, but my typing was a bit earlier, so mine ended up as the second comment 😂
Yep, as I said, the model is going to extend each of those ~3.2k-token parts...
I'm still having trouble understanding what's happening. (You can talk to me in your native language if it helps you explain.) Are you saying it's evaluating the text before you return control? That would suggest to me that one of the lines of text ends with `\`.
Others have mentioned potential issues with `\` in pasted text.
Your understanding is right. I tried again with an article from Wired; still not working. Wait, this time there was no "/" or "\" at all, and it's working! Sorry for any inconvenience!
Sorry again 😅 the article from Wired did work.
I see one `\`.
Ah, ok. Right now the best thing to do is to put it all in one file and feed it in as the prompt with `-f`.
I didn't remove it, but it works fine 😂 Only `\` makes multiline input return control in llama.
Yeah, in that case it's not at the end of the line. I think mathematical formulas sometimes get split across new lines, which would cause the issue with the arXiv paper.
You are right: in the paper I mentioned above, a "/" appeared right before the point where it got stuck.
And it got stuck at "/". But as the instructions for multiline input say, you return control to llama using "\"! That means the bug still exists!
With multiline mode, both `/` and `\` at the end of a line will return control.
Oops, you are right! Thanks for pointing out my mistakes and for the patient explanation!
A (paying) project has taken me away from my work here, but I hope to return to it soon. My last pull requests added a … I just hate to add more command-line options. Perhaps an …
That's nice, I'll definitely try it as soon as possible! (BTW, in case you haven't seen it yet, llama.cpp recently merged the NTK method (#2054) for RoPE scaling, so now base llama without fine-tuning can handle 8k context or even more!)
I found out about this problem because I tried to have the model help me read an arXiv paper!
(Linux 5.19, Ubuntu; 16k fine-tuned model + RoPE scaling)