llama : allow gguf RoPE keys to be overridden with defaults #3240
Conversation
Force-pushed from d863550 to 76988cd
I have an older model, codellama-34b.Q6_K.gguf, and it outputs garbage after commit a5661d7. Why would that be?
Well, codellama-34b does use a different rope_freq_base, so maybe this commit did somehow break it. What is the sha256sum of your model file? There are two versions.
1002ef07afb15d208819c844f9b0e28b98b3c0c8da3df54521e727c17b45917c
I'm still looking into this, but there's something really odd about that model file: the strings for the keys all have an embedded NUL byte at the end, which is counted towards their length. I don't know of any version of the llama.cpp GGUF writer that did that.
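The oddity described above can be checked directly, since GGUF metadata keys are length-prefixed strings. Below is a minimal sketch (not part of llama.cpp; the function name `first_gguf_key` is made up for illustration) that reads the first metadata key from a GGUF file and reports whether the counted length includes an embedded NUL byte:

```python
import struct

def first_gguf_key(path):
    """Read the first metadata key from a GGUF file and report whether
    its length-prefixed string contains an embedded NUL byte."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, = struct.unpack("<I", f.read(4))
        # GGUF v2+ uses 64-bit counts/lengths; v1 used 32-bit.
        fmt, size = ("<Q", 8) if version >= 2 else ("<I", 4)
        n_tensors, = struct.unpack(fmt, f.read(size))
        n_kv, = struct.unpack(fmt, f.read(size))
        key_len, = struct.unpack(fmt, f.read(size))
        key = f.read(key_len)
        return key, b"\x00" in key
```

A well-formed file would return something like `(b"general.architecture", False)`; the broken file described above would return `True` in the second position.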
@shibe2 Do you observe the issue with a model newly converted using the latest version of the code?
LlongOrca-7B-16k works with default parameters on 569550d. I converted it again just now. Output on a short prompt matches exactly the output of my previous conversion. Also, information retrieval from a long text works. |
@klosax is this what you had in mind? It seems like the simplest way to resolve your TODO comment. I'd like to find a solution to this before I add more RoPE keys in PR #2268.
This way, the fallback value is hardcoded in llm_load_hparams and not exposed anywhere else. This shouldn't matter, because with GGUF the API user can't assume the value of these parameters before they've loaded the model.
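The fallback behavior described above can be sketched as follows. This is a hedged illustration, not llama.cpp's actual `llm_load_hparams` code: the key names follow the GGUF naming convention, and the default values shown are illustrative. Each RoPE hyperparameter is taken from the model's GGUF metadata when the key is present, and otherwise falls back to a hardcoded default that is not exposed outside the loader:

```python
# Illustrative defaults, hardcoded inside the loader (assumed values).
ROPE_DEFAULTS = {
    "llama.rope.freq_base": 10000.0,
    "llama.rope.scale_linear": 1.0,
}

def load_rope_hparams(metadata: dict) -> dict:
    """Return RoPE hyperparameters, preferring values stored in the
    model's GGUF metadata and falling back to hardcoded defaults."""
    return {key: metadata.get(key, default)
            for key, default in ROPE_DEFAULTS.items()}
```

With this shape, a model that stores its own `llama.rope.freq_base` (as codellama-34b does) gets that value, while older files missing the key silently get the defaults, which is why the API user cannot assume these values before loading the model.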