Add min_p arg to server #911
Comments
It looks like min_p isn't supported yet.
I think it's missing from llama.py sampling.
Second this! It looks very promising and I'm excited.
Pinging @abetlen, I finished this. See here: ddh0@e8e05bb. Apologies if this isn't the right way to submit a fix, I'm new to GitHub. But I've tested it and it's working as expected. Let me know if I can help in any way.
My small contribution to this great project. Ref: ggerganov/llama.cpp#3841 Closes: abetlen#911
Tested at temperature 1.6 and 3.0: the assistant remains creative and repetition issues are gone even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible. Still, I think it should be disabled by default for backward compatibility.
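For reference, here is a minimal sketch of trying those settings against the llama-cpp-python server's OpenAI-compatible completions endpoint. The localhost URL, the `min_p`/`repeat_penalty` field names, and the specific values are assumptions based on the server's request schema, not part of the original report:

```python
# Sketch: query a locally running llama-cpp-python server with min_p enabled
# at a high temperature, roughly matching the settings described above.
# Assumes the server was started with something like:
#   python -m llama_cpp.server --model ./models/your-model.gguf
import requests

payload = {
    "prompt": "Write a short story about a lighthouse keeper.",
    "max_tokens": 256,
    "temperature": 1.6,      # high temperature, as tested in the comment above
    "min_p": 0.05,           # drop tokens below 5% of the top token's probability
    "top_p": 1.0,            # leave top_p effectively disabled
    "top_k": 0,              # leave top_k effectively disabled
    "repeat_penalty": 1.05,  # lower repetition penalty, as mentioned above
}

response = requests.post("http://localhost:8000/v1/completions", json=payload)
print(response.json()["choices"][0]["text"])
```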
* Added support for min_p: My small contribution to this great project. Ref: ggerganov/llama.cpp#3841 Closes: #911
* Fix for negative temp (sample_softmax)
Hey there, I just saw that this is now implemented. How exactly do I activate min_p?
The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.
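Concretely, that means passing it as just another sampling keyword. Here's a minimal sketch against the high-level `Llama` API, assuming the `min_p` parameter added by this change; the model path and values are placeholders:

```python
from llama_cpp import Llama

# Load a local GGUF model (path is a placeholder).
llm = Llama(model_path="./models/mistral-7b-instruct.Q4_K_M.gguf")

# min_p is passed the same way as temperature, top_p, top_k, etc.
output = llm.create_completion(
    "Q: Name three planets in the solar system. A:",
    max_tokens=64,
    temperature=0.8,
    min_p=0.05,  # discard tokens below 5% of the most likely token's probability
)
print(output["choices"][0]["text"])
```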
Thank you. So in which order are sampling methods like min_p, top_p, and top_k applied?
Take a look at these lines: llama-cpp-python/llama_cpp/llama.py, lines 1160 to 1168 at commit 4026166.
The order is hardcoded at the moment. But in my opinion you don't really want or need to mix min_p with the top_p and top_k samplers; it's probably better to just disable top_p (top_p=1) etc. and just use min_p and temp. But if you want to change the order and play with it, just edit that file; it's simple.
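Following that advice, a short sketch of a min_p-only configuration. Treating top_p=1.0 and top_k=0 as pass-through values follows llama.cpp's usual conventions, but the exact values and model path here are assumptions for illustration:

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/mistral-7b-instruct.Q4_K_M.gguf")  # placeholder path

# Rely on min_p + temperature only: top_p=1.0 and top_k=0 are intended to make
# those samplers no-ops so that min_p does the filtering on its own.
output = llm.create_completion(
    "Explain what min_p sampling does in one sentence.",
    max_tokens=128,
    temperature=1.2,
    min_p=0.1,
    top_p=1.0,
    top_k=0,
)
print(output["choices"][0]["text"])
```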
Original issue: Add min_p arg to server. Related to: ggerganov/llama.cpp#3841