
Add min p arg to server #911

Closed
ArtyomZemlyak opened this issue Nov 15, 2023 · 10 comments · Fixed by #921
Labels: enhancement (New feature or request)

Comments


ArtyomZemlyak commented Nov 15, 2023

Add min p arg to server
Related to: ggerganov/llama.cpp#3841
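
For context, the min_p sampler proposed in that llama.cpp PR keeps only tokens whose probability is at least min_p times the probability of the most likely token, which is what makes higher temperatures usable. A minimal NumPy sketch of the idea (illustrative only, not the llama.cpp implementation):

import numpy as np

def min_p_filter(logits: np.ndarray, min_p: float = 0.05) -> np.ndarray:
    """Zero out tokens whose probability is below min_p * p(top token).

    Illustrative sketch of the min_p idea from ggerganov/llama.cpp#3841.
    """
    # Softmax over the raw logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Dynamic threshold: scale min_p by the top token's probability
    threshold = min_p * probs.max()

    # Drop everything below the threshold and renormalize
    probs = np.where(probs >= threshold, probs, 0.0)
    return probs / probs.sum()

# Example: with min_p=0.1, tokens below 10% of the top probability are dropped
logits = np.array([5.0, 4.5, 2.0, 0.1])
print(min_p_filter(logits, min_p=0.1))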


iandennismiller commented Nov 15, 2023

It looks like min p was added to llama-cpp-python recently:

@LorenzoBoccaccia

I think it's missing from llama.py sampling

abetlen added the enhancement (New feature or request) label Nov 15, 2023

ddh0 commented Nov 15, 2023

Second this! It looks very promising and I'm excited.


ddh0 commented Nov 16, 2023

Pinging @abetlen, I finished this. See here: ddh0@e8e05bb

Apologies if this isn't the right way to submit a fix; I'm new to GitHub. But I've tested it and it's working as expected. Let me know if I can help in any way.

tk-master added a commit to tk-master/llama-cpp-python that referenced this issue Nov 16, 2023
My small contribution to this great project.

Ref: ggerganov/llama.cpp#3841

Closes: abetlen#911

tk-master commented Nov 16, 2023

@ddh0 bad timing, I just made this PR :P #921

Testing would be appreciated.

@LorenzoBoccaccia

Tested at temperatures 1.6 and 3.0: the assistant remains creative and repetition issues are gone, even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible.

Still, I think it should be disabled by default for backwards compatibility.
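
For anyone reproducing this kind of test once #921 lands: a rough sketch of a completion request to the llama-cpp-python server with min_p passed next to the other sampling fields (host, port, field defaults, and values here are illustrative assumptions):

import requests

# Sketch only: assumes the server accepts min_p in the completion request body
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Once upon a time",
        "max_tokens": 64,
        "temperature": 1.6,      # high temperature, as in the test above
        "min_p": 0.05,           # small min_p keeps the output coherent
        "repeat_penalty": 1.05,  # lower repetition penalty, as noted above
    },
)
print(resp.json()["choices"][0]["text"])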

abetlen pushed a commit that referenced this issue Nov 21, 2023
* Added support for min_p

My small contribution to this great project.

Ref: ggerganov/llama.cpp#3841

Closes: #911

* Fix for negative temp (sample_softmax)
@m-from-space

Tested at temperatures 1.6 and 3.0: the assistant remains creative and repetition issues are gone, even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible.

Still, I think it should be disabled by default for backwards compatibility.

Hey there, I just saw that this is now implemented. How exactly do I activate min_p sampling when calling the model? Is it activated by default?

@tk-master

Tested at temperatures 1.6 and 3.0: the assistant remains creative and repetition issues are gone, even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible.
Still, I think it should be disabled by default for backwards compatibility.

Hey there, I just saw that this is now implemented. How exactly do I activate min_p sampling when calling the model? Is it activated by default?

The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.
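
In code that looks roughly like the following; the model path and values are illustrative, and min_p is assumed to default to the small non-zero value mentioned above (set it to 0.0 to turn it off):

from llama_cpp import Llama

# Sketch: pass min_p like any other sampling parameter (illustrative values)
llm = Llama(model_path="./models/model.gguf")  # hypothetical model path

output = llm.create_completion(
    "Q: Name the planets in the solar system. A: ",
    max_tokens=64,
    temperature=1.2,
    min_p=0.05,  # the small default mentioned above; 0.0 disables it
)
print(output["choices"][0]["text"])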

@m-from-space

The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.

Thank you. So in which order are sampling methods like min_p, top_k and top_p executed? Can I change that order?

@tk-master

The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.

Thank you. So in which order are sampling methods like min_p, top_k and top_p executed? Can I change that order?

Take a look at these lines here:

# Samplers are applied to the candidate tokens in this fixed order:
self._ctx.sample_top_k(candidates=self._candidates, k=top_k, min_keep=1)
self._ctx.sample_tail_free(candidates=self._candidates, z=tfs_z, min_keep=1)
self._ctx.sample_typical(
    candidates=self._candidates, p=typical_p, min_keep=1
)
self._ctx.sample_top_p(candidates=self._candidates, p=top_p, min_keep=1)
self._ctx.sample_min_p(candidates=self._candidates, p=min_p, min_keep=1)
self._ctx.sample_temp(candidates=self._candidates, temp=temp)
# Finally a token is drawn from whatever candidates survived the filters above
id = self._ctx.sample_token(candidates=self._candidates)

The order is hardcoded at the moment, but in my opinion you don't really want or need to mix min_p with the top_p and top_k samplers; it's probably better to just disable top_p (top_p=1) etc. and use only min_p and temperature.
But if you want to change the order and experiment with it, just edit that file; it's simple.
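
As an illustration of that suggestion, a sketch that effectively turns the other samplers off and relies on min_p plus temperature (model path and values are illustrative; in llama.cpp's semantics top_p=1.0 keeps all tokens and a top_k of 0 means no top_k truncation):

from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf")  # hypothetical model path

# Sketch: disable top_p/top_k and let min_p + temperature do the work
output = llm.create_completion(
    "Write a short story about a lighthouse keeper.",
    max_tokens=128,
    temperature=1.6,  # higher temperature stays coherent with min_p
    min_p=0.05,       # keep tokens with at least 5% of the top token's probability
    top_p=1.0,        # keeps everything, i.e. top_p is effectively off
    top_k=0,          # 0 falls back to the full vocabulary (no truncation)
)
print(output["choices"][0]["text"])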
