
Add min p arg to server #911

Closed
ArtyomZemlyak opened this issue Nov 15, 2023 · 10 comments · Fixed by #921
Labels: enhancement (New feature or request)

Comments


ArtyomZemlyak commented Nov 15, 2023

Add min p arg to server
Related to: ggerganov/llama.cpp#3841
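
For context, the min_p sampler proposed in that llama.cpp PR keeps only tokens whose probability is at least min_p times the probability of the most likely token, which is what makes higher temperatures usable. A minimal NumPy sketch of the idea (illustrative only, not the llama.cpp implementation):

import numpy as np

def min_p_filter(logits: np.ndarray, min_p: float = 0.05) -> np.ndarray:
    """Zero out tokens whose probability is below min_p * p(top token).

    Illustrative sketch of the min_p idea from ggerganov/llama.cpp#3841.
    """
    # Softmax over the raw logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Dynamic threshold: scale min_p by the top token's probability
    threshold = min_p * probs.max()

    # Drop everything below the threshold and renormalize
    probs = np.where(probs >= threshold, probs, 0.0)
    return probs / probs.sum()

# Example: with min_p=0.1, tokens below 10% of the top probability are dropped
logits = np.array([5.0, 4.5, 2.0, 0.1])
print(min_p_filter(logits, min_p=0.1))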


iandennismiller commented Nov 15, 2023

It looks like min p was added to llama-cpp-python recently:

@LorenzoBoccaccia

I think it's missing from llama.py sampling

abetlen added the enhancement (New feature or request) label Nov 15, 2023

ddh0 commented Nov 15, 2023

Second this! It looks very promising and I'm excited.


ddh0 commented Nov 16, 2023

Pinging @abetlen, I finished this. See here: ddh0@e8e05bb

Apologies if this isn't the right way to submit a fix; I'm new to GitHub. But I've tested it and it's working as expected. Let me know if I can help in any way.

tk-master added a commit to tk-master/llama-cpp-python that referenced this issue Nov 16, 2023
My small contribution to this great project.

Ref: ggerganov/llama.cpp#3841

Closes: abetlen#911

tk-master commented Nov 16, 2023

@ddh0 bad timing, I just made this PR :P #921

Testing would be appreciated.

@LorenzoBoccaccia

Tested at temperatures 1.6 and 3.0: the assistant remains creative and repetition issues are gone, even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible.

Still, I think it should be disabled by default for backwards compatibility.
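
For anyone reproducing this kind of test once #921 lands: a rough sketch of a completion request to the llama-cpp-python server with min_p passed next to the other sampling fields (host, port, field defaults, and values here are illustrative assumptions):

import requests

# Sketch only: assumes the server accepts min_p in the completion request body
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Once upon a time",
        "max_tokens": 64,
        "temperature": 1.6,      # high temperature, as in the test above
        "min_p": 0.05,           # small min_p keeps the output coherent
        "repeat_penalty": 1.05,  # lower repetition penalty, as noted above
    },
)
print(resp.json()["choices"][0]["text"])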

abetlen pushed a commit that referenced this issue Nov 21, 2023
* Added support for min_p

My small contribution to this great project.

Ref: ggerganov/llama.cpp#3841

Closes: #911

* Fix for negative temp (sample_softmax)
@m-from-space

Tested at temperatures 1.6 and 3.0: the assistant remains creative and repetition issues are gone, even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible.

Still, I think it should be disabled by default for backwards compatibility.

Hey there, I just saw that this is now implemented. How exactly do I activate min_p sampling when calling the model? Is it activated by default?

@tk-master

Tested at temperatures 1.6 and 3.0: the assistant remains creative and repetition issues are gone, even with a lower repetition penalty. At 3.0 it seems to be too creative and basically ignores facts passed in the context, but remains coherent and intelligible.
Still, I think it should be disabled by default for backwards compatibility.

Hey there, I just saw that this is now implemented. How exactly do I activate min_p sampling when calling the model? Is it activated by default?

The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.
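
In code that looks roughly like the following; the model path and values are illustrative, and min_p is assumed to default to the small non-zero value mentioned above (set it to 0.0 to turn it off):

from llama_cpp import Llama

# Sketch: pass min_p like any other sampling parameter (illustrative values)
llm = Llama(model_path="./models/model.gguf")  # hypothetical model path

output = llm.create_completion(
    "Q: Name the planets in the solar system. A: ",
    max_tokens=64,
    temperature=1.2,
    min_p=0.05,  # the small default mentioned above; 0.0 disables it
)
print(output["choices"][0]["text"])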

@m-from-space

The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.

Thank you. So in which order are sampling methods like min_p, top_k and top_p executed? Can I change that order?

@tk-master

The same way you'd change the temperature, top_k, etc. It is activated with a small value by default.

Thank you. So in which order are sampling methods like min_p, top_k and top_p executed? Can I change that order?

Take a look at these lines here:

# Samplers are applied to the candidate tokens in this fixed order:
self._ctx.sample_top_k(candidates=self._candidates, k=top_k, min_keep=1)
self._ctx.sample_tail_free(candidates=self._candidates, z=tfs_z, min_keep=1)
self._ctx.sample_typical(
    candidates=self._candidates, p=typical_p, min_keep=1
)
self._ctx.sample_top_p(candidates=self._candidates, p=top_p, min_keep=1)
self._ctx.sample_min_p(candidates=self._candidates, p=min_p, min_keep=1)
self._ctx.sample_temp(candidates=self._candidates, temp=temp)
# Finally a token is drawn from whatever candidates survived the filters above
id = self._ctx.sample_token(candidates=self._candidates)

The order is hardcoded at the moment, but in my opinion you don't really want or need to mix min_p with the top_p and top_k samplers; it's probably better to just disable top_p (top_p=1) etc. and use only min_p and temperature.
But if you want to change the order and experiment with it, just edit that file; it's simple.
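
As an illustration of that suggestion, a sketch that effectively turns the other samplers off and relies on min_p plus temperature (model path and values are illustrative; in llama.cpp's semantics top_p=1.0 keeps all tokens and a top_k of 0 means no top_k truncation):

from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf")  # hypothetical model path

# Sketch: disable top_p/top_k and let min_p + temperature do the work
output = llm.create_completion(
    "Write a short story about a lighthouse keeper.",
    max_tokens=128,
    temperature=1.6,  # higher temperature stays coherent with min_p
    min_p=0.05,       # keep tokens with at least 5% of the top token's probability
    top_p=1.0,        # keeps everything, i.e. top_p is effectively off
    top_k=0,          # 0 falls back to the full vocabulary (no truncation)
)
print(output["choices"][0]["text"])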
