Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Min P sampler implementation [alternative to Top P/Top K] #3841
Min P sampler implementation [alternative to Top P/Top K] #3841
Changes from 23 commits
59d1232
52af782
16b60dd
a3c2843
4c6744b
a4e15a3
49af767
a9e2b74
a235a0d
838d58d
69ef4ca
833637b
62fc771
49b68e8
6f7cdec
cb23358
fcbbfc1
69e638e
3ddfd67
18c0aa7
87adfad
9248325
512cac6
974640a
3b58af2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that min_p comes after all samplers, meaning that sampling would be among
top_k->tfs->typical->top_p
results, or, as we have enabled almost every sampler, amongtop_k->top_p
results. So, it is among Top K=40, among them Top P 95%, and only then Min PThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That order seems logical. Min-P is supposed to be a Top-P replacement, but if you were to use it together with other samplers it would fulfill the same role as Top-P and should have the same order. And the default being 0 makes sense since Top-P already has a default value, you don't want to combine both actively at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it probably shouldn't just be enabled by default. I don't think it was like that when I looked at it, but maybe I missed it.
Maybe add a
--disable-all-samplers
option or something? I think the default settings are aimed at generally producing decent results, probably not the case with all samplers disabled.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? To my understanding, it wouldn't be misleading to apply a Min-P on the top logits that combined make a certain procentage since Min-P takes into consideration the procentage of the first (highest probability) logit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it does seem like the default is currently 0.05, which probably should be changed to 0.