-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. #25524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
We continue our previous discussion here following #25370 Split the encode task into two tasks: token_embed and token_classify
For online scenarios (/pooling):
|
|
I think for online API, the user should be able to pass the task in the request. |
|
Documentation preview: https://vllm--25524.org.readthedocs.build/en/25524/ |
18b0002 to
b6b2e12
Compare
Signed-off-by: wang.yuqi <noooop@126.com>
976793c to
d98bf46
Compare
Signed-off-by: wang.yuqi <noooop@126.com>
e326cde to
dd06fe1
Compare
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
|
/gemini review |
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the pooling API by splitting the encode task into more specific tasks (token_embed, token_classify) and renaming the activation parameter to use_activation for clarity. The documentation, examples, and tests are updated to reflect these changes. The changes are generally well-executed and improve the API's usability. However, I've identified a critical issue in the task inference logic for the new generic /pooling endpoint that could lead to unexpected failures. My review includes a suggestion to fix this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since activation was user facing, we should probably deprecate it when changing it to use_activation
Anyway, I’d like to merge this PR quickly before the release. It’s mainly documentation changes. This feature has been a bit of a mess, and it’s been changing with every recent release. It should finally be stable now after this latest change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @hmellor , we should still keep the old field with a deprecation
|
Otherwise we would break back-compatibility, which is definitely not what a "doc PR" should do |
Signed-off-by: wang.yuqi <noooop@126.com>
|
Let’s land this. |
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
TL;DR
Improve all pooling task (0.11.1 cut)
These PRs are mostly conflicting with each other, so combining them into a series would better inform reviewers about what happened. And what else needs to be done after that?
Purpose
Following #25370
Address: #27413 (comment)
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.