[Model][2/N] Improve all pooling task | Support multi-vector retrieval #25370
try: Are you OK with this API and its outputs? There are still some broken features that need to be fixed, but the multi_vector feature is now testable.
Ready for review. (Slight) breaking change:
def encode2pooling_task(supported_tasks):
# Currently no model supports both token_embed and token_classify.
if "token_embed" in supported_tasks:
return "token_embed"
elif "token_classify" in supported_tasks:
return "token_classify"
else:
raise ValueError(f"pooling_task must be one of {supported_tasks}.")
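As a quick usage sketch of the helper above (restated here so the snippet runs standalone; the example task lists are hypothetical, not taken from any specific model):

```python
# Restates the helper above so this snippet is self-contained.
# It maps a legacy "encode" request onto the new per-token tasks;
# currently no model supports both token_embed and token_classify.
def encode2pooling_task(supported_tasks):
    if "token_embed" in supported_tasks:
        return "token_embed"
    elif "token_classify" in supported_tasks:
        return "token_classify"
    else:
        raise ValueError(f"pooling_task must be one of {supported_tasks}.")

# Hypothetical supported-task sets: an embedding model and a reward model.
print(encode2pooling_task(["embed", "token_embed"]))        # token_embed
print(encode2pooling_task(["token_classify", "classify"]))  # token_classify
```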
Hi @noooop, I just tested and it works fine. The only thing it was missing is
Please update this page https://docs.vllm.ai/en/latest/models/pooling_models.html#model-conversion to not use the encode task anymore.
I'll delay the merge of this PR until after the release so we don't have to worry about back-compatibility issues, which would further complicate future PRs.
Overall the changes in this PR look good to me but I would prefer to still keep the generic "encode" task around for uses cases that don't cleanly fit with the old and new tasks introduced in this PR. As an example, the use case introduced by this PR: #22820 can't really be described as "token_embed", "token_classify", "embed", "classify" or "score".
This pull request has merge conflicts that must be resolved before it can be merged.
Now only mypy checks whether pooling_task belongs to PoolingTask, so pooling_task in the encode API can accept any str. In fact, we basically already support custom pooling tasks and poolers: a user just has to implement an OOT model with a pooler and use the corresponding pooling_task in encode. LOL. Should we move towards allowing users to use pooling task plugins?
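To make the plugin idea concrete, here is a minimal, hypothetical sketch of a pooling-task registry (this is not vLLM's actual API; the names `register_pooling_task`, `resolve_pooling_task`, and the `mean_embed` task are invented for illustration):

```python
# Hypothetical pooling-task plugin registry: an out-of-tree (OOT)
# plugin registers a custom task name with a pooler callable, and the
# encode path resolves the task by name instead of a fixed Literal.
_POOLING_TASK_REGISTRY = {}

def register_pooling_task(name, pooler):
    if name in _POOLING_TASK_REGISTRY:
        raise ValueError(f"pooling task {name!r} already registered")
    _POOLING_TASK_REGISTRY[name] = pooler

def resolve_pooling_task(name):
    try:
        return _POOLING_TASK_REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"pooling_task must be one of {sorted(_POOLING_TASK_REGISTRY)}")

# An OOT plugin might register e.g. a mean pooler over token vectors.
register_pooling_task("mean_embed", lambda vecs: [
    sum(col) / len(vecs) for col in zip(*vecs)])

pooler = resolve_pooling_task("mean_embed")
print(pooler([[1.0, 3.0], [3.0, 5.0]]))  # [2.0, 4.0]
```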
It seems this PR can pass the test, but I don't know why.
Maybe you can try pip installing the specific version of terratorch.
Signed-off-by: wang.yuqi <noooop@126.com>
It seems I can only rely on CI.
Probably this is happening because on CI we use the compiled list of dependencies, rather than manually pip installing the various packages? Have you tried installing all dependencies from here: https://github.com/vllm-project/vllm/blob/main/requirements/test.in We are in the process of releasing a new version of TerraTorch; then we will be able to just pip install from PyPI rather than a specific tag from git. I will post a PR to fix that in vLLM soon.
By the way, this PR splits the encode task into two tasks: token_embed and token_classify.
prithvi_mae and test_io_processor_plugins are not covered by "token_embed", "token_classify", "embed", "classify" or "score"; we no longer have the catch-all encode task that absorbed all the others. @christian-pinto What do you think about a plugin pooling task?
Are there any more modifications needed for this PR?
As long as the test passes then I'm fine with it.
When it comes to the IO processor plugins, the type of pooling activation function depends on the combination of model and plugin. As an example, for PrithviMAE we instantiate a ... However, I agree with @maxdebayser regarding keeping the generic encode task around.
(You forgot to approve it.)
Breaking change
Improve all pooling task
These PRs mostly conflict with each other, so combining them into a series better informs reviewers about what happened and what else needs to be done afterwards.
Purpose
After #21227 landed, we hope that pooling models can always use ALL pooling, so users no longer need to enable it manually.
The current encode API (/pooling API) mainly targets the classify-for-each-token scenario (e.g. TokenClassification #24872 and reward models) and overlooked the embed-for-each-token scenario.
Let's support the embed-for-each-token scenario (multi-vector retrieval).
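For illustration, here is a standalone sketch of the kind of scoring multi-vector retrieval enables: ColBERT-style MaxSim over per-token embeddings. The vectors and function names below are made up for the example; this is not vLLM API, just the math the token_embed output feeds into.

```python
# ColBERT-style MaxSim: the model emits one embedding per token
# ("embed for each token"); a query/document score sums, over query
# token vectors, the maximum similarity to any document token vector.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Sum over query tokens of the max dot product with any doc token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy 2-D per-token embeddings (normally these come from the model).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 1.0], [0.0, 0.9]]

print(maxsim_score(query, doc_a))  # 1.5  (1.0 for q1 + 0.5 for q2)
print(maxsim_score(query, doc_b))  # 1.0  (0.0 for q1 + 1.0 for q2)
```

With single-vector embedding ("embed"), each document collapses to one vector before scoring; multi-vector retrieval keeps all token vectors and defers the reduction to query time, which is what the new token_embed task exposes.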
Partially fixes #25165
We are stepping closer to supporting ColBERT & ColPali.
cc @DarkLight1337 @maxdebayser
Test Plan
tests/models/language/pooling/test_multi_vector_retrieval.py
tests/test_pooling_params.py
Test Result
pass
Known Issues
Essential Elements of an Effective PR Description Checklist
Update supported_models.md and examples for a new model.