[Frontend][4/N] Improve all pooling task | Add plugin pooling task #26973
Signed-off-by: wang.yuqi <noooop@126.com>
Documentation preview: https://vllm--26973.org.readthedocs.build/en/26973/
Code Review
This pull request correctly introduces a new 'plugin' pooling task, which is a valuable addition for models that utilize custom IO processors for pooling, such as Terratorch. The changes are well-implemented across the codebase, including updates to examples and tests. However, I have identified one critical bug in the new DummyPooler implementation that must be fixed.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: wang.yuqi <noooop@126.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
We need to change at least the name to avoid confusion with the previous encode task.
@christian-pinto @maxdebayser @DarkLight1337 After careful consideration, we still need a plugin pooling task, which uses io_processor.parse_request to verify inputs, skipping the verification of PoolingParams.
Signed-off-by: wang.yuqi <noooop@126.com>
The request parsing is done in both offline and online mode; it's meant to verify that the input is appropriate for the plugin. I am fine with either method, to be honest. At the moment, only for the online case, the Pooling parameters are created in the I can take care of this change and have the plugin generate or validate the pooling parameters.
Sorry, I can't run prithvi_geospatial_mae locally, so please help fix any errors in the examples and anywhere else. Would it be more convenient if I invited you to collaborate on this PR? The devil is in the details; perhaps when we fix these details, we may discover bigger problems.
Sure, invite me to this PR. I will only be able to start working on it on Monday, though. Right now I have another task that I need to finish.
You're right, sorry I misremembered this (was afk when I sent the previous message so I couldn't do code search to check 😓 )
I think this extra task makes sense. Should we also add
IOProcessor uses a data-in, data-out model, where the data can be anything. Do we still need a separate extra_kwargs, or can extra_kwargs be part of the data? The user can actually use any data format.
@christian-pinto PTAL #27063 #27066. I hope io_processor_plugins can support binary responses; that will definitely be more efficient than base64.
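To make the efficiency point concrete, here is a minimal sketch using only the Python standard library. It does not touch any vLLM API; the 1,000 float32 values are an arbitrary stand-in for a pooled output. It compares the size of the raw bytes with the size of the same bytes after base64 encoding.

```python
import base64
import struct

# Arbitrary stand-in for a pooled output: 1,000 float32 values.
values = [float(i) for i in range(1000)]

# Raw little-endian float32 payload, as a binary response could carry it.
raw = struct.pack(f"<{len(values)}f", *values)

# The same payload as base64 text, as a JSON response would carry it.
encoded = base64.b64encode(raw)

raw_size = len(raw)        # 4 bytes per float32
b64_size = len(encoded)    # 4 output characters per 3 input bytes, padded
overhead = b64_size / raw_size

# Decoding recovers the original values in both cases.
decoded = list(struct.unpack(f"<{len(values)}f", base64.b64decode(encoded)))

print(raw_size, b64_size, round(overhead, 3))
```

Base64 always emits 4 characters for every 3 input bytes, so the encoded payload is about one third larger before any JSON framing is added, and the server and client each pay for an extra encode or decode pass.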
Yes, you're right.
Compression occurs after post-processing is applied in the I just want to confirm if there are any compatibility issues or what improvements are needed to make the plugin task + binary response more efficient.
Signed-off-by: wang.yuqi <noooop@126.com>
@christian-pinto #27066 (binary response) has been merged. Please help check whether it can be used together with the plugin task. It would be best to have an example showing users how to use binary responses with prithvi_geospatial_mae.
…cessor plugin Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
…ngly Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
I have added a new function to the plugins interface that can be used for validating or generating params. A few other changes were needed to make the tests pass: the new DummyPooler returns one fewer dimension in its output, and the test plugin had to be changed to account for that.
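As a rough illustration of a "validate or generate" hook, here is a minimal sketch. The class names, the method name `validate_or_generate_params`, and the `task` field are hypothetical and chosen for this example only; the actual interface is the one in this PR's diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyParams:
    """Hypothetical stand-in for pooling parameters."""
    task: str = "plugin"


class ToyPlugin:
    """Hypothetical plugin; the method name is an assumption."""

    def validate_or_generate_params(
        self, params: Optional[ToyParams] = None
    ) -> ToyParams:
        # No params supplied: the plugin generates suitable defaults.
        if params is None:
            return ToyParams()
        # Params supplied: the plugin checks that it can work with them.
        if params.task != "plugin":
            raise ValueError(f"unsupported task for this plugin: {params.task!r}")
        return params


plugin = ToyPlugin()
generated = plugin.validate_or_generate_params()
validated = plugin.validate_or_generate_params(ToyParams(task="plugin"))
```

With a single hook like this, the offline and online paths can share one code path: callers that have no parameters receive defaults, and callers that pass their own have them checked by the plugin.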
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
mypy should be happy now 🤞
@christian-pinto Thanks for your help. Please confirm that it still runs after merging #27204. PTAL:
Please update the documentation for IO Processor Plugins: https://docs.vllm.ai/en/latest/design/io_processor_plugins.html
After #27393 gets merged this one will pass fine too. Let me fix the documentation.
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
The new plugin pooling task looks great. @christian-pinto Thanks for your help! Are there any more modifications needed for this PR?
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
…llm-project#26973) Signed-off-by: wang.yuqi <noooop@126.com> Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Christian Pinto <christian.pinto@ibm.com> Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
…llm-project#26973) Signed-off-by: wang.yuqi <noooop@126.com> Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Christian Pinto <christian.pinto@ibm.com>
…llm-project#26973) Signed-off-by: wang.yuqi <noooop@126.com> Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Christian Pinto <christian.pinto@ibm.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Improve all pooling task
These PRs mostly conflict with each other, so combining them into a series better informs reviewers about what has happened and what still needs to be done afterwards.
Purpose
The plugin task uses io_processor.parse_request to verify inputs, skipping PoolingParams verification.
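The dispatch described above can be sketched as follows. This is a toy model, not vLLM code: `ToyPoolingParams`, `ToyIOProcessor`, and `validate_request` are hypothetical names, and only the method name `parse_request` comes from the PR text.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyPoolingParams:
    """Hypothetical stand-in; not vLLM's PoolingParams."""
    task: str
    dimensions: Optional[int] = None

    def verify(self) -> None:
        # Example of a check that only makes sense for built-in tasks.
        if self.dimensions is not None and self.dimensions <= 0:
            raise ValueError("dimensions must be positive")


class ToyIOProcessor:
    """Hypothetical plugin that knows what a valid request looks like."""

    def parse_request(self, request: Any) -> dict:
        if not isinstance(request, dict) or "data" not in request:
            raise ValueError("plugin request must be a dict with a 'data' key")
        return request


def validate_request(task: str, request: Any, params: ToyPoolingParams,
                     io_processor: ToyIOProcessor) -> str:
    """Return which component validated the request."""
    if task == "plugin":
        # Plugin task: the IO processor owns input validation, and the
        # generic parameter checks are skipped.
        io_processor.parse_request(request)
        return "io_processor"
    # Every other pooling task goes through the generic parameter checks.
    params.verify()
    return "pooling_params"


checked_by = validate_request(
    "plugin", {"data": [1, 2, 3]},
    ToyPoolingParams(task="plugin", dimensions=-1), ToyIOProcessor(),
)
```

In this sketch the same invalid `dimensions=-1` is accepted for the plugin task, because the plugin decides what is valid, while any other task would reject it in `verify`.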
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.