[V0 Deprecation] Remove vllm.worker and update according imports
#25901
Conversation
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Code Review
This pull request effectively removes the deprecated vllm.worker directory and migrates its functionality to vllm.v1, successfully deprecating the V0 worker implementation. The changes are clean and well-executed. Key improvements include:
- Consolidating worker base classes into `vllm/v1/worker/worker_base.py`.
- Removing the insecure and deprecated usage of `cloudpickle` for dynamic worker class loading, replacing it with a clear error for unsupported configurations.
- Simplifying the worker selection logic in platform-specific files (`cuda.py`, `rocm.py`) to always use the V1 worker.
- Updating all relevant imports across the codebase to point to the new `vllm.v1.worker` path.
The refactoring makes the codebase cleaner, more secure, and easier to maintain by removing the old V0 logic. I have reviewed the changes and found no issues. The implementation is solid.
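To make the platform-side simplification concrete, here is a minimal sketch of what "always use the V1 worker" can look like. This is an illustration only: the class name, method name, and worker class path below are assumptions, not the actual `cuda.py`/`rocm.py` code from this PR.

```python
# Hypothetical sketch of post-PR worker selection on a platform class.
# Names and paths are assumptions for illustration, not the real diff.
class CudaPlatformSketch:
    @classmethod
    def get_worker_cls(cls) -> str:
        # With V0 removed there is no longer a V0-vs-V1 branch:
        # the platform unconditionally points at the V1 worker.
        return "vllm.v1.worker.gpu_worker.Worker"
```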
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Will wait to trigger CI to make sure there aren't any failures.
```python
def execute_model(
    self,
    execute_model_req: Optional[ExecuteModelRequest] = None
) -> Optional[list[SamplerOutput]]:
    raise NotImplementedError
```
FYI, this method is also outdated. V1 doesn't use `ExecuteModelRequest` at all.
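For contrast, a rough sketch of what a V1-style `execute_model` entry point looks like without `ExecuteModelRequest`. The parameter and return type names below are assumptions based on this comment, not taken from this PR's diff.

```python
from typing import Optional

# Hedged sketch: in V1 the worker is driven by scheduler output rather than
# an ExecuteModelRequest; the exact type names here are assumptions.
class V1WorkerSketch:
    def execute_model(
        self,
        scheduler_output: "SchedulerOutput",   # assumed V1 input type
    ) -> Optional["ModelRunnerOutput"]:        # assumed V1 output type
        raise NotImplementedError
```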
…25901) Signed-off-by: yewentao256 <zhyanwentao@126.com>
…llm-project#25901) Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
@aarnphm better to import the v1 worker here to keep compatibility? people might still import

Are you suggesting a small shim? afaict in v0.11.0 we announce that we completely remove v0?
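If such a shim were added (it is not part of this PR), it could be as small as re-exporting the V1 classes under the old path with a deprecation warning. The module below is hypothetical, assuming the consolidated `vllm.v1.worker.worker_base` exposes `WorkerBase` and `WorkerWrapperBase`.

```python
# Hypothetical contents of a vllm/worker/worker_base.py compatibility shim.
# This module does not exist after this PR; it only illustrates the idea.
import warnings

from vllm.v1.worker.worker_base import WorkerBase, WorkerWrapperBase  # noqa: F401

warnings.warn(
    "vllm.worker.worker_base is deprecated; import from "
    "vllm.v1.worker.worker_base instead.",
    DeprecationWarning,
    stacklevel=2,
)
```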
Related to:
- vllm-project/vllm#25901
- vllm-project/vllm#25345

Now we first try to import `WorkerWrapperBase` from `vllm.worker.worker_base`; if we have an error, we append `v1` there. For the `compute_logits` patch, we can just remove the import of `SamplingMetadata`, create a wrapper that accepts any arguments with `*args, **kwargs`, and pass them through to the original method, so that it can be more flexible and future-proof.

Signed-off-by: Hollow Man <hollowman@opensuse.org>
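A minimal sketch of the compatibility pattern this commit message describes, assuming a model object whose `compute_logits` method is being patched; the helper name `patch_compute_logits` is made up for illustration.

```python
import functools

# Version-tolerant import: try the old V0 path first, then the V1 path
# introduced by this PR.
try:
    from vllm.worker.worker_base import WorkerWrapperBase  # vLLM with V0 workers
except ImportError:
    from vllm.v1.worker.worker_base import WorkerWrapperBase  # vLLM after V0 removal


def patch_compute_logits(model):
    """Wrap compute_logits so it no longer depends on SamplingMetadata:
    accept any arguments and pass them straight through to the original."""
    original = model.compute_logits

    @functools.wraps(original)
    def compute_logits(*args, **kwargs):
        return original(*args, **kwargs)

    model.compute_logits = compute_logits
    return model
```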
…llm-project#25901) Signed-off-by: xuebwang-amd <xuebwang@amd.com>
This PR removes the `vllm/worker` directory and migrates the relevant classes to `vllm/v1`.

Some behaviour changes include:
- `init_worker` (cc @russellb)

I haven't touched the ray executor part yet.
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>