
Conversation


@mgoin mgoin commented Apr 4, 2025

Before this PR, running a model with pipeline parallelism and default args results in an error, because the default "mp" distributed executor is not supported for PP in V1.

vllm serve Qwen/Qwen2-1.5B-Instruct -pp 2
...
INFO 04-04 11:36:58 [config.py:1591] Defaulting to use mp for distributed inference
...
ERROR 04-04 11:37:05 [core.py:390] EngineCore hit an exception: Traceback (most recent call last):
ERROR 04-04 11:37:05 [core.py:390]   File "/home/mgoin/code/vllm/vllm/v1/engine/core.py", line 378, in run_engine_core
ERROR 04-04 11:37:05 [core.py:390]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 04-04 11:37:05 [core.py:390]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-04 11:37:05 [core.py:390]   File "/home/mgoin/code/vllm/vllm/v1/engine/core.py", line 319, in __init__
ERROR 04-04 11:37:05 [core.py:390]     super().__init__(vllm_config, executor_class, log_stats)
ERROR 04-04 11:37:05 [core.py:390]   File "/home/mgoin/code/vllm/vllm/v1/engine/core.py", line 67, in __init__
ERROR 04-04 11:37:05 [core.py:390]     self.model_executor = executor_class(vllm_config)
ERROR 04-04 11:37:05 [core.py:390]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-04 11:37:05 [core.py:390]   File "/home/mgoin/code/vllm/vllm/executor/executor_base.py", line 52, in __init__
ERROR 04-04 11:37:05 [core.py:390]     self._init_executor()
ERROR 04-04 11:37:05 [core.py:390]   File "/home/mgoin/code/vllm/vllm/v1/executor/multiproc_executor.py", line 61, in _init_executor
ERROR 04-04 11:37:05 [core.py:390]     assert self.world_size == tensor_parallel_size, (
ERROR 04-04 11:37:05 [core.py:390]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-04 11:37:05 [core.py:390] AssertionError: world_size (2) must be equal to the tensor_parallel_size (1). Pipeline parallelism is not yet implemented in v1

Note that if you do manually specify "ray" as the executor, it works fine in V1:

vllm serve Qwen/Qwen2-1.5B-Instruct -pp 2 --distributed_executor_backend ray
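
For reference, a rough offline-inference equivalent of that workaround (a sketch using the standard LLM entrypoint, not taken from this PR):

from vllm import LLM

# Sketch only: same workaround via the Python API, assuming the usual
# LLM constructor arguments for parallelism and executor selection.
llm = LLM(
    model="Qwen/Qwen2-1.5B-Instruct",
    pipeline_parallel_size=2,
    distributed_executor_backend="ray",  # PP works in V1 when Ray is used
)
outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)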

With this PR, that default usage now falls back to V0 so we can respect the default distributed executor:

vllm serve Qwen/Qwen2-1.5B-Instruct -pp 2

WARNING 04-04 11:38:05 [arg_utils.py:1710] Pipeline Parallelism without Ray distributed executor is not supported by the V1 Engine. Falling back to V0. 
INFO 04-04 11:38:05 [config.py:1591] Defaulting to use mp for distributed inference
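
For context, this is roughly the shape of the check that produces the warning above; a minimal sketch with illustrative names, not the exact code added to arg_utils.py:

import logging

logger = logging.getLogger(__name__)

def pp_config_supported_by_v1(pipeline_parallel_size: int,
                              distributed_executor_backend: str | None) -> bool:
    # Sketch: the V1 multiproc executor only handles tensor parallelism,
    # so pipeline parallelism needs the Ray executor; any other backend
    # has to fall back to the V0 engine.
    if pipeline_parallel_size > 1 and distributed_executor_backend != "ray":
        logger.warning(
            "Pipeline Parallelism without Ray distributed executor is not "
            "supported by the V1 Engine. Falling back to V0.")
        return False
    return True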

Signed-off-by: mgoin <mgoin64@gmail.com>
@github-actions

github-actions bot commented Apr 4, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of CI tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mgoin mgoin requested a review from ruisearch42 April 4, 2025 11:46
@mgoin mgoin added bug Something isn't working v1 labels Apr 4, 2025
@mgoin mgoin changed the title Fix default behavior and fallback for pp in v1 [Bugfix] Fix default behavior and fallback for pp in v1 Apr 4, 2025
@mgoin mgoin changed the title [Bugfix] Fix default behavior and fallback for pp in v1 [Bugfix] Fix default behavior/fallback for pp in v1 Apr 4, 2025

@ruisearch42 ruisearch42 left a comment


Thanks for the fix!

@tlrmchlsmth tlrmchlsmth added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 4, 2025
@tlrmchlsmth tlrmchlsmth enabled auto-merge (squash) April 4, 2025 15:58
@tlrmchlsmth tlrmchlsmth merged commit 4708f13 into vllm-project:main Apr 4, 2025
66 checks passed
Alex4210987 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Apr 5, 2025
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>