Closed
Labels
feature request: New feature or request
Description
🚀 The feature, motivation and pitch
Can we add pipeline parallelism support for this model (resolved architecture: `Ovis2_5`)? Starting the API server currently fails during config validation:
(APIServer pid=42) INFO 08-21 10:25:00 [__init__.py:742] Resolved architecture: Ovis2_5
(APIServer pid=42) INFO 08-21 10:25:00 [__init__.py:1774] Using max model len 40960
(APIServer pid=42) INFO 08-21 10:25:01 [scheduler.py:222] Chunked prefill is enabled with max_num_batched_tokens=2048.
(APIServer pid=42) Traceback (most recent call last):
(APIServer pid=42) File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=42) sys.exit(main())
(APIServer pid=42) ^^^^^^
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 54, in main
(APIServer pid=42) args.dispatch_function(args)
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 50, in cmd
(APIServer pid=42) uvloop.run(run_server(args))
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 109, in run
(APIServer pid=42) return __asyncio.run(
(APIServer pid=42) ^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=42) return runner.run(main)
(APIServer pid=42) ^^^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=42) return self._loop.run_until_complete(task)
(APIServer pid=42) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 61, in wrapper
(APIServer pid=42) return await main
(APIServer pid=42) ^^^^^^^^^^
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1918, in run_server
(APIServer pid=42) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1938, in run_server_worker
(APIServer pid=42) async with build_async_engine_client(
(APIServer pid=42) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=42) return await anext(self.gen)
(APIServer pid=42) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 178, in build_async_engine_client
(APIServer pid=42) async with build_async_engine_client_from_engine_args(
(APIServer pid=42) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=42) return await anext(self.gen)
(APIServer pid=42) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 204, in build_async_engine_client_from_engine_args
(APIServer pid=42) vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=42) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1384, in create_engine_config
(APIServer pid=42) config = VllmConfig(
(APIServer pid=42) ^^^^^^^^^^^
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 120, in __init__
(APIServer pid=42) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 3562, in __post_init__
(APIServer pid=42) self.model_config.verify_with_parallel_config(self.parallel_config)
(APIServer pid=42) File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 1349, in verify_with_parallel_config
(APIServer pid=42) raise NotImplementedError(
(APIServer pid=42) NotImplementedError: Pipeline parallelism is not supported for this model. Supported models implement the `SupportsPP` interface.
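For readers unfamiliar with this kind of check, here is a minimal, self-contained sketch of how an interface gate like the one in the error message could work. It is not vLLM's code: `SupportsPipelineParallel`, `StageAwareModel`, `PlainModel`, and `verify_pipeline_parallel` are hypothetical names invented for illustration, and the real `SupportsPP` interface may require different members.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsPipelineParallel(Protocol):
    """Hypothetical stand-in for an interface such as `SupportsPP`."""

    def make_empty_intermediate_tensors(self, batch_size: int) -> dict:
        """Return placeholders for activations passed between stages."""
        ...


class StageAwareModel:
    """Toy model that exposes the hook the interface asks for."""

    def make_empty_intermediate_tensors(self, batch_size: int) -> dict:
        return {"hidden_states": [0.0] * batch_size}


class PlainModel:
    """Toy model with no pipeline-parallel hooks."""


def verify_pipeline_parallel(model_cls: type, pipeline_parallel_size: int) -> None:
    """Reject pipeline parallelism for classes that lack the interface."""
    # A size of 1 means a single stage, so any model class is acceptable.
    if pipeline_parallel_size <= 1:
        return
    # Structural check: the class only needs the protocol's methods.
    if not issubclass(model_cls, SupportsPipelineParallel):
        raise NotImplementedError(
            f"Pipeline parallelism is not supported for {model_cls.__name__}."
        )


if __name__ == "__main__":
    verify_pipeline_parallel(StageAwareModel, 2)  # passes silently
    try:
        verify_pipeline_parallel(PlainModel, 2)
    except NotImplementedError as exc:
        print(exc)
```

Under these assumptions, supporting the feature for a new architecture means making its model class satisfy the interface, which is what this request asks for with `Ovis2_5`.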
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.