
[Feature]: Support pipeline parallelism for AIDC-AI/Ovis2.5-9B #23355

@VivekMalipatel

Description

🚀 The feature, motivation and pitch

Can we add support for pipeline parallelism for this model?
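For reference, the traceback below reproduces with a launch command along these lines (the exact flags and pipeline size are assumptions inferred from the log, not copied from it):

```shell
# Hypothetical reproduction command: any --pipeline-parallel-size > 1
# triggers the NotImplementedError shown below, because the Ovis2_5
# architecture does not yet implement vLLM's SupportsPP interface.
vllm serve AIDC-AI/Ovis2.5-9B \
  --pipeline-parallel-size 2 \
  --max-model-len 40960
```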

(APIServer pid=42) INFO 08-21 10:25:00 [__init__.py:742] Resolved architecture: Ovis2_5
(APIServer pid=42) INFO 08-21 10:25:00 [__init__.py:1774] Using max model len 40960
(APIServer pid=42) INFO 08-21 10:25:01 [scheduler.py:222] Chunked prefill is enabled with max_num_batched_tokens=2048.
(APIServer pid=42) Traceback (most recent call last):
(APIServer pid=42)   File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=42)     sys.exit(main())
(APIServer pid=42)              ^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 54, in main
(APIServer pid=42)     args.dispatch_function(args)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 50, in cmd
(APIServer pid=42)     uvloop.run(run_server(args))
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 109, in run
(APIServer pid=42)     return __asyncio.run(
(APIServer pid=42)            ^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=42)     return runner.run(main)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=42)     return self._loop.run_until_complete(task)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 61, in wrapper
(APIServer pid=42)     return await main
(APIServer pid=42)            ^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1918, in run_server
(APIServer pid=42)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1938, in run_server_worker
(APIServer pid=42)     async with build_async_engine_client(
(APIServer pid=42)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=42)     return await anext(self.gen)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 178, in build_async_engine_client
(APIServer pid=42)     async with build_async_engine_client_from_engine_args(
(APIServer pid=42)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=42)     return await anext(self.gen)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 204, in build_async_engine_client_from_engine_args
(APIServer pid=42)     vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=42)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1384, in create_engine_config
(APIServer pid=42)     config = VllmConfig(
(APIServer pid=42)              ^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 120, in __init__
(APIServer pid=42)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 3562, in __post_init__
(APIServer pid=42)     self.model_config.verify_with_parallel_config(self.parallel_config)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 1349, in verify_with_parallel_config
(APIServer pid=42)     raise NotImplementedError(
(APIServer pid=42) NotImplementedError: Pipeline parallelism is not supported for this model. Supported models implement the `SupportsPP` interface.
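For context, the `SupportsPP` interface named in the error requires a model to own only a slice of its decoder layers on each pipeline rank and to hand intermediate tensors to the next stage. The partitioning step can be sketched dependency-free as follows; the function name and the remainder-balancing policy are illustrative assumptions, not vLLM's actual implementation:

```python
# Dependency-free sketch of the layer partitioning that pipeline-parallel
# (SupportsPP-style) models perform. Names and the balancing policy are
# illustrative assumptions, not vLLM's actual API.

def partition_layers(num_layers: int, pp_rank: int, pp_size: int) -> range:
    """Return the half-open range of layer indices owned by pp_rank.

    Remainder layers go to the later ranks, one each -- a common
    balancing choice (an assumption, not necessarily vLLM's policy).
    """
    base, rem = divmod(num_layers, pp_size)
    # Ranks with index >= pp_size - rem receive one extra layer.
    start = pp_rank * base + max(0, pp_rank - (pp_size - rem))
    size = base + (1 if pp_rank >= pp_size - rem else 0)
    return range(start, start + size)


if __name__ == "__main__":
    # e.g. 40 decoder layers over 4 pipeline stages -> 10 layers per rank
    for rank in range(4):
        print(rank, partition_layers(40, rank, 4))
```

Each rank runs only its slice of layers; all ranks except the last forward their hidden states to the next stage instead of computing logits, which is why models must opt in explicitly.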

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
