
[Feature]: Support pipeline parallelism for AIDC-AI/Ovis2.5-9B #23355

@VivekMalipatel

Description

🚀 The feature, motivation and pitch

Can we add support for pipeline parallelism for this model?
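For reference, the traceback below reproduces with a launch command along these lines (the exact flags and pipeline size are assumptions inferred from the log, not copied from it):

```shell
# Hypothetical reproduction command: any --pipeline-parallel-size > 1
# triggers the NotImplementedError shown below, because the Ovis2_5
# architecture does not yet implement vLLM's SupportsPP interface.
vllm serve AIDC-AI/Ovis2.5-9B \
  --pipeline-parallel-size 2 \
  --max-model-len 40960
```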

(APIServer pid=42) INFO 08-21 10:25:00 [__init__.py:742] Resolved architecture: Ovis2_5
(APIServer pid=42) INFO 08-21 10:25:00 [__init__.py:1774] Using max model len 40960
(APIServer pid=42) INFO 08-21 10:25:01 [scheduler.py:222] Chunked prefill is enabled with max_num_batched_tokens=2048.
(APIServer pid=42) Traceback (most recent call last):
(APIServer pid=42)   File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=42)     sys.exit(main())
(APIServer pid=42)              ^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 54, in main
(APIServer pid=42)     args.dispatch_function(args)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 50, in cmd
(APIServer pid=42)     uvloop.run(run_server(args))
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 109, in run
(APIServer pid=42)     return __asyncio.run(
(APIServer pid=42)            ^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=42)     return runner.run(main)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=42)     return self._loop.run_until_complete(task)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 61, in wrapper
(APIServer pid=42)     return await main
(APIServer pid=42)            ^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1918, in run_server
(APIServer pid=42)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1938, in run_server_worker
(APIServer pid=42)     async with build_async_engine_client(
(APIServer pid=42)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=42)     return await anext(self.gen)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 178, in build_async_engine_client
(APIServer pid=42)     async with build_async_engine_client_from_engine_args(
(APIServer pid=42)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=42)     return await anext(self.gen)
(APIServer pid=42)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 204, in build_async_engine_client_from_engine_args
(APIServer pid=42)     vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=42)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1384, in create_engine_config
(APIServer pid=42)     config = VllmConfig(
(APIServer pid=42)              ^^^^^^^^^^^
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 120, in __init__
(APIServer pid=42)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 3562, in __post_init__
(APIServer pid=42)     self.model_config.verify_with_parallel_config(self.parallel_config)
(APIServer pid=42)   File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 1349, in verify_with_parallel_config
(APIServer pid=42)     raise NotImplementedError(
(APIServer pid=42) NotImplementedError: Pipeline parallelism is not supported for this model. Supported models implement the `SupportsPP` interface.
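For context, the `SupportsPP` interface named in the error requires a model to own only a slice of its decoder layers on each pipeline rank and to hand intermediate tensors to the next stage. The partitioning step can be sketched dependency-free as follows; the function name and the remainder-balancing policy are illustrative assumptions, not vLLM's actual implementation:

```python
# Dependency-free sketch of the layer partitioning that pipeline-parallel
# (SupportsPP-style) models perform. Names and the balancing policy are
# illustrative assumptions, not vLLM's actual API.

def partition_layers(num_layers: int, pp_rank: int, pp_size: int) -> range:
    """Return the half-open range of layer indices owned by pp_rank.

    Remainder layers go to the later ranks, one each -- a common
    balancing choice (an assumption, not necessarily vLLM's policy).
    """
    base, rem = divmod(num_layers, pp_size)
    # Ranks with index >= pp_size - rem receive one extra layer.
    start = pp_rank * base + max(0, pp_rank - (pp_size - rem))
    size = base + (1 if pp_rank >= pp_size - rem else 0)
    return range(start, start + size)


if __name__ == "__main__":
    # e.g. 40 decoder layers over 4 pipeline stages -> 10 layers per rank
    for rank in range(4):
        print(rank, partition_layers(40, rank, 4))
```

Each rank runs only its slice of layers; all ranks except the last forward their hidden states to the next stage instead of computing logits, which is why models must opt in explicitly.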

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
