🚀 The feature, motivation and pitch
The existing PRs:
- [Model] enable data parallel for Llama4 vision encoder #18368
- [Model] Add option to run Step3VisionEncoder in DP #22697
- [FEAT] [Performance] Enable DP for ViT in Qwen2.5VL #22742
have clearly shown that hybrid inference (DP for the ViT, TP for the LLM) greatly reduces TTFT and significantly improves overall throughput.
There are multiple reasons why the ViT should run in DP:
- ViTs are small models, so the TP all-reduce incurs more overhead than the speedup gained from sharding the compute.
- The ViT is not captured in CUDA graphs or the torch.compile graph, so kernel launch and all-reduce overheads are higher there than in the LLM.
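To make the overhead argument above concrete, here is a toy per-image latency model (a sketch with illustrative numbers, not measurements from any of the linked PRs; all function names and values are hypothetical):

```python
def vit_time_tp(compute_ms_per_layer: float, num_layers: int,
                tp_size: int, allreduce_ms: float) -> float:
    """Per-image ViT latency under TP: each layer's matmuls are sharded
    across tp_size ranks, but every layer pays a fixed all-reduce cost."""
    return num_layers * (compute_ms_per_layer / tp_size + allreduce_ms)


def vit_time_dp(compute_ms_per_layer: float, num_layers: int) -> float:
    """Per-image ViT latency under DP: the encoder is replicated, each
    rank encodes its own images, and no collectives are needed."""
    return num_layers * compute_ms_per_layer


# Hypothetical numbers for a small 32-layer ViT: because the per-layer
# compute is tiny, the fixed all-reduce cost dominates and TP ends up
# slower per image than simply running the whole encoder on one rank.
tp_latency = vit_time_tp(compute_ms_per_layer=0.2, num_layers=32,
                         tp_size=8, allreduce_ms=0.3)   # 32 * 0.325 = 10.4 ms
dp_latency = vit_time_dp(compute_ms_per_layer=0.2, num_layers=32)  # 6.4 ms
```

On top of the latency gap, DP also multiplies encoder throughput by the number of ranks, since each rank processes different images concurrently instead of all ranks cooperating on the same one.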
Extending support to more models:
- [Model] Support dp on ViT on GLM-4.5V #23168
- [Model] Support DP for ViT on MiniCPM-V-4 #23327
- [MM Encoder] Add Encoder DP to InternVL #23876
- [MM Encoder] Add Encoder DP to Kimi-VL #23878
- [Model] Enable encoder DP for MiniCPM-V #23948
- [MM Encoder] Apply DP ViT for Qwen3-VL model series #24955
- [Model] Enable DP for ViT in Qwen2-VL #25445
Alternatives
No response
Additional context
No response