[RFC]: P/D Disaggregation Support #841

@MengqingCao

Motivation.

P/D (prefill/decode) Disaggregation plays an important role in deploying vLLM inference services in large-scale clusters. vllm-ascend already has initial P/D Disaggregation support, and we will continue to develop it with more parallel mechanisms, including tp, ep, and dp, as well as graph mode integration.

CI for the 1p1d and xpyd scenarios will be integrated step by step, both with and without the parallel mechanisms (tp, ep, dp, etc.).

Proposed Change.

P/D Disaggregation

CI Machine Preparation

UT Integration

Feature coverage matrix

P/D Disaggregation tp ep dp
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
  • Basic P/D Disaggregation without parallel mechanisms.
  • Add the parallel mechanisms listed above.
  • Add graph mode.
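
To make the scope of the coverage matrix concrete, here is a minimal sketch that enumerates candidate CI scenarios. It assumes the matrix is the cross product of the two topologies (1p1d, xpyd), every subset of the parallel mechanisms (tp, ep, dp), and an optional graph mode flag. The function name `coverage_matrix` and the dictionary keys are hypothetical and do not come from vllm-ascend; the real CI may cover only some of these combinations.

```python
from itertools import combinations, product

# Topologies and parallel mechanisms named in this RFC.
TOPOLOGIES = ["1p1d", "xpyd"]
PARALLELISM = ["tp", "ep", "dp"]


def coverage_matrix(include_graph_mode=False):
    """Enumerate hypothetical CI scenarios (illustrative only).

    Each scenario pairs a topology with a subset of parallel mechanisms.
    The empty subset stands for basic P/D Disaggregation without any
    parallel mechanism.
    """
    # All subsets of PARALLELISM, from the empty tuple up to (tp, ep, dp).
    subsets = [
        combo
        for size in range(len(PARALLELISM) + 1)
        for combo in combinations(PARALLELISM, size)
    ]
    graph_options = [False, True] if include_graph_mode else [False]
    return [
        {"topology": topo, "parallel": par, "graph_mode": graph}
        for topo, par, graph in product(TOPOLOGIES, subsets, graph_options)
    ]


if __name__ == "__main__":
    for scenario in coverage_matrix():
        mechanisms = "+".join(scenario["parallel"]) or "none"
        print(f'{scenario["topology"]}: {mechanisms}')
```

Under these assumptions the first two rollout steps cover 16 scenarios (2 topologies times 8 subsets), and adding graph mode doubles that to 32, which is one reason to integrate the CI step by step.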
