@cyber-pioneer cyber-pioneer commented Apr 18, 2025

Description: Multi-node Prefill/Decode Disaggregated Deployment with FlagCX

This PR implements support for multi-node disaggregated deployment of prefill and decode stages using xPyD Disaggregation:

  • Scheduling strategies for PD instances: robin and random are currently supported. The default is robin.
  • It introduces a new communication backend based on FlagCX and merges the FlagCX Adapter.
  • KV cache transfer is enabled via p2pConnector in vLLM.
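
For readers unfamiliar with the two scheduling strategies, here is a minimal sketch of how a router might pair each request with one prefill and one decode instance. It is not the code from this PR: the class name `PDScheduler`, the instance labels, and the reading of "robin" as round-robin are all assumptions made for illustration.

```python
import itertools
import random
from typing import List, Optional, Tuple


class PDScheduler:
    """Hypothetical router: picks one prefill and one decode instance per request."""

    def __init__(
        self,
        prefill: List[str],
        decode: List[str],
        strategy: str = "robin",  # assumed to mean round-robin; default per the PR text
        seed: Optional[int] = None,
    ) -> None:
        if not prefill or not decode:
            raise ValueError("need at least one prefill and one decode instance")
        if strategy not in ("robin", "random"):
            raise ValueError(f"unsupported strategy: {strategy}")
        self.prefill = list(prefill)
        self.decode = list(decode)
        self.strategy = strategy
        # Independent cycles, so unequal pool sizes (x != y) still rotate evenly.
        self._prefill_cycle = itertools.cycle(self.prefill)
        self._decode_cycle = itertools.cycle(self.decode)
        self._rng = random.Random(seed)

    def pick(self) -> Tuple[str, str]:
        """Return a (prefill_instance, decode_instance) pair for the next request."""
        if self.strategy == "robin":
            return next(self._prefill_cycle), next(self._decode_cycle)
        return self._rng.choice(self.prefill), self._rng.choice(self.decode)
```

With two prefill and three decode instances, successive `pick()` calls under `robin` walk both pools in order and wrap around independently, while `random` samples each pool uniformly.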

How to Use

Step 1: Install FlagCX

Step 2: Install the vLLM version from FlagScale

Step 3: Define your config files under ./examples/qwen/conf

Step 4: Launch the distributed deployment

python run.py --config-path ./examples/qwen/conf --config-name config_qwen2.5_7b_disagg_xpyd action=run

Step 5: Send requests to the deployed service

curl -X POST -s http://localhost:10001/v1/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "/models/Qwen2.5-7B-Instruct",
  "prompt": "Introduce Bruce Lee in details",
  "max_tokens": 100,
  "temperature": 0,
  "stream": true
}'
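
The same request can be sent from Python. The sketch below uses only the standard library and assumes the service streams OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`), as the vLLM OpenAI-compatible server does. The helper names `build_request`, `iter_text`, and `stream_completion` are hypothetical, and the URL and model path are copied from the curl example above.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def build_request(url: str, model: str, prompt: str, max_tokens: int = 100) -> urllib.request.Request:
    """Build a POST request with the same JSON payload as the curl example."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
        "stream": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text of each streamed chunk; assumes OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)["choices"][0]["text"]


def stream_completion(url: str, model: str, prompt: str) -> None:
    """Send the request and print generated text as it arrives."""
    with urllib.request.urlopen(build_request(url, model, prompt)) as resp:
        for piece in iter_text(resp):
            print(piece, end="", flush=True)
```

Once the deployment is running, `stream_completion("http://localhost:10001/v1/completions", "/models/Qwen2.5-7B-Instruct", "Introduce Bruce Lee in details")` should print the generated text incrementally.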

@cyber-pioneer cyber-pioneer requested review from a team and aoyulong as code owners April 18, 2025 02:45
@cyber-pioneer cyber-pioneer changed the title [Serve] Support xPyD Disaggregation in multiple nodes [No merge][Serve] Support xPyD Disaggregation in multiple nodes Apr 18, 2025
@cyber-pioneer cyber-pioneer changed the title [No merge][Serve] Support xPyD Disaggregation in multiple nodes [Serve] Support xPyD Disaggregation in multiple nodes Apr 29, 2025
@aoyulong aoyulong left a comment

LGTM

@aoyulong aoyulong merged commit 495c0a6 into FlagOpen:main Apr 29, 2025
10 of 26 checks passed
@zhaoyinglia zhaoyinglia mentioned this pull request Apr 29, 2025
1 task
aoyulong pushed a commit that referenced this pull request Apr 30, 2025
- [ ] TODO: update #457 for v0.8.5 @cyber-pioneer