Open
Labels
enhancement (New feature or request)
Description
During inference, we see the following scenarios:
- NIXL/UCX is used for P2P communication in PD (prefill/decode) disaggregation, i.e. KV cache transfers.
- NVSHMEM is used for all-to-all communication in MoE (https://github.com/ppl-ai/pplx-kernels). However, the underlying transport is UCX.
UCCL's transport optimizations could apply to both of these scenarios.
As part of this enhancement, I would like to explore how UCCL can be integrated into the inference transport stack.
One solution could be to develop a UCX transport plugin (UCT) backed by UCCL, which would cover both P2P and all-to-all communication for inference.
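
To illustrate why a single pluggable transport could serve both cases, here is a minimal sketch. It assumes a hypothetical `Transport` interface with `send`/`recv`; none of these names are UCX, UCT, NIXL, NVSHMEM, or UCCL APIs, and `InMemoryTransport` is only a stand-in for a real network backend. The point is that an all-to-all exchange can be composed from the same point-to-point primitive used for KV cache transfers, so replacing the backend underneath would affect both paths.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class Transport(ABC):
    """Hypothetical point-to-point transport interface (not a real UCX/UCCL API)."""

    @abstractmethod
    def send(self, src: int, dst: int, payload: bytes) -> None: ...

    @abstractmethod
    def recv(self, dst: int, src: int) -> bytes: ...


class InMemoryTransport(Transport):
    """Stand-in backend; a real one would wrap the actual network transport."""

    def __init__(self) -> None:
        self._queues = defaultdict(list)  # (src, dst) -> FIFO of payloads

    def send(self, src: int, dst: int, payload: bytes) -> None:
        self._queues[(src, dst)].append(payload)

    def recv(self, dst: int, src: int) -> bytes:
        return self._queues[(src, dst)].pop(0)


def p2p_transfer(t: Transport, src: int, dst: int, kv_block: bytes) -> bytes:
    """P2P case: e.g. moving a KV cache block from a prefill to a decode worker."""
    t.send(src, dst, kv_block)
    return t.recv(dst, src)


def all_to_all(t: Transport, chunks: list[list[bytes]]) -> list[list[bytes]]:
    """All-to-all case: chunks[i][j] is what rank i sends to rank j.

    Returns out, where out[j][i] is what rank j received from rank i.
    """
    n = len(chunks)
    for i in range(n):
        for j in range(n):
            t.send(i, j, chunks[i][j])
    return [[t.recv(j, i) for i in range(n)] for j in range(n)]
```

In this sketch, swapping `InMemoryTransport` for another `Transport` subclass changes the backend for both `p2p_transfer` and `all_to_all` without touching either caller, which is the property the UCT plugin idea relies on.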
tjtanaa
