@liziyu179 liziyu179 commented Jul 31, 2025

@liziyu179 liziyu179 force-pushed the refactor_pd_with_prefill branch 2 times, most recently from e572c87 to b7966ea Compare July 31, 2025 13:13
…n situations

Signed-off-by: liziyu <liziyu16@huawei.com>
@liziyu179 liziyu179 force-pushed the refactor_pd_with_prefill branch from b7966ea to 9bff5c7 Compare August 1, 2025 01:34
@ganyi1996ppo ganyi1996ppo merged commit 92e6aa9 into vllm-project:v0.9.1-dev Aug 1, 2025
17 checks passed
liyu119 added a commit to rjg-lyh/vllm-ascend that referenced this pull request Aug 11, 2025
…nto qwen30-dev

* 'qwen30-dev' of https://github.com/rjg-lyh/vllm-ascend:
  [V0.9.1] Replace FA ops with FA_V2 to optimize perf
  [0.9.1]remove chunked_prefill_for_mla (vllm-project#2177)
  move with_prefill allreduce from cpu to npu (vllm-project#2230)
  [v0.9.1] Add release note for v0.9.1rc2 (vllm-project#2233)
  [Docs] Sync main doc to v0.9.1-dev (vllm-project#2227)
  [0.9.1] Enable external distributed dp deployments in vllm ascend(0.9.1 only) (vllm-project#2109)
  [V0.9.1][BugFix] Fix the bug in decoraotor patch (vllm-project#2199)
  [v0.9.1][Bugfix][PD] Auto-clear producer KV cache if no pull notification (vllm-project#2085)
  [BUGFIX][0.9.1] FIX ring_mla input ‘query_lens’ to cpu (vllm-project#2170)
  [0.9.1][Prefill Perf] add D2H & initRoutingQuantV2 (vllm-project#2038)
  [bugfix] add with_prefill cpu allreduce to handle D-node recomputatio… (vllm-project#2129)
@liziyu179 liziyu179 deleted the refactor_pd_with_prefill branch August 23, 2025 09:48
with_prefill_tensor = torch.tensor([with_prefill],
                                   device="cpu",
                                   dtype=torch.bool)
dist.all_reduce(with_prefill_tensor,

Has the performance impact of this been tested? Can it be recovered using the allreduce communication in num_tokens_across_dp?
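
One possible reading of the question above is that the `with_prefill` flag could travel in the same buffer as the per-rank token counts, so a single sum all-reduce yields both results. The sketch below only illustrates that idea in plain Python with a simulated all-reduce. It is not the code from this PR, and the helper names (`simulated_all_reduce_sum`, `pack_rank_buffer`, `unpack_reduced`) and the buffer layout are assumptions made for illustration.

```python
from typing import List, Tuple


def simulated_all_reduce_sum(buffers: List[List[int]]) -> List[int]:
    """Hypothetical stand-in for a SUM all-reduce: element-wise sum of every rank's buffer."""
    return [sum(column) for column in zip(*buffers)]


def pack_rank_buffer(rank: int, dp_size: int, num_tokens: int,
                     with_prefill: bool) -> List[int]:
    """Assumed layout: slots [0, dp_size) hold token counts, the last slot holds the flag."""
    buf = [0] * (dp_size + 1)
    buf[rank] = num_tokens            # only this rank's slot is non-zero
    buf[dp_size] = int(with_prefill)  # 1 if this rank has prefill work
    return buf


def unpack_reduced(reduced: List[int], dp_size: int) -> Tuple[List[int], bool]:
    """Split the reduced buffer back into token counts and a global flag."""
    num_tokens_across_dp = reduced[:dp_size]
    any_prefill = reduced[dp_size] > 0  # a non-zero sum means at least one rank set the flag
    return num_tokens_across_dp, any_prefill


# Three simulated data-parallel ranks: (num_tokens, with_prefill)
per_rank = [(7, False), (3, True), (5, False)]
dp_size = len(per_rank)
buffers = [pack_rank_buffer(r, dp_size, n, p) for r, (n, p) in enumerate(per_rank)]
reduced = simulated_all_reduce_sum(buffers)
num_tokens_across_dp, any_prefill = unpack_reduced(reduced, dp_size)
```

In this layout the flag costs one extra integer per rank and no separate collective. Whether that actually recovers the performance in question would still need to be measured on the target hardware.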

3 participants