[bugfix] add with_prefill cpu allreduce to handle D-node recomputation situations #2129
What this PR does / why we need it?
Add with-prefill CPU AllReduce to handle D-node recomputation situations.
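The idea can be illustrated with a minimal, hedged sketch. It is not the code from this PR. It assumes that "D-node recomputation" means a decode node occasionally has to run a prefill pass again (for example after a request is rescheduled), and that all ranks must then agree on a single `with_prefill` value before the forward pass. The names `RankState`, `cpu_allreduce_max`, and `sync_with_prefill` are hypothetical, and the all-reduce is simulated in plain Python rather than run over a real CPU process group.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RankState:
    """Hypothetical per-rank view of the upcoming step."""
    rank: int
    with_prefill: bool  # True if this rank has prefill work, e.g. a recomputed request


def cpu_allreduce_max(values: List[int]) -> List[int]:
    """Simulate an all-reduce with MAX: every rank receives the global maximum."""
    reduced = max(values)
    return [reduced for _ in values]


def sync_with_prefill(states: List[RankState]) -> List[bool]:
    """Return the with_prefill flag each rank sees after the simulated all-reduce.

    Assumption: if any rank needs prefill, every rank should take the
    prefill-capable path so that all ranks stay in step.
    """
    flags = [int(s.with_prefill) for s in states]
    reduced = cpu_allreduce_max(flags)
    return [bool(f) for f in reduced]


if __name__ == "__main__":
    # Rank 1 is a decode rank that has to recompute a request.
    states = [RankState(0, False), RankState(1, True), RankState(2, False)]
    print(sync_with_prefill(states))
```

In a real deployment the simulated reduction would presumably be a collective such as `torch.distributed.all_reduce` with `ReduceOp.MAX` on a CPU-backed process group; that mapping is an assumption here, not something the PR description states.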
Does this PR introduce any user-facing change?
How was this patch tested?
gsm8k
http://image.huawei.com/tiny-lts/v1/images/mdstorm/dcbc43b858db666f185d73868f7933fb_1242x502.png
livecodebench
http://image.huawei.com/tiny-lts/v1/images/mdstorm/78a2e9695c3d841870d02c840f032154_1242x502.png
vllm benchmark
http://image.huawei.com/tiny-lts/v1/images/mdstorm/a4d32f4f2d702cf89854b83ae4d58337_1242x502.png
performance
http://image.huawei.com/tiny-lts/v1/images/mdstorm/38e194a09c3c9ae902a3772f1dca6862_1609x1095.png
http://image.huawei.com/tiny-lts/v1/images/mdstorm/6ffaeb7a6e4672ebbb0b22139b4c72f4_1634x1096.png