Account for compute cost in collectives during redistribution #125

fmassa · 2025-08-29T12:42:58Z

This removes a long-standing hack to tell the solver that S(1) -> R is more expensive than S(0) -> R because of an additional data movement.
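To make the "additional data movement" concrete, here is a minimal pure-Python sketch, not code from this repository. It simulates an all-gather on dim 0 with nested lists. The helpers `shard`, `all_gather_dim0` and `fix_up_dim1` are hypothetical names introduced only for illustration.

```python
MATRIX = [[0, 1, 2, 3],
          [4, 5, 6, 7]]


def shard(matrix, dim, world_size):
    """Split a 2-D list-of-lists into world_size equal shards along dim."""
    rows, cols = len(matrix), len(matrix[0])
    if dim == 0:
        step = rows // world_size
        return [matrix[r * step:(r + 1) * step] for r in range(world_size)]
    step = cols // world_size
    return [[row[r * step:(r + 1) * step] for row in matrix]
            for r in range(world_size)]


def all_gather_dim0(shards):
    """Stack the per-rank shards along dim 0, as an all-gather on dim 0 would."""
    return [row for s in shards for row in s]


def fix_up_dim1(gathered, world_size):
    """Extra copy: re-split the gathered buffer along dim 0, then concatenate along dim 1."""
    rows = len(gathered) // world_size
    chunks = [gathered[r * rows:(r + 1) * rows] for r in range(world_size)]
    return [sum((chunk[i] for chunk in chunks), []) for i in range(rows)]


# S(0) -> R: the gathered buffer is already the full tensor.
s0_result = all_gather_dim0(shard(MATRIX, 0, 2))

# S(1) -> R: the gathered buffer has the wrong layout until it is copied.
s1_gathered = all_gather_dim0(shard(MATRIX, 1, 2))
s1_result = fix_up_dim1(s1_gathered, 2)
```

In this toy setup the S(0) case needs no work after the gather, while the S(1) case has to touch every element once more, which is the extra cost the solver should see.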

Indeed, when performing S(1) -> R, we currently perform an all-gather on dim 0 followed by a full copy of the data. This wasn't modelled properly before (we just multiplied the comm cost by an arbitrary factor of 4); it is now properly taken into account.
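As a rough illustration of the idea (a sketch under stated assumptions, not the cost model implemented in this PR), the redistribution cost can be split into a communication term and a compute term for the copy. The `Hardware` numbers, the ring all-gather formula and all function names below are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hardware:
    """Hypothetical hardware numbers, chosen only for illustration."""
    link_bandwidth_bytes_per_s: float = 50e9    # interconnect bandwidth per device
    memory_bandwidth_bytes_per_s: float = 1e12  # device memory bandwidth
    latency_s: float = 5e-6                     # fixed latency per collective


def all_gather_time(full_bytes, world_size, hw):
    """Assumed ring all-gather: each device receives (n - 1) / n of the full tensor."""
    received = full_bytes * (world_size - 1) / world_size
    return hw.latency_s + received / hw.link_bandwidth_bytes_per_s


def copy_time(full_bytes, hw):
    """A full copy reads and writes every byte of the gathered tensor once."""
    return 2 * full_bytes / hw.memory_bandwidth_bytes_per_s


def shard_to_replicate_time(full_bytes, shard_dim, world_size, hw):
    """S(shard_dim) -> R: all-gather on dim 0, plus a full copy when shard_dim != 0."""
    cost = all_gather_time(full_bytes, world_size, hw)
    if shard_dim != 0:
        cost += copy_time(full_bytes, hw)
    return cost
```

With this split, S(1) -> R is more expensive than S(0) -> R by exactly the modelled copy time, instead of by an arbitrary multiplier on the communication cost.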

We also model the all-to-all cost more accurately now. There is still a *5 scaling factor, which I added temporarily to get this merged and which needs to be improved.
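For illustration only, a simple latency-plus-bandwidth model of all-to-all with an explicit scaling factor could look like the sketch below. The formula, the default hardware numbers, the function name `all_to_all_time` and the choice to scale the whole estimate are all assumptions, not the code in this PR.

```python
def all_to_all_time(full_bytes, world_size,
                    link_bandwidth_bytes_per_s=50e9,
                    latency_s=5e-6,
                    scale=5.0):
    """Assumed all-to-all model: each device sends (n - 1) / n of its local shard.

    `scale` stands in for the temporary *5 factor; where exactly that factor
    is applied in the real cost model is an assumption here.
    """
    local_bytes = full_bytes / world_size
    sent = local_bytes * (world_size - 1) / world_size
    return scale * (latency_s + sent / link_bandwidth_bytes_per_s)
```

Keeping the factor as a named parameter rather than an inline constant makes it easy to locate and revisit when the model is improved.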

This PR subsumes #94, as we now have our own redistribution function.

Account for compute cost in collectives during redistribution

49425b1

This removes a long-standing hack to tell the solver that S(1) -> R is more expensive than S(0) -> R because of an additional data movement.

meta-cla bot added the CLA Signed label Aug 29, 2025

fmassa mentioned this pull request Sep 4, 2025

Add manual constraints in Llama3 example #130

Merged

fmassa added 4 commits September 7, 2025 05:43

Merge branch 'main' of github.com:meta-pytorch/autoparallel into fmassa/compute_cost_in_comms_v2

b3b0e79

Merge branch 'main' of github.com:meta-pytorch/autoparallel into fmassa/compute_cost_in_comms_v2

341ee30

Merge branch 'main' of github.com:meta-pytorch/autoparallel into fmassa/compute_cost_in_comms_v2

9e896a6

Add better cost model to compute part

b4ae956

fmassa marked this pull request as ready for review September 10, 2025 14:06

fmassa added 4 commits September 10, 2025 14:44

Tweak a2a cost for now

bc849f2

This needs to be improved

Switch to using alias_v2

90f47a2

Merge branch 'main' of github.com:meta-pytorch/autoparallel into fmassa/compute_cost_in_comms_v2

f685813

Adapt example with new alias policy

dca8cef

fmassa mentioned this pull request Sep 10, 2025

Account for compute cost in collectives during redistribution #94

Closed

fmassa merged commit cd27579 into main Sep 10, 2025
6 checks passed

fmassa deleted the fmassa/compute_cost_in_comms_v2 branch September 10, 2025 17:47
