Avoid nn.Linear decomposition by replacing view -> mm -> view with einsum #26
Conversation
This tries to support CP-style sharding by overcoming a limitation of DTensor. It doesn't work yet, as _mm_strategy is failing.
autoparallel/propagation_rules.py
@register_opschema_rule(torch.ops.aten.matmul.default)
def matmul_rule(mesh, op_schema):
    # from torch.distributed.tensor._ops._einsum_strategy import gen_einsum_strategies
i would have thought to use the einsum strategies here. for my education, what is the difference between einsum and mm_like in this context? cc @XilunWu
I'll have to end up using the einsum strategies, because mm_strategies fail :-)
The difference, I believe, is that mm_strategy already filters out invalid strategies, while the einsum strategies don't, since they only know the number of dimensions, not the sizes of the tensors.
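To make the distinction above concrete, here is a toy illustration (hypothetical, not DTensor's actual code): an einsum-style generator only sees dimension letters, so it proposes sharding every dim, while an mm-style generator also knows the tensor sizes and can filter out shardings that don't divide evenly across the mesh.

```python
def einsum_style_strategies(ndim, mesh_size):
    """Propose sharding each dim, plus replication; tensor sizes are unknown."""
    return ["Replicate"] + [f"Shard({d})" for d in range(ndim)]

def mm_style_strategies(shape, mesh_size):
    """Same proposals, but filtered using the known tensor sizes."""
    return ["Replicate"] + [
        f"Shard({d})" for d, size in enumerate(shape) if size % mesh_size == 0
    ]
```

For example, on a mesh of size 4, `einsum_style_strategies(2, 4)` proposes sharding both dims, while `mm_style_strategies((8, 6), 4)` drops `Shard(1)` because 6 isn't divisible by 4.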
Some things are starting to work, but we are not there yet
Before this, if we had a list of tensors, we wouldn't shard the tensors inside the list
This removes a long-standing hack to tell the solver that S(1) -> R is more expensive than S(0) -> R because of an additional data movement
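A hypothetical sketch of why S(1) -> R costs more than S(0) -> R: an allgather returns one contiguous buffer per rank, stacked along a new leading axis. For shards along dim 0, a plain reshape recovers the full tensor; for shards along dim 1, an extra transpose (i.e. a data copy) is needed to interleave the buffers. Simulated here with numpy in place of actual collectives.

```python
import numpy as np

full = np.arange(24).reshape(4, 6)
world = 2  # simulated world size

# S(0): shard along dim 0; "allgather" = stack the received buffers
shards0 = np.split(full, world, axis=0)
gathered0 = np.stack(shards0).reshape(4, 6)  # a reshape suffices

# S(1): shard along dim 1; the stacked buffers must be transposed first,
# which forces an additional data movement before the reshape
shards1 = np.split(full, world, axis=1)
gathered1 = np.stack(shards1).transpose(1, 0, 2).reshape(4, 6)

assert (gathered0 == full).all()
assert (gathered1 == full).all()
```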
Previously, if we had a tuple of tensors as an argument to a function, we wouldn't apply any sharding on it. This is split from #26, where I originally found this issue
* Support tuple of tensors in estimate_strategy_runtime_cost: previously, if we had a tuple of tensors as an argument to a function, we wouldn't apply any sharding on it. This is split from #26, where I originally found this issue
* Fix bad copy-paste
PyTorch currently decomposes any 3d-input nn.Linear (and matmul) into a sequence of view -> mm -> view operations. As a consequence, this breaks any sharding on both the batch and the sequence dimensions, because the flattening that happens doesn't preserve that sharding.
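A sketch of the decomposition versus the einsum form, using numpy as a stand-in for torch: the flattened view merges the batch and sequence dims into one, which is why sharding on either of them is lost, while the einsum form keeps both dims visible.

```python
import numpy as np

B, S, D, O = 2, 3, 4, 5
x = np.random.randn(B, S, D)
w = np.random.randn(O, D)  # nn.Linear-style weight (out_features, in_features)

# view -> mm -> view decomposition: batch and sequence dims are flattened away
decomposed = (x.reshape(B * S, D) @ w.T).reshape(B, S, O)

# single einsum: batch and sequence dims stay visible, so they remain shardable
fused = np.einsum("bsd,od->bso", x, w)

assert np.allclose(decomposed, fused)
```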
While we wait for PyTorch to avoid decomposing nn.Linear, we instead take the route of pattern-matching the nn.Linear-specific occurrences and replacing them with an einsum operator. We perform this pattern-matching replacement for both the forward and the backward pass.
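The idea of the replacement can be sketched as follows. This is a toy version (hypothetical, not the actual graph pass, which would operate on an FX graph rather than a flat list): scan a sequence of ops and collapse every view -> mm -> view triple into a single einsum op.

```python
def replace_view_mm_view(ops):
    """Collapse each view -> mm -> view triple into one einsum op."""
    out, i = [], 0
    while i < len(ops):
        if ops[i:i + 3] == ["view", "mm", "view"]:
            out.append("einsum")
            i += 3
        else:
            out.append(ops[i])
            i += 1
    return out
```

For instance, `["add", "view", "mm", "view", "relu"]` becomes `["add", "einsum", "relu"]`, while sequences that don't contain the full triple are left untouched.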
For now, the pass is disabled by default and can be enabled via a global flag. I'm leaving it disabled by default because it actually requires changing some other things, like improving the cost model as in #94, so I'm keeping the behavior the same for now while I experiment with those changes more easily.