[asynctp] Async_tp pass and ops fork + changes; Solver addition to incentivize async_tp fusable redistributions #151
Forked `torch/_inductor/fx_passes/micro_pipeline_tp.py` and `torch/distributed/_symmetric_memory/__init__.py` for fast experimentation. PRs with the changes on top of the base version, against the PyTorch repo:
pytorch/pytorch#162794
pytorch/pytorch#163068
pytorch/pytorch#163069
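
For reference, a minimal sketch of how the stock micro-pipeline TP (async-TP) pass is enabled upstream; the forked files above may add or rename knobs, and the toy model at the end is an illustrative assumption:

```python
# Minimal sketch: enabling the inductor micro-pipeline TP (async-TP) pass
# on top of symmetric memory. Run under torchrun.
import torch
import torch.distributed as dist
import torch._inductor.config as inductor_config
from torch.distributed._symmetric_memory import enable_symm_mem_for_group

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Symmetric memory backs the fused all_gather/reduce_scatter kernels.
enable_symm_mem_for_group(dist.group.WORLD.group_name)

# Ask inductor to run the micro-pipeline TP fx pass on compiled graphs.
inductor_config._micro_pipeline_tp = True

# Toy model (assumption): fusions only fire when the compiled graph
# actually contains all_gather/reduce_scatter + matmul patterns.
model = torch.nn.Linear(32, 16, device="cuda")
out = torch.compile(model)(torch.randn(64, 32, device="cuda"))
```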
2.1 matmul + reduce_scatter: `Partial -> Shard(dim)`, where `dim` is not the last dim of the matmul output (the last dim is also supported, but requires an additional restride / `.contiguous()` inside).
2.2 all_gather + matmul: `Shard(dim) -> Replicate` for argument A of the matmul, where `dim` is not A's last dim (the contraction dim that will be reduced). Both patterns are illustrated in the sketch below.
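
A hedged DTensor sketch of the two patterns above; the mesh size, shapes, and variable names are illustrative assumptions (run under torchrun with 8 ranks):

```python
# Sketch of the two fusable redistribution patterns, expressed via DTensor.
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Replicate, Shard, distribute_tensor

mesh = init_device_mesh("cuda", (8,))

# 2.1 matmul + reduce_scatter: contracting two operands sharded on the
# inner dim yields a Partial result; Partial -> Shard(dim) lowers to a
# reduce_scatter the pass can fuse with the matmul. Here dim=0, i.e. not
# the last dim of the matmul output.
a = distribute_tensor(torch.randn(64, 32), mesh, [Shard(1)])
b = distribute_tensor(torch.randn(32, 16), mesh, [Shard(0)])
partial = a @ b                                # placement: Partial
out = partial.redistribute(mesh, [Shard(0)])   # reduce_scatter on dim 0

# 2.2 all_gather + matmul: argument A goes Shard(dim) -> Replicate, which
# lowers to an all_gather fusable with the following matmul. Here dim=0,
# i.e. not A's last (contraction) dim.
a2 = distribute_tensor(torch.randn(64, 32), mesh, [Shard(0)])
b2 = distribute_tensor(torch.randn(32, 16), mesh, [Replicate()])
out2 = a2.redistribute(mesh, [Replicate()]) @ b2
```

In eager mode these just run as separate collectives; under `torch.compile` with the pass enabled, these redistribute + matmul pairs are the graph shapes the fusion targets.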