@IvanKobzarev
Contributor

@IvanKobzarev IvanKobzarev commented Sep 29, 2025

Stacked PRs:


[WIP][asynctp] Incentivize in solver asyncTP-able redistributions

IvanKobzarev added a commit that referenced this pull request Sep 29, 2025
stack-info: PR: #168, branch: IvanKobzarev/stack/2
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Sep 29, 2025
@IvanKobzarev IvanKobzarev changed the base branch from IvanKobzarev/stack/1 to main September 29, 2025 11:13
@IvanKobzarev IvanKobzarev changed the base branch from main to IvanKobzarev/stack/1 September 29, 2025 11:13
@fmassa
Contributor

fmassa commented Sep 29, 2025

Discussed in person with @IvanKobzarev, and we will be keeping this PR on hold for now.

The main reason is that this PR adds an "offset" to the cost of redistributions that can be hidden through AsyncTP, which biases the solver toward those solutions.
But the same logic could be applied in other cases (e.g., FSDP prefetching could hide almost all of the redistribution cost for parameters).
If we instead keep the cost as just the sum of comms + compute, then the solver is minimizing an upper bound on the total runtime.
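
To make the upper-bound argument concrete, here is a minimal sketch in plain Python. It is not the solver's actual cost model: `Candidate`, `serial_cost`, `overlapped_cost`, `hideable_fraction`, and all of the numbers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical sharding candidate; all fields are illustrative."""
    name: str
    compute_us: float
    comm_us: float
    # Assumed fraction of comm time that overlap (e.g. AsyncTP) could hide.
    hideable_fraction: float = 0.0


def serial_cost(c: Candidate) -> float:
    # Cost as the plain sum of comms + compute (no overlap assumed).
    return c.compute_us + c.comm_us


def overlapped_cost(c: Candidate) -> float:
    # Comms can only be hidden behind compute, so the hidden part is capped
    # by the compute time. The result is never larger than serial_cost.
    hidden = min(c.comm_us * c.hideable_fraction, c.compute_us)
    return c.compute_us + c.comm_us - hidden


candidates = [
    Candidate("asynctp_able", compute_us=100.0, comm_us=40.0, hideable_fraction=1.0),
    Candidate("plain", compute_us=100.0, comm_us=30.0),
]

best_serial = min(candidates, key=serial_cost)
best_overlapped = min(candidates, key=overlapped_cost)

print(best_serial.name, serial_cost(best_serial))
print(best_overlapped.name, overlapped_cost(best_overlapped))
```

With these made-up numbers the two cost functions pick different candidates, which is the kind of preference shift an AsyncTP-specific offset introduces. The serial sum stays a valid upper bound for every candidate, whichever overlap mechanism ends up applying at runtime.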

If we start to specialize to specific cases, I would prefer to do it in a more holistic manner, instead of doing it for asynctp only.

@IvanKobzarev IvanKobzarev changed the base branch from IvanKobzarev/stack/1 to main September 29, 2025 13:59
@IvanKobzarev IvanKobzarev changed the base branch from main to IvanKobzarev/stack/1 September 29, 2025 13:59
3 participants