@wconstab
Contributor

Turns out the previous PR #37 was not correct. It divided the wrong dim's stride.

This PR divides the stride of the dim to the left of the one being sharded, which is what really happens.

  • verified this fixes a grouped_mm striding error on deepseek enablement PR
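
For illustration, here is a minimal pure-Python sketch of why the stride to the left of the sharded dim is the one that shrinks. It assumes a contiguous (row-major) layout and even sharding; `contiguous_strides` and `shard_shape` are hypothetical helpers written for this example, not the util touched by this PR.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a contiguous tensor of `shape`."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)


def shard_shape(shape, dim, num_shards):
    """Local shape after splitting `dim` evenly across `num_shards` ranks."""
    assert shape[dim] % num_shards == 0, "this sketch assumes even sharding"
    local = list(shape)
    local[dim] //= num_shards
    return tuple(local)


global_shape = (4, 8, 16)
global_strides = contiguous_strides(global_shape)

# Shard dim 1 across 2 ranks and recompute strides for the local tensor.
local_shape = shard_shape(global_shape, dim=1, num_shards=2)
local_strides = contiguous_strides(local_shape)

print(global_shape, global_strides)  # (4, 8, 16) (128, 16, 1)
print(local_shape, local_strides)    # (4, 4, 16) (64, 16, 1)
```

The stride of the sharded dim itself (16) is unchanged, because it depends only on the sizes to its right. The stride of dim 0 drops from 128 to 64, because it is the product of the sizes to its right, which includes the sharded dim.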

Note: it worries me that we have this util at all. Why don't we just use DTensors to propagate?

@meta-cla meta-cla bot added the CLA Signed label Jul 18, 2025
@wconstab wconstab requested review from ezyang and fmassa July 18, 2025 00:13
Contributor

@fmassa fmassa left a comment

I think this looks reasonable. I have a comment in this function mentioning DTensor; I think running DTensor under fake mode should be fine as well.

@fmassa fmassa merged commit 8fbdba7 into main Jul 18, 2025
6 checks passed
@fmassa fmassa deleted the whc/fix_stride branch July 18, 2025 11:23