Conversation

@danielvegamyhre
Contributor

@danielvegamyhre danielvegamyhre commented Aug 1, 2025

Once pytorch/torchtitan#1517 lands we need to:

  • remove the temporary workaround that dynamically converts the expert weights from row-major to column-major memory layout before every grouped GEMM, and instead just assert that the weights are already column-major.
  • preserve the tensor subclass through transpose ops, so that weight.transpose(-2, -1) doesn't drop the subclass; when the subclass is lost, the override that dispatches to the fp8 grouped GEMM never runs.
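
For readers unfamiliar with the layout requirement, here is a minimal sketch of what "column-major" means in terms of strides, and of the kind of on-the-fly conversion the first bullet refers to. The helper name `is_col_major` and the shapes are assumptions made for illustration; this is not the torchao implementation, and it does not model the tensor subclass from the second bullet.

```python
import torch


def is_col_major(t: torch.Tensor) -> bool:
    """Hypothetical helper: True if the last two dims of a 2D/3D tensor are column-major."""
    if t.ndim not in (2, 3):
        raise ValueError("expected a 2D or 3D tensor")
    # Column-major: stepping along dim -2 moves one element in memory,
    # stepping along dim -1 skips a full column of size(-2) elements.
    return t.stride(-2) == 1 and t.stride(-1) == t.size(-2)


# Assumed shapes: 4 experts, K=8, N=16.
E, K, N = 4, 8, 16

# A weight stored contiguously as (E, N, K) becomes a column-major (E, K, N)
# view after transpose(-2, -1); no data is copied.
stored = torch.randn(E, N, K)
weight = stored.transpose(-2, -1)

# A row-major (E, K, N) tensor, i.e. the layout the workaround had to fix.
row_major_copy = weight.contiguous()

# Sketch of a per-call layout conversion: same shape and values,
# but the underlying memory is rewritten to be column-major.
converted = row_major_copy.transpose(-2, -1).contiguous().transpose(-2, -1)

# With the weights guaranteed column-major upstream, the conversion
# can be replaced by a plain assertion.
assert is_col_major(weight), "expert weights must be column-major"
```

The conversion copies the whole weight tensor on every call, which is why asserting the layout once it is guaranteed upstream is preferable.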

@pytorch-bot

pytorch-bot bot commented Aug 1, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2663

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Pending

As of commit ec07a5c with merge base 7dbc816:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 1, 2025
@danielvegamyhre danielvegamyhre added the topic: not user facing Use this tag if you don't want this PR to show up in release notes label Aug 1, 2025
@danielvegamyhre danielvegamyhre changed the title [MoE training] Assert expert weights are column-major [MoE training] Assert expert weights are column-major; preserve subclass with transpose Aug 1, 2025
@danielvegamyhre
Contributor Author

cc @vkuzo @drisspg this is ready for review; the associated torchtitan PR, pytorch/torchtitan#1517, has landed.

@danielvegamyhre
Contributor Author

confirmed the test failures are unrelated to this change; we are looking into the red CI jobs separately to get them fixed

@danielvegamyhre danielvegamyhre merged commit b757fb9 into main Aug 4, 2025
18 of 20 checks passed
liangel-02 pushed a commit that referenced this pull request Aug 25, 2025
…ass with transpose (#2663)

* assert B is col-major

* preserve subclass with transpose

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. moe topic: not user facing Use this tag if you don't want this PR to show up in release notes


3 participants