
Conversation

@bdhirsh (Contributor) commented Jul 10, 2025

The compute estimation in autoparallel uses torch.empty to allocate its benchmark tensors, which always yields contiguous memory and is therefore wrong when the real input has a specific (non-contiguous) striding.

This came up when trying to run llama3 with float8 quantization, because:

(1) Float8Linear layers desugar into calls to aten._scaled_mm

(2) aten._scaled_mm requires its second input to be column-major, and its meta-function assert was failing (code: https://github.com/pytorch/pytorch/blob/main/torch/_meta_registrations.py#L6453)
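
For a sense of what trips there: a tensor freshly allocated with torch.empty is row-major, so re-allocating the already-column-major weight that way fails the layout check. A simplified sketch of the property being asserted (an approximation, not the exact meta-registration code):

    import torch

    def looks_col_major(t: torch.Tensor) -> bool:
        # Simplified stand-in for the column-major check that _scaled_mm's meta
        # function applies to its second argument: dim 0 must be fastest-moving.
        return t.stride(0) == 1 and t.stride(1) >= max(t.size(0), 1)

    b = torch.empty(64, 128)          # row-major, strides (128, 1)
    print(looks_col_major(b))         # False -- the case the assert rejects

    b_col = b.t().contiguous().t()    # same shape, column-major, strides (1, 64)
    print(looks_col_major(b_col))     # True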

Repro: check out this branch (pytorch/torchtitan#1378) and run:

CONFIG_FILE="./torchtitan/models/llama3/train_configs/debug_model.toml" ./run_train.sh --model.name llama3_auto_parallel --parallelism.tensor_parallel_degree 4 --model.converters="float8"
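
The direction of the fix, roughly: allocate the estimation tensors with torch.empty_strided so they keep the original layout instead of silently becoming contiguous. A minimal sketch of the idea (alloc_like is a hypothetical helper, not the actual autoparallel code):

    import torch

    def alloc_like(t: torch.Tensor) -> torch.Tensor:
        # torch.empty(t.shape) always returns a contiguous (row-major) tensor,
        # dropping t's layout; torch.empty_strided preserves it, so a
        # column-major operand stays column-major in the benchmarked op.
        return torch.empty_strided(t.size(), t.stride(), dtype=t.dtype, device=t.device)

    w = torch.empty(64, 128).t().contiguous().t()   # column-major, strides (1, 64)
    print(torch.empty(w.shape).stride())            # (128, 1) -- layout lost
    print(alloc_like(w).stride())                   # (1, 64)  -- layout preserved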

facebook-github-bot added the CLA Signed label on Jul 10, 2025
@wconstab (Contributor) commented:
lgtm. and pushed a lint fix

@wconstab merged commit 1af2a14 into main on Jul 17, 2025
6 checks passed
@wconstab deleted the scaled_mm_fix branch on July 17, 2025 at 22:58
wconstab added a commit that referenced this pull request Jul 18, 2025
Turns out the previous PR (#37) was not correct: it divided the wrong dim's stride.

This PR divides the stride of the dim to the left of the one being sharded, which is what really happens.

Note: the fact that we need this util at all worries me. Why don't we just
use dtensors to propagate?
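
To see why it is the left neighbor's stride that changes: for a contiguous tensor, each dim's stride is the product of the sizes to its right, so shrinking the sharded dim only affects the strides of the dims to its left. A small illustration (not the util itself):

    import torch

    # Shard dim 1 of a contiguous (4, 8, 6) tensor across a hypothetical
    # world_size of 2 and compare global vs. local strides.
    full = torch.empty(4, 8, 6)
    local = torch.empty(4, 8 // 2, 6)

    print(full.stride())    # (48, 6, 1)
    print(local.stride())   # (24, 6, 1)  -- dim 0's stride is halved,
                            #               dim 1's own stride is unchanged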
fmassa pushed a commit that referenced this pull request Jul 18, 2025 (same commit message as above)