
Commit 8fbdba7

Fix stride computation formula used during compute estimation (#42)
Turns out the previous PR #37 was not correct: it divided the wrong dim's stride. This PR divides the stride of the dim to the left of the one being sharded, which is what really happens. Note: that we have this util at all worries me. Why don't we just use dtensors to propagate?
1 parent 233d68b commit 8fbdba7

File tree

1 file changed: +4 -3 lines changed


autoparallel/compute_estimation.py

Lines changed: 4 additions & 3 deletions
```diff
@@ -169,9 +169,10 @@ def _get_sharded_shape_stride(spec):
         if placement.is_shard():
             dim = placement.dim
             new_tensor_shape[dim] = (new_tensor_shape[dim] + mesh_size - 1) // mesh_size
-            new_tensor_stride[dim] = (
-                new_tensor_stride[dim] + mesh_size - 1
-            ) // mesh_size
+            if dim - 1 > 0:
+                new_tensor_stride[dim - 1] = (
+                    new_tensor_stride[dim - 1] + mesh_size - 1
+                ) // mesh_size
     return new_tensor_shape, new_tensor_stride
```