Conversation

@wconstab (Contributor):

Moves from ad-hoc, incomplete support for a couple of these ops to supporting all of the standard factory ops, with sharding.

Still needs further work to add memory constraints to encourage sharding.

@wconstab requested a review from fmassa July 19, 2025 14:21
@meta-cla bot added the CLA Signed label Jul 19, 2025
@wconstab force-pushed the whc/factory branch 2 times, most recently from 6870a7f to 304ebc6 on July 19, 2025 14:26
@fmassa left a comment:

Had a first pass. I think it is looking pretty good and close to being merged after it has been rebased.

Comment on lines +413 to +417
# TODO: there shouldn't actually be a row here, since there is no input to the op and the rows correspond
# to the inputs. However, the optimization code is not set up to tolerate input-less ops, so hack around it
# (see "/data/users/whc/autoparallel/autoparallel/optimize_sharding.py", line 226, in walk_over_options)
Contributor:

Yes, I agree with the comment and the current solution you implemented is what I would recommend doing as well.
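The workaround under discussion can be illustrated with a minimal sketch. All names and structures here are hypothetical stand-ins, not the actual autoparallel code: an optimizer-style walk that expects one row of sharding options per input breaks on input-less ops, so the hack fabricates a single dummy row mirroring the output's options.

```python
# Hypothetical sketch of the input-less-op workaround; names are
# illustrative, not the actual autoparallel implementation.

def walk_over_options(op_inputs, rows):
    # Optimizer-style walk: one row of sharding options per input.
    # With zero rows, downstream code that indexes into rows would fail.
    if not rows:
        raise ValueError("optimizer cannot handle input-less ops")
    # Toy "optimization": pick one option per row.
    return [max(row, key=len) for row in rows]

def rows_for_op(op_inputs, output_options):
    # The hack: for a factory op (no inputs), fabricate one row that
    # simply mirrors the output's sharding options, so the walk has
    # something to iterate over.
    if not op_inputs:
        return [output_options]
    return [["Shard(0)", "Replicate"] for _ in op_inputs]

rows = rows_for_op([], ["Shard(0)", "Shard(1)", "Replicate"])
```

With the fabricated row in place, `walk_over_options` runs without special-casing input-less ops.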

Comment on lines 53 to 55
# but index_put op insists on looking at 'input_specs' of its input, which seems absurd.
# so just copy it for now and fix later
strat.input_specs = (strat.output_specs,)
Contributor:

Is this still the case even after using the index_put from PyTorch main, given that we have removed our custom rule in #43?

Contributor Author:

Good point, I'll have to check; I may be able to remove this.

Contributor Author:

I am removing this for now, because it is definitely not needed for the tests/examples landed on main. If we run into it on the DS3 branch, we can revisit a better fix.

assert isinstance(dtype, torch.dtype), dtype

# TODO: ensure the solver knows that it is more expensive to Replicate factory functions than shard
# for now, put replicate last since this might encourage sharding :?
Contributor:

> replicate last since this might encourage sharding

Hmm, I'm not sure we have guarantees w.r.t. that... in any case, let's see.

Contributor Author:

Yeah, I'll make the comment clearer: ordering did appear to change the outcome locally in my experiment, but I agree it's not a stable guarantee and we should deal with this properly. I was mainly happy that I could ensure that sharding of factories was at least happening, so I could test that it worked.
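The ordering trick under discussion can be sketched as follows. This is an illustration only: placements are stand-in strings rather than real DTensor placement types, and the function name is hypothetical. The idea is that when enumerating placement options for a factory op, listing shardings first and Replicate last may nudge an order-sensitive solver toward sharding.

```python
# Illustrative sketch of "put replicate last": enumerate per-mesh-dim
# placement options with Shard(...) choices first and Replicate() last,
# then take the cross-product over mesh dims. Stand-in strings only.
import itertools

def factory_placement_options(tensor_ndim: int, mesh_ndim: int):
    per_mesh_dim = []
    for _ in range(mesh_ndim):
        # Shard over each tensor dim first, Replicate last.
        options = [f"Shard({d})" for d in range(tensor_ndim)]
        options.append("Replicate()")
        per_mesh_dim.append(options)
    # Cross-product over mesh dims; the fully-replicated option lands last.
    return list(itertools.product(*per_mesh_dim))

opts = factory_placement_options(tensor_ndim=2, mesh_ndim=1)
```

As the discussion notes, this ordering is not a stable guarantee of the solver's choice; it only biases which option is encountered first.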



@register_opschema_rule(TENSOR_FACTORY_OPS)
def factory_rule(mesh, op_schema: OpSchema) -> OpStrategy:
Contributor:

Maybe we could replace / unify this implementation with the implementation in _create_all_options https://github.com/pytorch-labs/autoparallel/blob/b53ad103b2054177db1c0ac50d0b0021a5b8bb57/autoparallel/propagation_rules.py#L119-L159 ?

Can be left for the future, just bringing this up as I think they are quite similar

Contributor Author:

I wasn't aware of these; agreed, we can refactor to remove the duplication. I can do that in a follow-up PR. A few questions:

  1. What is "_create_all_options_no_nested_sharding" intended for? It is currently unused, so I might delete it unless you have a use case in mind.
  2. "_create_all_options" is pretty close to "factory_rule". I can probably make factory_rule delegate to _create_all_options.
  3. "_create_all_options" is a bit vague, and the TODO suggests adding Partial, but factories would not want Partial. I'll think a bit more about how to factor the function.

Contributor:

About your points:

> what is "_create_all_options_no_nested_sharding" intended for? It is currently unused so i might delete it unless you have a use case in mind

I think we can remove it. It was the first implementation that I did because I was learning DTensor on the go and was using DTensorSpec.from_dim_map to generate the specs. But then I realized that it couldn't generate nested shardings like S(0)S(0), so I wrote the second function. I didn't know back then if this would be useful or not so I kept it just in case, but I think we can remove it now

> "_create_all_options" is pretty close to 'factory rule'. I can probably make factory_rule delegate to _create_all_options.

Yes, I think it would be a good thing to unify both

> _create_all_options is a bit vague and the todo suggests adding partial. but factories would not want partial. I'll think a bit more about how to factor the function

That comment is legacy from _create_all_options_no_nested_sharding which went into _create_all_options -- I don't think we want Partial support anymore, so it's basically the same thing as factory_rule
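The limitation mentioned above, that `DTensorSpec.from_dim_map` cannot generate nested shardings like S(0)S(0), can be illustrated with a small sketch. This is a stand-in representation, not the actual DTensor implementation: a dim_map has one entry per tensor dim holding the mesh dim that shards it (or -1 for none), so two mesh dims sharding the same tensor dim cannot be encoded in a single slot.

```python
# Sketch of why a dim_map cannot express nested sharding like S(0)S(0).
# dim_map[tensor_dim] = mesh dim sharding that tensor dim, or -1.
# Stand-in strings, not real DTensor placement types.

def placements_from_dim_map(dim_map, mesh_ndim):
    # Invert the tensor-dim -> mesh-dim mapping into one placement
    # per mesh dim; unmapped mesh dims stay Replicate.
    placements = ["Replicate()"] * mesh_ndim
    for tensor_dim, mesh_dim in enumerate(dim_map):
        if mesh_dim >= 0:
            placements[mesh_dim] = f"Shard({tensor_dim})"
    return placements

# [0, -1]: tensor dim 0 sharded on mesh dim 0, tensor dim 1 unsharded.
print(placements_from_dim_map([0, -1], mesh_ndim=2))

# Nested sharding (Shard(0), Shard(0)) would need tensor dim 0 mapped
# to BOTH mesh dims, which one integer per tensor dim cannot encode;
# hence the second function that enumerates placements directly.
```

This is consistent with the history described above: the dim_map-based enumeration was replaced by direct placement enumeration precisely to cover nested shardings.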

from .propagation_rules import _op_partial_rules, _op_rules, remove_invalid_configs, TENSOR_FACTORY_OPS


def propagate_tensor_meta(op, user_args, user_kwargs, out_strat):
Contributor:

Nit: as follow-up work, it might be good to disable these functions to see if we are still missing some cases in DTensor sharding propagation.

Contributor Author:

Disable which, 'propagate_tensor_meta'? I can try that in another PR.

@fmassa left a comment:

LGTM, thanks!

Let's think about unifying _create_all_options in another PR

@fmassa merged commit 6be7804 into main Jul 25, 2025
5 checks passed
@fmassa deleted the whc/factory branch July 25, 2025 08:39