
Conversation

keyprocedure
Contributor

@keyprocedure keyprocedure commented Jul 29, 2025

Summary

This is PR 2 of 3 implementing a dim order aware clone op.

This PR registers the new _clone_dim_order op and maps aten.clone ops to dim_order_ops._clone_dim_order in EXIR during export to preserve memory layout changes (contiguous/channels_last). It also updates Core ML and ARM backends to handle the new clone op.
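
For context, a minimal sketch of the layout information at stake, using plain PyTorch rather than the EXIR pass itself (channels_last on a 4D tensor corresponds to dim order (0, 2, 3, 1)):

```python
import torch

x = torch.randn(2, 3, 4, 5)                     # contiguous NCHW
y = x.clone(memory_format=torch.channels_last)  # layout-changing clone

print(x.dim_order())  # (0, 1, 2, 3) -> contiguous
print(y.dim_order())  # (0, 2, 3, 1) -> channels_last

# If this clone is dropped as a no-op during export, the contiguous ->
# channels_last conversion is silently lost; _clone_dim_order keeps it.
```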

Related PRs:

  • PR 1: #12974 - Add _clone_dim_order portable kernel
  • PR 3: #12976 - Update RemoveCloneOpsTransform to be dim order aware

Fixes #12645

Test plan

  • Operator-level tests to verify kernel behavior for layout preservation and layout changes.
  • Graph-level checks to confirm that the clone mapping occurs.
  • End-to-end tests to validate that functional clone behavior is unchanged.

All tests pass via:
python -m unittest exir.tests.test_memory_format_ops_pass
python -m unittest backends.apple.coreml.test.test_torch_ops
pytest backends/arm/test/ops/test_clone.py
pytest backends/arm/test/passes/test_remove_clone_pass.py


pytorch-bot bot commented Jul 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12971

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 6839212 with merge base 49bc664:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 29, 2025
Contributor

@Gasoonjia Gasoonjia left a comment


LGTM! Left some minor feedback about test coverage!

@@ -389,3 +390,17 @@ def test_mobilenet_v3_xnnpack(self) -> None:
rtol=1e-3,
),
)

def test_op_clone_dim_order_registration(self):
Contributor

Can you follow the approach of the other test cases (e.g., test_op_dim_order_propagation), reusing the memory_format_test_runner function and testing at least two different scenarios: one that maintains the dim order and one that converts the dim order?

Contributor Author

I can definitely add more tests; the reason I didn't here was that the kernel implementation wasn't in place for this PR. I'll work on adding additional coverage.

@keyprocedure
Contributor Author

@pytorchbot label "release notes: none"

@pytorch-bot pytorch-bot bot added the release notes: none Do not include this in the release notes label Jul 29, 2025
@keyprocedure
Contributor Author

@Gasoonjia splitting the clone_dim_order op implementation into 3 PRs was a great suggestion; it makes it much easier to isolate and debug issues.

The CI failures are mainly due to two issues:

  1. clone is now mapped to clone_dim_order, so tests expecting executorch_exir_dialects_edge__ops_aten_clone_default are failing because they now see executorch_exir_dialects_edge__ops_dim_order_ops__clone_dim_order_default.
  2. Because of the first issue, the lack of a kernel implementation for clone_dim_order in this PR is also triggering failures.

Do you have any suggestions for how we should handle the current approach of mapping clone to clone_dim_order?

@Gasoonjia
Contributor

Hi @keyprocedure, thanks for your update!

For places that check executorch_exir_dialects_edge__ops_aten_clone_default, please update them to check executorch_exir_dialects_edge__ops_dim_order_ops__clone_dim_order_default, as long as compile_config._skip_dim_order == False in their pipeline.
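
For illustration only, such a check update might look like the following (edge_program is a hypothetical handle to the lowered program in the existing test):

```python
from torch.testing import FileCheck

# Before: the test expected the plain aten clone in the edge graph.
# FileCheck().check(
#     "executorch_exir_dialects_edge__ops_aten_clone_default"
# ).run(edge_program.graph_module.code)

# After: with dim order ops enabled, expect the dim_order variant instead.
FileCheck().check(
    "executorch_exir_dialects_edge__ops_dim_order_ops__clone_dim_order_default"
).run(edge_program.graph_module.code)
```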

> the lack of a kernel implementation for clone_dim_order

Can you share more insight here? I thought you'd already created the clone_dim_order kernel on the AOT side by reusing the ATen clone op.

@keyprocedure
Contributor Author

keyprocedure commented Jul 31, 2025

@Gasoonjia during export _clone_dim_order does use ATen's clone implementation because of the Python fallback, but at runtime the graph contains _clone_dim_order and since no kernel is registered for it, it fails with:
[operator_registry.cpp:256] kernel 'dim_order_ops::_clone_dim_order.out' not found.

This is also why I placed the tests using memory_format_test_runner in the kernel PR instead of this one.

What are your thoughts?

@Gasoonjia
Contributor

@keyprocedure thanks for the detailed reply!

Let's swap the order of the PRs: first finish the runtime op, then the AOT replacement, and finally RemoveCloneOpsTransform.
That would solve the runtime operator requirement issue.

@keyprocedure
Contributor Author

> @keyprocedure thanks for the detailed reply!
>
> Let's swap the order of the PRs: first finish the runtime op, then the AOT replacement, and finally RemoveCloneOpsTransform. That would solve the runtime operator requirement issue.

That's a great idea! Should I move the tests from the runtime/kernel PR (since the new op won't be registered yet) to the aot replacement PR?

@Gasoonjia
Contributor

Gasoonjia commented Jul 31, 2025

> > @keyprocedure thanks for the detailed reply!
> > Let's swap the order of the PRs: first finish the runtime op, then the AOT replacement, and finally RemoveCloneOpsTransform. That would solve the runtime operator requirement issue.
>
> That's a great idea! Should I move the tests from the runtime/kernel PR (since the new op won't be registered yet) to the AOT replacement PR?

Yes please! More specifically:

for the runtime PR, it should only contain:

  1. the portable clone_dim_order op
  2. the operator test itself

for the AOT PR, it should contain both the graph-level check and the end-to-end test using pybindings (a sketch follows below).
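
A hedged sketch of what that pair of checks could look like; the module, helper names, and import path are illustrative assumptions based on common ExecuTorch usage, not this PR's actual test code:

```python
import torch
from executorch.exir import EdgeCompileConfig, to_edge


class CloneToChannelsLast(torch.nn.Module):
    def forward(self, x):
        return x.clone(memory_format=torch.channels_last) + 1.0


example = (torch.randn(1, 3, 8, 8),)
edge = to_edge(
    torch.export.export(CloneToChannelsLast(), example),
    compile_config=EdgeCompileConfig(_skip_dim_order=False),
)

# Graph-level check: the edge graph should contain the dim_order clone variant.
assert "_clone_dim_order" in edge.exported_program().graph_module.code

# End-to-end check via pybindings (requires the runtime kernel from PR 1).
from executorch.extension.pybindings.portable_lib import (
    _load_for_executorch_from_buffer,
)

program = edge.to_executorch()
module = _load_for_executorch_from_buffer(program.buffer)
(actual,) = module.forward(example)
torch.testing.assert_close(actual, CloneToChannelsLast()(*example))
```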

@keyprocedure keyprocedure changed the title [EXIR] Add dim order aware clone operator [EXIR] Map clone to _clone_dim_order Aug 3, 2025
@keyprocedure
Contributor Author

keyprocedure commented Aug 4, 2025

Hi @Gasoonjia

I’ve updated the PRs; the kernel PR is ready for review.

The CI failures from this PR that weren’t due to the missing _clone_dim_order kernel are all ARM-related runtime failures:

  • Expected to not find "torch.ops.higher_order.executorch_call_delegate" but found it
  • Expected to find "torch.ops.higher_order.executorch_call_delegate" but did not find it
  • Expected out tensor to have dtype signed char, but got float instead
  • Expected to find "executorch_exir_dialects_edge__ops_aten_clone_default" but did not find it

It seems like these failures share the same root cause: aten.clone being replaced by _clone_dim_order. But since the failures are spread across quite a few tests, I'm not sure which strategy we should use to address this.
Does it make sense to correct each failing test individually, or is there a way to gate the ARM backend so it still uses the aten.clone op?

Any insights or guidance would be helpful.

@Gasoonjia
Contributor

Gasoonjia commented Aug 4, 2025

@keyprocedure
Thanks for sharing the error messages!
I've left some comments on the kernel PR, please take a look.

For the issue in this PR, can you share a code pointer to the issue? Like which line is raising it?

@keyprocedure
Contributor Author

There were a total of 49 failing tests for the initial CI run, so I listed out the test and line number of each failure:

unittest-arm-backend-with-no-fvp (test_pytest_ops) / linux-job

19 failures

backends/arm/test/models/stable_diffusion/test_T5EncoderModel.py:100:
    TestT5EncoderModel.test_T5EncoderModel_tosa_BI

"AssertionError: Invalid partition, found dependency cycles"
backends/arm/test/models/stable_diffusion/test_T5EncoderModel.py:80:
    TestT5EncoderModel.test_T5EncoderModel_tosa_MI

"IndexError: list index out of range"
backends/arm/test/models/stable_diffusion/test_CLIPTextModelWithProjection.py:76:
    TestCLIPTextModelWithProjection.test_CLIPTextModelWithProjection_tosa_MI

backends/arm/test/models/stable_diffusion/test_SD3Transformer2DModel.py:128:
    TestSD3Transformer2DModel.test_SD3Transformer2DModel_tosa_BI

backends/arm/test/models/stable_diffusion/test_SD3Transformer2DModel.py:105:
    TestSD3Transformer2DModel.test_SD3Transformer2DModel_tosa_MI

backends/arm/test/models/stable_diffusion/test_vae_AutoencoderKL.py:74:
    TestAutoencoderKL.test_AutoencoderKL_tosa_BI

backends/arm/test/models/stable_diffusion/test_vae_AutoencoderKL.py:55:
    TestAutoencoderKL.test_AutoencoderKL_tosa_MI

backends/arm/test/models/test_conformer.py:60:
    test_conformer_tosa_MI

backends/arm/test/models/test_deit_tiny_arm.py:45:
    test_deit_tiny_tosa_MI

backends/arm/test/models/test_deit_tiny_arm.py:58:
    test_deit_tiny_tosa_BI

backends/arm/test/models/test_dl3_arm.py:44:
    test_dl3_tosa_MI

backends/arm/test/models/test_dl3_arm.py:57:
    test_dl3_tosa_BI

backends/arm/test/models/test_mobilenet_v2_arm.py:45:
    test_mv2_tosa_MI

backends/arm/test/models/test_mobilenet_v2_arm.py:60:
    test_mv2_tosa_BI[per_channel_quantization=true]
    test_mv2_tosa_BI[per_channel_quantization=false]

backends/arm/test/models/test_mobilenet_v2_arm.py:78:
    test_mv2_u55_BI[per_channel_quantization=true]
    test_mv2_u55_BI[per_channel_quantization=false]

backends/arm/test/models/test_mobilenet_v2_arm.py:96:
    test_mv2_u85_BI[per_channel_quantization=true]
    test_mv2_u85_BI[per_channel_quantization=false]

"RuntimeError: Expected to not find "torch.ops.higher_order.executorch_call_delegate" but found it"

The CI log also shows multiple TOSA partitioning rejections due to _clone_dim_order not being in BaseTOSASupportList.


unittest-arm-backend-with-no-fvp (test_pytest_models) / linux-job

30 failures

backends/arm/test/ops/test_alias_copy.py:63:
    test_alias_tosa_BI[1d_ramp]
    test_alias_tosa_BI[2d_ones]
    test_alias_tosa_BI[3d_rand]
    test_alias_tosa_BI[4d_zeros]

"RuntimeError: Expected out tensor to have dtype signed char, but got float instead"
backends/arm/test/ops/test_clone.py:58:
    test_clone_tosa_MI[ones_1D_10]
    test_clone_tosa_MI[ones_1D_50]
    test_clone_tosa_MI[rand_1D_20]
    test_clone_tosa_MI[rand_2D_10x10]
    test_clone_tosa_MI[rand_3D_5x5x5]
    test_clone_tosa_MI[rand_4D_2x3x4x5]
    test_clone_tosa_MI[large_tensor]

backends/arm/test/ops/test_clone.py:69:
    test_clone_tosa_BI[ones_1D_10]
    test_clone_tosa_BI[ones_1D_50]
    test_clone_tosa_BI[rand_1D_20]
    test_clone_tosa_BI[rand_2D_10x10]
    test_clone_tosa_BI[rand_3D_5x5x5]
    test_clone_tosa_BI[rand_4D_2x3x4x5]
    test_clone_tosa_BI[large_tensor]

backends/arm/test/passes/test_remove_clone_pass.py:43:
    test_remove_clone_tosa_BI

"RuntimeError: Expected to find "torch.ops.higher_order.executorch_call_delegate" but did not find it"
backends/arm/test/ops/test_multihead_attention.py:48:
    test_multihead_attention_tosa_MI[rand_2d]
    test_multihead_attention_tosa_MI[randn_2d]
    test_multihead_attention_tosa_MI[randn_3d]

backends/arm/test/ops/test_multihead_attention.py:65:
    test_multihead_attention_tosa_BI[rand_2d]
    test_multihead_attention_tosa_BI[randn_2d]
    test_multihead_attention_tosa_BI[randn_3d]

backends/arm/test/ops/test_multihead_attention.py:108:
    test_multihead_attention_u85_BI[rand_2d]
    test_multihead_attention_u85_BI[randn_2d]
    test_multihead_attention_u85_BI[randn_3d]

backends/arm/test/ops/test_repeat.py:74:
    test_repeat_tosa_MI[interleave_int_3_x_1]

"RuntimeError: Expected to not find "torch.ops.higher_order.executorch_call_delegate" but found it"
backends/arm/test/ops/test_repeat.py:86:
    test_repeat_tosa_BI[interleave_int_3_x_1]

"RuntimeError: Expected out tensor to have dtype signed char, but got float instead"

@keyprocedure
Contributor Author

I think we should land the kernel PR before addressing any new failures from the latest CI run in this PR, since this PR depends on the _clone_dim_order kernel existing.

@Gasoonjia
Contributor

Sure, let's land the PRs one by one.

@keyprocedure keyprocedure changed the title [EXIR] Map clone to _clone_dim_order [EXIR] Register _clone_dim_order op and map aten.clone to it Aug 6, 2025
@keyprocedure keyprocedure changed the title [EXIR] Register _clone_dim_order op and map aten.clone to it [EXIR] Register _clone_dim_order op and map aten.clone Aug 6, 2025
Gasoonjia added a commit that referenced this pull request Aug 11, 2025
### Summary
This is PR 1 of 3 implementing a dim order aware clone op. 

Currently, clone ops are removed during export as no-ops, causing memory
layout (dim order) changes to be lost. This can cause backend failures,
incorrect outputs when ops expect specific layouts, and performance
degradation. This set of PRs introduces a dim order aware clone op,
`_clone_dim_order`, which preserves memory layout changes by explicitly
storing dim order information. This is implemented by replacing standard
clone ops with this variant during export and updating the clone removal
transform to preserve clones that change layout.

This PR adds the portable CPU kernel for the `_clone_dim_order` op,
implementing a clone variant that preserves dim order at runtime. The
portable kernel validates dtype and layout compatibility, resizes the
output tensor if needed, and performs an element-wise clone of the
tensor.

Note: A future PR will add the ATen kernel for `_clone_dim_order`.

Related PRs:
- PR 2: [#12971](#12971) -
Register `_clone_dim_order` op and map `aten.clone`
- PR 3: [#12976](#12976) -
Update RemoveCloneOpsTransform to be dim_order aware

Fixes #12645 

### Test plan
Added kernel runtime tests to verify:
- Tensors of all real dtypes are cloned correctly.
- Failure when input and output tensor shapes mismatch.
- Failure with unsupported memory formats.
- Failure when `non_blocking=true` since the portable kernel only
supports blocking data transfer.
- Dynamic shape outputs are cloned with correct values.
- Layout conversions are handled correctly for `contiguous` to
`channels_last` and `channels_last` to `contiguous`, and `channels_last` is
preserved.

All runtime tests pass via:
`build-ninja/kernels/test/portable_kernels_test`

---------

Co-authored-by: Gasoonjia <gasoonjia@meta.com>
@keyprocedure keyprocedure marked this pull request as ready for review August 11, 2025 20:33
@keyprocedure
Contributor Author

keyprocedure commented Aug 16, 2025

@digantdesai it looks like most of the CI failures were due to _clone_dim_order not being fully registered in TOSA. I had missed including clone_dim_order_support in operator_support/__init__.py. I've added that registration and also registered a node visitor for _clone_dim_order. Does it make sense to lower it as an identity since only contiguous dim orders are supported for now?

Contributor

@Gasoonjia Gasoonjia left a comment


LGTM, though there are some Arm questions; we may need to wait for @digantdesai for the final decision.

@@ -14,7 +14,11 @@ class RemoveClonePass(ExportPass):
"""Remove all clones from graph_module"""

def call_operator(self, op, args, kwargs, meta):
if op != exir_ops.edge.aten.clone.default:
clone_ops = (
exir_ops.edge.aten.clone.default,
Contributor

We should be removing this when producing the edge dialect graph, no?

aten.clone will be replaced only when _skip_dim_order = False. If we do not force Arm to turn on dim order ops, I would love to keep @keyprocedure's work.
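
For concreteness, a hedged sketch of how the updated pass might treat both clone variants; the dim_order op name is an assumption based on this PR series, and the real pass in backends/arm/_passes may differ:

```python
from executorch.exir.dialects._ops import ops as exir_ops
from executorch.exir.pass_base import ExportPass


class RemoveClonePass(ExportPass):
    """Remove all clones from graph_module (sketch, not the exact Arm pass)."""

    def call_operator(self, op, args, kwargs, meta):
        clone_ops = (
            exir_ops.edge.aten.clone.default,
            exir_ops.edge.dim_order_ops._clone_dim_order.default,  # assumed name
        )
        if op not in clone_ops:
            return super().call_operator(op, args, kwargs, meta)
        # Forward the clone's input unchanged; the partitioner check discussed
        # below keeps layout-changing clones out of the delegated graph.
        return args[0]
```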

@@ -38,7 +38,7 @@
]
linear_residual_exir_op: list[str] = [
"executorch_exir_dialects_edge__ops_aten_gelu_default",
"executorch_exir_dialects_edge__ops_aten_clone_default",
"executorch_exir_dialects_edge__ops_dim_order_ops__clone_dim_order_default",
Contributor

Are we removing aten.clone here because the corresponding test is dim order only?
@digantdesai, will Arm use dim order mandatorily? If we want to continue supporting both aten.clone and the dim_order clone, let's keep the original test while adding a new test for the dim_order case.
Same for the other tests.

Contributor Author

@keyprocedure keyprocedure Aug 19, 2025


Since ARM's pass manager doesn't seem to pass the _skip_dim_order flag, it defaults to False, so my strategy was to replace all instances of aten.clone with _clone_dim_order in the tests. Although it might make more sense to explicitly set _skip_dim_order=True in the EdgeCompileConfig calls in arm_tester and undo my changes to the ARM test files. Then I can either continue with adding _clone_dim_order support and tests (following Gasoonjia’s suggestion) in TOSA for future use or leave it out for now.
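
For reference, the two options above come down to a single flag on the edge compile config (a minimal sketch; where arm_tester builds its config is an assumption):

```python
from executorch.exir import EdgeCompileConfig

# Option A: keep dim order ops enabled (the default), so aten.clone becomes
# dim_order_ops._clone_dim_order and the Arm tests check the new op name.
dim_order_config = EdgeCompileConfig(_skip_dim_order=False)

# Option B: opt the Arm tester out of dim order ops, keeping aten.clone.
aten_clone_config = EdgeCompileConfig(_skip_dim_order=True)
```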

Contributor

Yeah, my point is that we should make our update consistent with Arm's target.

That is, if we expect dim-order-only in the Arm backend, there should be no aten.clone in our PR and we should remove aten.clone support.

However, if we should support dim order both on and off, we may need to test both cases for the clone operator, though I'm okay with splitting that into several PRs for better structure.

Contributor Author

That makes sense, thanks for breaking down the options.

Contributor

@digantdesai digantdesai Aug 20, 2025

FYI see - 135e875

Also cc @oscarandersson8218 regarding what happens if there is no aten.clone support.

Contributor Author

@digantdesai thanks for looking into this and sharing the enable dim_order PR. It looks like the best strategy is to replace support for aten.clone with _clone_dim_order in ARM since the _skip_dim_order flag will always be False, following @Gasoonjia's suggestion.

@@ -67,6 +68,28 @@ def _to_dim_order_copy(context, node):
to(context, node)


@register_torch_op(

Contributor

This change looks OK to me, but do we have a test that covers it?

Contributor Author

I added a test for _clone_dim_order here. Is this what you had in mind?

Contributor

Looks good

Collaborator

@oscarandersson8218 oscarandersson8218 left a comment


@keyprocedure Nice change and thanks for updating our backend!



@register_node_visitor
class CloneDimOrderVisitor(NodeVisitor):
Collaborator

If RemoveClonePass works as it should, there should be no need for a NodeVisitor, as all clones should be removed by the time we end up here. Also, this node_visitor is not imported in __init__.py and is therefore not registered.

Contributor Author

Thanks for pointing this out; I'll remove this NodeVisitor. Just to confirm: if a clone changes dim_order, should RemoveClonePass still be expected to remove it?

Collaborator

I think the check in clone_dim_order_support.py will make sure that we don't partition a clone that changes dim_order, so it seems OK to me.

Collaborator

To expand on this, the plan for dim_order handling in the Arm backend is to ignore all delegated dim_order ops (whether they change the dim_order or not) and just make sure that the input/output of the delegated sub-graph is correct, since we do our own dim_order handling in https://github.com/pytorch/executorch/blob/main/backends/arm/_passes/annotate_channels_last_dim_order_pass.py. The dtype casting of to_dim_order_copy is kept, though.

We will not delegate partitions containing single dim_order ops, to allow the use of dim_order ops in non-delegated sub-graphs, similar to this PR: #12995

Contributor Author

I really appreciate the context and resources on how dim_order ops are handled. This also helps with diagnosing the graph delegation CI failures.

@Gasoonjia
Contributor

Hi @keyprocedure, it looks like there are still several issues on CI. Do you mind taking a look?

@keyprocedure
Contributor Author

> Hi @keyprocedure, it looks like there are still several issues on CI. Do you mind taking a look?

Hi, yeah, I'm currently working on them. Appreciate the check-in :)

@keyprocedure
Contributor Author

Hi @Gasoonjia, can we try running CI again?

I was able to configure the TOSA test setup locally, and most of the test failures were due to the input dtype gating I had in CloneDimOrderSupport, which caused clone ops not to be partitioned; they remained in the subgraph and failed the tester.check_not tests.

@Gasoonjia
Contributor

@keyprocedure Really appreciate your work, mate! Started CI, let's see how things work!

@keyprocedure
Contributor Author

keyprocedure commented Aug 25, 2025

@Gasoonjia it looks like we're good!

The 3 CI failures seem unrelated:

  • build doc (buck2) failed because of an upload error
  • test-static-llama-qnn-linux had a Failed to download LLVM error. This job passed last run and I only made changes to ARM since then.
  • test-llama_runner_eager-linux failed without a log (GitHub suggests: FLAKY - The following job failed but was likely due to flakiness present on trunk)

), "Only contiguous memory format is supported in CoreML"

# Since CoreML only supports contiguous format, no dim_order preservation is needed. Treat this as a no-op clone.
noop(context, node)
Contributor

@metascroy metascroy Aug 25, 2025

@keyprocedure Why not just call clone(context, node) here? It will be equivalent to noop now, but let's use clone if that's what the op is.

Just in case coremltools updates clone to be something other than a noop in future?

Contributor Author

That makes a lot of sense! But I don't see clone exposed in coremltools for import; it looks like it’s only handled as an alias of noop.

Contributor

I guess this is OK then

@Gasoonjia
Contributor

ci gogogo!

@Gasoonjia
Contributor

Gasoonjia commented Aug 26, 2025

CI passed. PR LGTM. Thanks for your contribution, @keyprocedure!

@digantdesai @oscarandersson8218 @metascroy please let me know if you have any concerns, or I will stamp it.

agrima1304 pushed a commit to agrima1304/executorch that referenced this pull request Aug 26, 2025
@Gasoonjia Gasoonjia merged commit 130cafc into pytorch:main Aug 26, 2025
117 of 118 checks passed
@Gasoonjia
Contributor

Merged! @keyprocedure

@keyprocedure
Contributor Author

Appreciate all the feedback and reviews, glad we got this landed!

@jackzhxng
Contributor

Sorry to be the bearer of bad news, but this broke some trunk tests. Let's revert, and @keyprocedure, can you try to get this in again with the ciflow/trunk label turned on?

@Gasoonjia
Contributor

@jackzhxng stamped the revert PR.
BTW, why do we need to manually set the tag for CI tests instead of making it the default?
Is there any place where we can find all the tags that need to be set to enable all CIs?

@keyprocedure
Contributor Author

Thanks for the update @jackzhxng, I'll resubmit the PR with the additional label.

@digantdesai
Contributor

digantdesai commented Aug 29, 2025

> BTW, why do we need to manually set the tag for CI tests instead of making it the default?

money 🤑
