[MetaSchedule] Fix the order of applying `AutoInline` in `ScheduleUsingAnchorTrace` #13329

masahi · 2022-11-09T08:42:49Z

Note: the diff is bloated due to the test case.

In anchor-block tuning, we need to manually apply AutoInline to some blocks (those that are not part of the anchor subgraph). Currently, the order in which AutoInline is applied to these blocks is undefined, and I've hit a case where this is problematic.

For example, given these four blocks,

        for i0_7, i1_7, i2_7, i3_7 in T.grid(16, 56, 56, 256):
            with T.block("compute_2"):
                i0_8, i1_8, i2_8, i3_8 = T.axis.remap("SSSS", [i0_7, i1_7, i2_7, i3_7])
                T.reads(T_subtract_1[i0_8, i1_8, i2_8, i3_8])
                T.writes(compute_3[i0_8, i1_8, i2_8, i3_8])
                compute_3[i0_8, i1_8, i2_8, i3_8] = T.q_multiply_shift(T_subtract_1[i0_8, i1_8, i2_8, i3_8], 1457846997, 31, 0, dtype="int32")
        for i0_9, i1_9, i2_9, i3_9 in T.grid(16, 56, 56, 256):
            with T.block("compute_3"):
                i0_10, i1_10, i2_10, i3_10 = T.axis.remap("SSSS", [i0_9, i1_9, i2_9, i3_9])
                T.reads(p9[i0_10, i1_10, i2_10, i3_10])
                T.writes(compute_4[i0_10, i1_10, i2_10, i3_10])
                compute_4[i0_10, i1_10, i2_10, i3_10] = T.q_multiply_shift(p9[i0_10, i1_10, i2_10, i3_10], 2101000910, 31, 0, dtype="int32")
        for i0_11, i1_11, i2_11, i3_11 in T.grid(16, 56, 56, 256):
            with T.block("T_add_2"):
                ax0, ax1, ax2, ax3 = T.axis.remap("SSSS", [i0_11, i1_11, i2_11, i3_11])
                T.reads(compute_3[ax0, ax1, ax2, ax3], compute_4[ax0, ax1, ax2, ax3])
                T.writes(T_add_2[ax0, ax1, ax2, ax3])
                T_add_2[ax0, ax1, ax2, ax3] = compute_3[ax0, ax1, ax2, ax3] + compute_4[ax0, ax1, ax2, ax3]
        for i0_12, i1_12, i2_12, i3_12 in T.grid(16, 56, 56, 256):
            with T.block("compute_4"):
                i0_13, i1_13, i2_13, i3_13 = T.axis.remap("SSSS", [i0_12, i1_12, i2_12, i3_12])
                T.reads(T_add_2[i0_13, i1_13, i2_13, i3_13])
                T.writes(compute[i0_13, i1_13, i2_13, i3_13])
                compute[i0_13, i1_13, i2_13, i3_13] = T.max(T.min(T_add_2[i0_13, i1_13, i2_13, i3_13], 255), 0)

we want to AutoInline "compute_3", "T_add_2" and "compute_4". If the order is "T_add_2" -> "compute_3" -> "compute_4", all three blocks can be inlined or reverse-inlined into "compute_2". However, if the order is "T_add_2" -> "compute_4" -> "compute_3", "compute_4" can be neither inlined nor reverse-inlined. This in turn can cause a buggy schedule to be generated (see the description in the test case).

We can avoid this problem by always AutoInlining the last block after all other blocks have been processed. This ensures that the last block can be reverse inlined.
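The order dependence described above can be reproduced with a small toy model. The sketch below is not TVM code: `auto_inline_all` and `last_block_last` are hypothetical helpers, and the two legality rules are simplifying assumptions standing in for `CanComputeInline` and `CanReverseComputeInline` (a block can be inlined only if it is not the output block, and reverse-inlined only if it has exactly one producer block).

```python
def auto_inline_all(producers, output, order):
    """Toy AutoInline pass; returns the blocks that could not be inlined.

    `producers` maps each block name to the set of blocks it reads from.
    """
    producers = {block: set(srcs) for block, srcs in producers.items()}
    leftover = []
    for block in order:
        consumers = [b for b, srcs in producers.items() if block in srcs]
        if block != output and consumers:
            # Assumed rule for compute_inline: a non-output block is folded
            # into every consumer, which then reads the block's own producers.
            for consumer in consumers:
                producers[consumer].remove(block)
                producers[consumer] |= producers[block]
            del producers[block]
        elif len(producers[block]) == 1:
            # Assumed rule for reverse_compute_inline: a block with exactly
            # one producer is folded into that producer.
            (producer,) = producers[block]
            del producers[block]
            if block == output:
                output = producer
        else:
            leftover.append(block)
    return leftover


def last_block_last(order, output):
    """Hypothetical reordering: process the last (output) block after all others."""
    return [b for b in order if b != output] + [output]


# Dependencies between the four blocks shown earlier (block names, not buffers).
GRAPH = {
    "compute_2": set(),  # not in the AutoInline set; its upstream is omitted
    "compute_3": set(),  # reads only the parameter p9
    "T_add_2": {"compute_2", "compute_3"},
    "compute_4": {"T_add_2"},  # the last block
}

good_order = ["T_add_2", "compute_3", "compute_4"]
bad_order = ["T_add_2", "compute_4", "compute_3"]

good_result = auto_inline_all(GRAPH, "compute_4", good_order)
bad_result = auto_inline_all(GRAPH, "compute_4", bad_order)
fixed_result = auto_inline_all(
    GRAPH, "compute_4", last_block_last(bad_order, "compute_4")
)
print(good_result, bad_result, fixed_result)
```

In this model the bad order visits "compute_4" while it still has two producers, so it is left over. Deferring the last block until "compute_3" has been inlined leaves it with a single producer, and it can then be reverse-inlined.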

@vinx13 @junrushao @zxybazh

tvm-bot · 2022-11-09T08:42:52Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @Hzfengsy, @elvin-n, @junrushao (see #10317 for details)
Built docs for commit c64cafa can be found here.

Generated by tvm-bot

masahi · 2022-11-09T08:47:04Z

src/meta_schedule/trace_apply.cc

+            << "If a spatial block cannot be inlined, it should be the output block";
+        if (CanReverseComputeInline(sch->state(), block_sref)) {
+          sch->ReverseComputeInline(block);
+        }


This is a separate fix that relaxes the incorrect assumption at L144.

masahi · 2022-11-09T08:52:02Z

tests/python/unittest/test_meta_schedule_trace_apply.py

+    # "conv2d_nhwc_reindex_shared" has the predicate
+    # T.where(((ax1_0 * 4 + ax1_1) * 32 + ax1_2) * 2 + ax1_3 < 64) due to anchor-block scheduling
+    # (see Conv2dInt8_with_predicate_scheduled). Currently, if we try to reverse-inline a block to
+    # its producer that has a predicate, the predicate disappears after reverse inlining.


cc @vinx13 to confirm whether applying reverse_compute_inline should be disallowed when the producer has a predicate. Currently it is allowed, and the predicate disappears. A minimal repro is at https://gist.github.com/masahi/01a80b86062122ad57b9b1fd785fb960
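To illustrate why a vanishing predicate matters, here is a small sketch in plain Python (not TVM) that enumerates the loop nest guarded by the predicate quoted in the test comment. The loop extents are assumptions: 4, 32 and 2 for `ax1_1`, `ax1_2` and `ax1_3` are read off the multipliers in the predicate, and an extent of 1 for `ax1_0` is a guess. `count_iterations` is a hypothetical helper.

```python
from itertools import product

AXIS_EXTENT = 64  # bound taken from the predicate `... < 64`


def fused_index(ax1_0, ax1_1, ax1_2, ax1_3):
    """The fused index expression from the predicate in the test comment."""
    return ((ax1_0 * 4 + ax1_1) * 32 + ax1_2) * 2 + ax1_3


def count_iterations(keep_predicate):
    """Return (in_bounds, out_of_bounds) counts of executed loop iterations."""
    in_bounds = out_of_bounds = 0
    # Assumed loop extents: ax1_0 in [0, 1), ax1_1 in [0, 4),
    # ax1_2 in [0, 32), ax1_3 in [0, 2).
    for idx in product(range(1), range(4), range(32), range(2)):
        i = fused_index(*idx)
        if keep_predicate and not i < AXIS_EXTENT:
            continue  # the T.where(...) guard skips this iteration
        if i < AXIS_EXTENT:
            in_bounds += 1
        else:
            out_of_bounds += 1
    return in_bounds, out_of_bounds


with_predicate = count_iterations(keep_predicate=True)
without_predicate = count_iterations(keep_predicate=False)
print(with_predicate, without_predicate)
```

Under these assumed extents the loop nest covers 256 fused indices for an axis of length 64, so once the guard is dropped, three quarters of the iterations fall outside the valid range.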

junrushao · 2022-11-09T21:05:47Z

Will leave this PR to @vinx13 :-)

masahi added 5 commits November 9, 2022 17:31

index on concat-fusion-fix: 3ffe5b1 fix te extern create_prim_func test (19f40e7)

Apply AutoInline to the last block after all other blocks are processed (c3c5c84)

Do not require CanReverseComputeInline to be true when CanComputeInline is false (4334071)

add comment (674cf82)

add test (45e10d4)

cpplint (c64cafa)

junrushao assigned vinx13 Nov 9, 2022

vinx13 approved these changes Nov 9, 2022

vinx13 merged commit 8453c9c into apache:main Nov 9, 2022

leandron mentioned this pull request Feb 1, 2023

TVM v0.11.0 Release Candidate Notes #13899

Closed
