[TOP][ARM] Winograd nnpack backend #2721

hlu1 · 2019-03-04T04:54:22Z

Discussions #2692
@merrymercy, @ajtulloch, @yidawang please review.

yidawang

In general LGTM. Two questions in addition to the inline comments:

Do you plan to register the ops to Relay as well?
Why do you handle conv2d_winograd_nnpack_weight_transform and conv2d_winograd_nnpack_without_weight_transform differently in the code?

yidawang · 2019-03-04T05:27:03Z

topi/python/topi/arm_cpu/conv2d.py

@@ -522,6 +595,58 @@ def _callback(op):
    return s


+##### REGISTER TOPI COMPUTE / SCHEDULE FOR WINOGRAD NNPACK WITH WEIGHT TRANSFORM #####


yidawang · 2019-03-04T06:30:47Z

topi/python/topi/arm_cpu/conv2d.py

+            algorithm=cfg['winograd_nnpack_algorithm'].val)
+
+    # we have to manually assign effective GFLOP for winograd
+    cfg.add_flop(2 * N * CI * H * W * KH * KW * CO)


Out of curiosity, is it because nnpack needs this info to decide if winograd should kick in?

There are two reasons:

NNPACK is an extern op, compute_flops does not know how to compute the flops for extern ops

The winograd implementation reduces the algorithmic complexity by reducing the number of multiplication operators, so here we care about the "fake GFlops" (the number of multiplys as if we did a direct conv) so later AutoTVM can choose the best out of direct, winograd, winograd_nnpack.

hlu1 · 2019-03-04T08:04:12Z

To answer your question:

Yes. And I would like to get some early feedbacks on the topi/nnpack implementation first.
We need to separate the weight transform out into another node so that it can be precomputed as part of the pregraph (the PrecomputePrune pass in nnvm). This technique is used by the TVM winograd implementation as well (https://github.com/dmlc/tvm/blob/master/topi/python/topi/arm_cpu/conv2d.py#L503).

hlu1 · 2019-03-05T06:05:17Z

@tqchen and @merrymercy, I noticed that in relay, bias is no longer part of the conv op. However, in NNPACK bias is a required input for conv. What's the easiest way to map conv + bias_add in relay to conv in NNPACK? The most obvious way is probably to pass zero_bias to nnpack (less preferred because it incurs perf overhead), I'm wondering if there's a cleaner way to do this.

FrozenGene · 2019-03-07T08:48:06Z

Maybe you have to pass zero_bias to it. I can not think of simple and clean way too, because they are two isolated ops. I am also interested in knowing some better way too.

tqchen · 2019-03-09T21:12:06Z

@hlu1 we could introduce another experimental op that contains bias, and mark it as internal (level=10), then allow pattern matching to the new one. Alternatively, we can start with zero bias

hlu1 · 2019-03-11T05:41:54Z

@tqchen, can you elaborate a bit more on the first approach? What do you mean by allowing pattern matching to the new one?

tqchen · 2019-03-11T18:53:06Z

Yes, what I mean is to transform conv->bias_add => ConvWithBias

hlu1 · 2019-03-12T01:41:39Z

I fixed the bias in relay by passing None to topi.nn.conv2d_winograd_nnpack_without_weight_transform ( https://github.com/dmlc/tvm/pull/2721/files#diff-b8a5824a5a8f1907448d2588d8cd773cR326)

ajtulloch

This needs some kind of unit test coverage right? That should definitely be added. Otherwise looks good to me, some minor style nits/Relay questions inlined.

ajtulloch · 2019-03-12T04:22:14Z

include/tvm/relay/attrs/nn.h

+                    "relay.attrs.Conv2DWinogradNNPACKWeightTransformAttrs") {
+    TVM_ATTR_FIELD(convolution_algorithm)
+        .describe(
+            "The convolution algorithm for Winograd NNPACK. E.g. 3 for WT_8x8, "


Could you explicitly reference the enum defined in tvm.contrib.nnpack.ConvolutionAlgorithm?

Yes, will do

ajtulloch · 2019-03-12T04:22:36Z

nnvm/include/nnvm/top/nn.h

+  DMLC_DECLARE_PARAMETER(WinogradNNPACKWeightTransformParam) {
+    DMLC_DECLARE_FIELD(convolution_algorithm)
+        .describe(
+            "The convolution algorithm for Winograd NNPACK. E.g. 3 for WT_8x8, "


Ditto re: tvm.contrib.nnpack.ConvolutionAlgorithm.

ajtulloch · 2019-03-12T04:23:27Z

src/relay/op/nn/convolution.cc

@@ -10,6 +10,7 @@
 #include "../../pass/alter_op_layout.h"
 #include "../layout.h"

+


ajtulloch · 2019-03-12T04:24:02Z

src/relay/op/nn/convolution.cc

+- **out**:  Output is 4D array of shape (batch_size, channels, out_height, out_width)
+)code" TVM_ADD_FILELINE)
+.set_attrs_type_key("relay.attrs.Conv2DAttrs")
+.set_num_inputs(2)


Curious, why doesn't this also have an optional bias input?

Because the conv2d op in relay does not have bias.

ajtulloch · 2019-03-12T04:26:09Z

topi/python/topi/arm_cpu/injective.py

@@ -0,0 +1,34 @@
+# pylint: disable=invalid-name, unused-variable


Were these intended to be in this diff? They seem useful but should be in a separate PR right?

ajtulloch · 2019-03-12T04:26:20Z

topi/python/topi/arm_cpu/pooling.py

@@ -0,0 +1,45 @@
+# pylint: disable=invalid-name, unused-variable
+"""Schedule for pooling operators"""


Ditto re: a separate PR?

hlu1

I'll add unit tests ASAP

hlu1 · 2019-03-12T05:45:23Z

include/tvm/relay/attrs/nn.h

+                    "relay.attrs.Conv2DWinogradNNPACKWeightTransformAttrs") {
+    TVM_ATTR_FIELD(convolution_algorithm)
+        .describe(
+            "The convolution algorithm for Winograd NNPACK. E.g. 3 for WT_8x8, "


Yes, will do

hlu1 · 2019-03-12T05:45:52Z

nnvm/include/nnvm/top/nn.h

+  DMLC_DECLARE_PARAMETER(WinogradNNPACKWeightTransformParam) {
+    DMLC_DECLARE_FIELD(convolution_algorithm)
+        .describe(
+            "The convolution algorithm for Winograd NNPACK. E.g. 3 for WT_8x8, "


hlu1 · 2019-03-12T05:46:12Z

src/relay/op/nn/convolution.cc

@@ -10,6 +10,7 @@
 #include "../../pass/alter_op_layout.h"
 #include "../layout.h"

+


hlu1 · 2019-03-12T05:47:19Z

src/relay/op/nn/convolution.cc

+- **out**:  Output is 4D array of shape (batch_size, channels, out_height, out_width)
+)code" TVM_ADD_FILELINE)
+.set_attrs_type_key("relay.attrs.Conv2DAttrs")
+.set_num_inputs(2)


Because the conv2d op in relay does not have bias.

hlu1 · 2019-03-12T05:47:32Z

topi/python/topi/arm_cpu/injective.py

@@ -0,0 +1,34 @@
+# pylint: disable=invalid-name, unused-variable


hlu1 · 2019-03-12T05:47:37Z

topi/python/topi/arm_cpu/pooling.py

@@ -0,0 +1,45 @@
+# pylint: disable=invalid-name, unused-variable
+"""Schedule for pooling operators"""


hlu1 · 2019-03-13T01:23:43Z

@ajtulloch, I removed injective.py and pooling.py from this PR and added unet tests for both nnvm and relay

hlu1 · 2019-03-13T04:23:19Z

@tqchen, is tests/python/integration/test_winograd_nnpack.py a good place for this test? It's not exactly for testing frontend, that's why I didn't put it under tests/python/frontend/mxnet/

tqchen · 2019-03-24T17:39:07Z

@yidawang @ajtulloch pleas another look and https://docs.tvm.ai/contribute/code_review.html#approve-and-request-changes-explicitly

ajtulloch

Looks great, thanks so much. I approve this change, just a minor fix about some comments.

python/tvm/relay/op/nn/nn.py

src/relay/op/nn/convolution.cc

hlu1 · 2019-03-26T01:06:10Z

@tqchen, all comments have been addressed. It should be good to go.

yidawang

Thanks for the work!

lint lint save save add more case save error lint lint commit do lint save fix lint wrap it back as func lint save remove dead comment fix style fix lint Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> address review feedback pe now handle freevar. as a result preserving function is now trivial. test add basic test, implement pretty printing for generic function test lint fix segfault save save do test fix another error address comment commit save address review feedback add test for invalidate, fix error in lookup rename cont to boduy fix error and add regression test Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> fix error, add test case fix lint remove extra line fix some error pe commit save save save save save (pe/dce broken) [DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879) [Relay][Text Format] Reverse CallNode Print Order (apache#2882) [NNPACK] Modernize test (apache#2868) [Relay] Add list update to prelude (apache#2866) Add missing sgx includes (apache#2878) Fix setting up hints for getaddrinfo (apache#2872) [ARITH] RewriteSimplifier: improved cmp simplification (apache#2851) do (apache#2883) [RELAY][Frontend][TF] decompile tf control flow (apache#2830) * decompile tf control flow * Add docs * remove import relay * move tests under tensorflow frontend * minor fix Enhance upsample operator to adapt onnx opset version 9 (apache#2840) Use version invariant rustfmt (apache#2886) [Relay][Op] Add group conv2d dispatch to topi function (apache#2870) * [Relay][Op] Add group conv2d dispatch to topi function * Rerun tests [Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888) fix prelu, now can use on 2d input and add one test (apache#2875) Add dense schedules to __init__ for cpu (apache#2855) * Add dense schedules to __init__ for cpu * Add documentation for topi::shape * Add additional imports to topi CPU __init__. [TESTS] Improve script robustness (apache#2893) A number of test scripts use the '|| exit 1' idiom. This has two issues, first process exit codes are defined to be in the range 0-255. Second, more importantly, the idiom is fragile because it requires that every possible failure point be explicitly coded. This patch removes the idiom in favour of "set -e" as used in the docker scripts as a more robust mechanism to ensure that script failures are always caught and propagated by default. [Relay] Fix name of bias in testing.mlp (apache#2892) winograd_nnpack (apache#2721) [Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861) * Fix Relay ARM CPU spatial pack depthwise alter op layout issue. * Update tune_relay_arm.py [TESTS] Import script robustness (set -u) (apache#2896) Adopt the "set -u" idiom from the docker scripts as a mechanism to improve future robustness. [DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901) Allow linking against MKLML (apache#2902) [COMMUNITY] ASF mentors (apache#2906) [Relay] Allow converting keras.layers.Sequential (apache#2842) * Allow converting keras.layers.Sequential * Use existing new_var function * Only update expr when missing * Add test [Relay] clean up hd, change tl (apache#2917) Turn on USE_SORT by default (apache#2916) [TEST] Cache test data (apache#2921) Unified error handling in NNVM and Relay frontends (apache#2828) add support for mxnet smooth_l1 (apache#2905) [Relay] Add support for TupleGetItem in op fusion (apache#2914) [Relay, TOPI] Deformable conv2d (apache#2908) * [Relay, TOPI] Add deformable conv2d * Moved to op level2 * Fix lint * Moved to level2 & bug fix * Update comments * Disabled flaky test of conv2d TVM debugresult dump to Chrome Tracing (apache#2922) [Relay] add test for second order ad (apache#2754) * do second order * add comment * better name * use tvm assert all close * refire ci Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926) This reverts commit f5ca991. [Tutorial] Cache the test data in tutorial (apache#2923) [AUTOTVM] Refactor measure build func (apache#2927) Fix intersect of modular set (apache#2904) Fix comment bugs and code style [Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929) Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860) [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850) * [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. * * test cases * * ci error Outdated renaming for flatten in ONNX converter (apache#2843) [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864) * [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. * * review comments Fix vcvtph2ps codegen (apache#2925) Port changes More fixes save save Changes to schedules and mxnet importer

lint lint save save add more case save error lint lint commit do lint save fix lint wrap it back as func lint save remove dead comment fix style fix lint Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> address review feedback pe now handle freevar. as a result preserving function is now trivial. test add basic test, implement pretty printing for generic function test lint fix segfault save save do test fix another error address comment commit save address review feedback add test for invalidate, fix error in lookup rename cont to boduy fix error and add regression test Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> fix error, add test case fix lint remove extra line fix some error pe commit save save save save save (pe/dce broken) [DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879) [Relay][Text Format] Reverse CallNode Print Order (apache#2882) [NNPACK] Modernize test (apache#2868) [Relay] Add list update to prelude (apache#2866) Add missing sgx includes (apache#2878) Fix setting up hints for getaddrinfo (apache#2872) [ARITH] RewriteSimplifier: improved cmp simplification (apache#2851) do (apache#2883) [RELAY][Frontend][TF] decompile tf control flow (apache#2830) * decompile tf control flow * Add docs * remove import relay * move tests under tensorflow frontend * minor fix Enhance upsample operator to adapt onnx opset version 9 (apache#2840) Use version invariant rustfmt (apache#2886) [Relay][Op] Add group conv2d dispatch to topi function (apache#2870) * [Relay][Op] Add group conv2d dispatch to topi function * Rerun tests [Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888) fix prelu, now can use on 2d input and add one test (apache#2875) Add dense schedules to __init__ for cpu (apache#2855) * Add dense schedules to __init__ for cpu * Add documentation for topi::shape * Add additional imports to topi CPU __init__. [TESTS] Improve script robustness (apache#2893) A number of test scripts use the '|| exit 1' idiom. This has two issues, first process exit codes are defined to be in the range 0-255. Second, more importantly, the idiom is fragile because it requires that every possible failure point be explicitly coded. This patch removes the idiom in favour of "set -e" as used in the docker scripts as a more robust mechanism to ensure that script failures are always caught and propagated by default. [Relay] Fix name of bias in testing.mlp (apache#2892) winograd_nnpack (apache#2721) [Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861) * Fix Relay ARM CPU spatial pack depthwise alter op layout issue. * Update tune_relay_arm.py [TESTS] Import script robustness (set -u) (apache#2896) Adopt the "set -u" idiom from the docker scripts as a mechanism to improve future robustness. [DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901) Allow linking against MKLML (apache#2902) [COMMUNITY] ASF mentors (apache#2906) [Relay] Allow converting keras.layers.Sequential (apache#2842) * Allow converting keras.layers.Sequential * Use existing new_var function * Only update expr when missing * Add test [Relay] clean up hd, change tl (apache#2917) Turn on USE_SORT by default (apache#2916) [TEST] Cache test data (apache#2921) Unified error handling in NNVM and Relay frontends (apache#2828) add support for mxnet smooth_l1 (apache#2905) [Relay] Add support for TupleGetItem in op fusion (apache#2914) [Relay, TOPI] Deformable conv2d (apache#2908) * [Relay, TOPI] Add deformable conv2d * Moved to op level2 * Fix lint * Moved to level2 & bug fix * Update comments * Disabled flaky test of conv2d TVM debugresult dump to Chrome Tracing (apache#2922) [Relay] add test for second order ad (apache#2754) * do second order * add comment * better name * use tvm assert all close * refire ci Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926) This reverts commit f5ca991. [Tutorial] Cache the test data in tutorial (apache#2923) [AUTOTVM] Refactor measure build func (apache#2927) Fix intersect of modular set (apache#2904) Fix comment bugs and code style [Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929) Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860) [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850) * [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. * * test cases * * ci error Outdated renaming for flatten in ONNX converter (apache#2843) [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864) * [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. * * review comments Fix vcvtph2ps codegen (apache#2925) Port changes More fixes save save Changes to schedules and mxnet importer save save save save save

lint lint save save add more case save error lint lint commit do lint save fix lint wrap it back as func lint save remove dead comment fix style fix lint Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> address review feedback pe now handle freevar. as a result preserving function is now trivial. test add basic test, implement pretty printing for generic function test lint fix segfault save save do test fix another error address comment commit save address review feedback add test for invalidate, fix error in lookup rename cont to boduy fix error and add regression test Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> fix error, add test case fix lint remove extra line fix some error pe commit save save save save save (pe/dce broken) [DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879) [Relay][Text Format] Reverse CallNode Print Order (apache#2882) [NNPACK] Modernize test (apache#2868) [Relay] Add list update to prelude (apache#2866) Add missing sgx includes (apache#2878) Fix setting up hints for getaddrinfo (apache#2872) [ARITH] RewriteSimplifier: improved cmp simplification (apache#2851) do (apache#2883) [RELAY][Frontend][TF] decompile tf control flow (apache#2830) * decompile tf control flow * Add docs * remove import relay * move tests under tensorflow frontend * minor fix Enhance upsample operator to adapt onnx opset version 9 (apache#2840) Use version invariant rustfmt (apache#2886) [Relay][Op] Add group conv2d dispatch to topi function (apache#2870) * [Relay][Op] Add group conv2d dispatch to topi function * Rerun tests [Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888) fix prelu, now can use on 2d input and add one test (apache#2875) Add dense schedules to __init__ for cpu (apache#2855) * Add dense schedules to __init__ for cpu * Add documentation for topi::shape * Add additional imports to topi CPU __init__. [TESTS] Improve script robustness (apache#2893) A number of test scripts use the '|| exit 1' idiom. This has two issues, first process exit codes are defined to be in the range 0-255. Second, more importantly, the idiom is fragile because it requires that every possible failure point be explicitly coded. This patch removes the idiom in favour of "set -e" as used in the docker scripts as a more robust mechanism to ensure that script failures are always caught and propagated by default. [Relay] Fix name of bias in testing.mlp (apache#2892) winograd_nnpack (apache#2721) [Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861) * Fix Relay ARM CPU spatial pack depthwise alter op layout issue. * Update tune_relay_arm.py [TESTS] Import script robustness (set -u) (apache#2896) Adopt the "set -u" idiom from the docker scripts as a mechanism to improve future robustness. [DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901) Allow linking against MKLML (apache#2902) [COMMUNITY] ASF mentors (apache#2906) [Relay] Allow converting keras.layers.Sequential (apache#2842) * Allow converting keras.layers.Sequential * Use existing new_var function * Only update expr when missing * Add test [Relay] clean up hd, change tl (apache#2917) Turn on USE_SORT by default (apache#2916) [TEST] Cache test data (apache#2921) Unified error handling in NNVM and Relay frontends (apache#2828) add support for mxnet smooth_l1 (apache#2905) [Relay] Add support for TupleGetItem in op fusion (apache#2914) [Relay, TOPI] Deformable conv2d (apache#2908) * [Relay, TOPI] Add deformable conv2d * Moved to op level2 * Fix lint * Moved to level2 & bug fix * Update comments * Disabled flaky test of conv2d TVM debugresult dump to Chrome Tracing (apache#2922) [Relay] add test for second order ad (apache#2754) * do second order * add comment * better name * use tvm assert all close * refire ci Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926) This reverts commit f5ca991. [Tutorial] Cache the test data in tutorial (apache#2923) [AUTOTVM] Refactor measure build func (apache#2927) Fix intersect of modular set (apache#2904) Fix comment bugs and code style [Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929) Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860) [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850) * [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. * * test cases * * ci error Outdated renaming for flatten in ONNX converter (apache#2843) [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864) * [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. * * review comments Fix vcvtph2ps codegen (apache#2925) Port changes More fixes save save Changes to schedules and mxnet importer save save save save save remove remove

lint lint save save add more case save error lint lint commit do lint save fix lint wrap it back as func lint save remove dead comment fix style fix lint Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> address review feedback pe now handle freevar. as a result preserving function is now trivial. test add basic test, implement pretty printing for generic function test lint fix segfault save save do test fix another error address comment commit save address review feedback add test for invalidate, fix error in lookup rename cont to boduy fix error and add regression test Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> fix error, add test case fix lint remove extra line fix some error pe commit save save save save save (pe/dce broken) [DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879) [Relay][Text Format] Reverse CallNode Print Order (apache#2882) [NNPACK] Modernize test (apache#2868) [Relay] Add list update to prelude (apache#2866) Add missing sgx includes (apache#2878) Fix setting up hints for getaddrinfo (apache#2872) [ARITH] RewriteSimplifier: improved cmp simplification (apache#2851) do (apache#2883) [RELAY][Frontend][TF] decompile tf control flow (apache#2830) * decompile tf control flow * Add docs * remove import relay * move tests under tensorflow frontend * minor fix Enhance upsample operator to adapt onnx opset version 9 (apache#2840) Use version invariant rustfmt (apache#2886) [Relay][Op] Add group conv2d dispatch to topi function (apache#2870) * [Relay][Op] Add group conv2d dispatch to topi function * Rerun tests [Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888) fix prelu, now can use on 2d input and add one test (apache#2875) Add dense schedules to __init__ for cpu (apache#2855) * Add dense schedules to __init__ for cpu * Add documentation for topi::shape * Add additional imports to topi CPU __init__. [TESTS] Improve script robustness (apache#2893) A number of test scripts use the '|| exit 1' idiom. This has two issues, first process exit codes are defined to be in the range 0-255. Second, more importantly, the idiom is fragile because it requires that every possible failure point be explicitly coded. This patch removes the idiom in favour of "set -e" as used in the docker scripts as a more robust mechanism to ensure that script failures are always caught and propagated by default. [Relay] Fix name of bias in testing.mlp (apache#2892) winograd_nnpack (apache#2721) [Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861) * Fix Relay ARM CPU spatial pack depthwise alter op layout issue. * Update tune_relay_arm.py [TESTS] Import script robustness (set -u) (apache#2896) Adopt the "set -u" idiom from the docker scripts as a mechanism to improve future robustness. [DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901) Allow linking against MKLML (apache#2902) [COMMUNITY] ASF mentors (apache#2906) [Relay] Allow converting keras.layers.Sequential (apache#2842) * Allow converting keras.layers.Sequential * Use existing new_var function * Only update expr when missing * Add test [Relay] clean up hd, change tl (apache#2917) Turn on USE_SORT by default (apache#2916) [TEST] Cache test data (apache#2921) Unified error handling in NNVM and Relay frontends (apache#2828) add support for mxnet smooth_l1 (apache#2905) [Relay] Add support for TupleGetItem in op fusion (apache#2914) [Relay, TOPI] Deformable conv2d (apache#2908) * [Relay, TOPI] Add deformable conv2d * Moved to op level2 * Fix lint * Moved to level2 & bug fix * Update comments * Disabled flaky test of conv2d TVM debugresult dump to Chrome Tracing (apache#2922) [Relay] add test for second order ad (apache#2754) * do second order * add comment * better name * use tvm assert all close * refire ci Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926) This reverts commit f5ca991. [Tutorial] Cache the test data in tutorial (apache#2923) [AUTOTVM] Refactor measure build func (apache#2927) Fix intersect of modular set (apache#2904) Fix comment bugs and code style [Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929) Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860) [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850) * [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. * * test cases * * ci error Outdated renaming for flatten in ONNX converter (apache#2843) [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864) * [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. * * review comments Fix vcvtph2ps codegen (apache#2925) Port changes More fixes save save Changes to schedules and mxnet importer save save save save save remove remove save

lint lint save save add more case save error lint lint commit do lint save fix lint wrap it back as func lint save remove dead comment fix style fix lint Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> address review feedback pe now handle freevar. as a result preserving function is now trivial. test add basic test, implement pretty printing for generic function test lint fix segfault save save do test fix another error address comment commit save address review feedback add test for invalidate, fix error in lookup rename cont to boduy fix error and add regression test Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> fix error, add test case fix lint remove extra line fix some error pe commit save save save save save (pe/dce broken) [DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879) [Relay][Text Format] Reverse CallNode Print Order (apache#2882) [NNPACK] Modernize test (apache#2868) [Relay] Add list update to prelude (apache#2866) Add missing sgx includes (apache#2878) Fix setting up hints for getaddrinfo (apache#2872) [ARITH] RewriteSimplifier: improved cmp simplification (apache#2851) do (apache#2883) [RELAY][Frontend][TF] decompile tf control flow (apache#2830) * decompile tf control flow * Add docs * remove import relay * move tests under tensorflow frontend * minor fix Enhance upsample operator to adapt onnx opset version 9 (apache#2840) Use version invariant rustfmt (apache#2886) [Relay][Op] Add group conv2d dispatch to topi function (apache#2870) * [Relay][Op] Add group conv2d dispatch to topi function * Rerun tests [Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888) fix prelu, now can use on 2d input and add one test (apache#2875) Add dense schedules to __init__ for cpu (apache#2855) * Add dense schedules to __init__ for cpu * Add documentation for topi::shape * Add additional imports to topi CPU __init__. [TESTS] Improve script robustness (apache#2893) A number of test scripts use the '|| exit 1' idiom. This has two issues, first process exit codes are defined to be in the range 0-255. Second, more importantly, the idiom is fragile because it requires that every possible failure point be explicitly coded. This patch removes the idiom in favour of "set -e" as used in the docker scripts as a more robust mechanism to ensure that script failures are always caught and propagated by default. [Relay] Fix name of bias in testing.mlp (apache#2892) winograd_nnpack (apache#2721) [Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861) * Fix Relay ARM CPU spatial pack depthwise alter op layout issue. * Update tune_relay_arm.py [TESTS] Import script robustness (set -u) (apache#2896) Adopt the "set -u" idiom from the docker scripts as a mechanism to improve future robustness. [DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901) Allow linking against MKLML (apache#2902) [COMMUNITY] ASF mentors (apache#2906) [Relay] Allow converting keras.layers.Sequential (apache#2842) * Allow converting keras.layers.Sequential * Use existing new_var function * Only update expr when missing * Add test [Relay] clean up hd, change tl (apache#2917) Turn on USE_SORT by default (apache#2916) [TEST] Cache test data (apache#2921) Unified error handling in NNVM and Relay frontends (apache#2828) add support for mxnet smooth_l1 (apache#2905) [Relay] Add support for TupleGetItem in op fusion (apache#2914) [Relay, TOPI] Deformable conv2d (apache#2908) * [Relay, TOPI] Add deformable conv2d * Moved to op level2 * Fix lint * Moved to level2 & bug fix * Update comments * Disabled flaky test of conv2d TVM debugresult dump to Chrome Tracing (apache#2922) [Relay] add test for second order ad (apache#2754) * do second order * add comment * better name * use tvm assert all close * refire ci Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926) This reverts commit f5ca991. [Tutorial] Cache the test data in tutorial (apache#2923) [AUTOTVM] Refactor measure build func (apache#2927) Fix intersect of modular set (apache#2904) Fix comment bugs and code style [Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929) Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860) [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850) * [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. * * test cases * * ci error Outdated renaming for flatten in ONNX converter (apache#2843) [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864) * [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. * * review comments Fix vcvtph2ps codegen (apache#2925) Port changes More fixes save save Changes to schedules and mxnet importer save save save save save remove remove save save

lint lint save save add more case save error lint lint commit do lint save fix lint wrap it back as func lint save remove dead comment fix style fix lint Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> address review feedback pe now handle freevar. as a result preserving function is now trivial. test add basic test, implement pretty printing for generic function test lint fix segfault save save do test fix another error address comment commit save address review feedback add test for invalidate, fix error in lookup rename cont to boduy fix error and add regression test Update src/relay/pass/partial_eval.cc Co-Authored-By: MarisaKirisame <lolisa@marisa.moe> fix error, add test case fix lint remove extra line fix some error pe commit save save save save save (pe/dce broken) [DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879) [Relay][Text Format] Reverse CallNode Print Order (apache#2882) [NNPACK] Modernize test (apache#2868) [Relay] Add list update to prelude (apache#2866) Add missing sgx includes (apache#2878) Fix setting up hints for getaddrinfo (apache#2872) [ARITH] RewriteSimplifier: improved cmp simplification (apache#2851) do (apache#2883) [RELAY][Frontend][TF] decompile tf control flow (apache#2830) * decompile tf control flow * Add docs * remove import relay * move tests under tensorflow frontend * minor fix Enhance upsample operator to adapt onnx opset version 9 (apache#2840) Use version invariant rustfmt (apache#2886) [Relay][Op] Add group conv2d dispatch to topi function (apache#2870) * [Relay][Op] Add group conv2d dispatch to topi function * Rerun tests [Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888) fix prelu, now can use on 2d input and add one test (apache#2875) Add dense schedules to __init__ for cpu (apache#2855) * Add dense schedules to __init__ for cpu * Add documentation for topi::shape * Add additional imports to topi CPU __init__. [TESTS] Improve script robustness (apache#2893) A number of test scripts use the '|| exit 1' idiom. This has two issues, first process exit codes are defined to be in the range 0-255. Second, more importantly, the idiom is fragile because it requires that every possible failure point be explicitly coded. This patch removes the idiom in favour of "set -e" as used in the docker scripts as a more robust mechanism to ensure that script failures are always caught and propagated by default. [Relay] Fix name of bias in testing.mlp (apache#2892) winograd_nnpack (apache#2721) [Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861) * Fix Relay ARM CPU spatial pack depthwise alter op layout issue. * Update tune_relay_arm.py [TESTS] Import script robustness (set -u) (apache#2896) Adopt the "set -u" idiom from the docker scripts as a mechanism to improve future robustness. [DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901) Allow linking against MKLML (apache#2902) [COMMUNITY] ASF mentors (apache#2906) [Relay] Allow converting keras.layers.Sequential (apache#2842) * Allow converting keras.layers.Sequential * Use existing new_var function * Only update expr when missing * Add test [Relay] clean up hd, change tl (apache#2917) Turn on USE_SORT by default (apache#2916) [TEST] Cache test data (apache#2921) Unified error handling in NNVM and Relay frontends (apache#2828) add support for mxnet smooth_l1 (apache#2905) [Relay] Add support for TupleGetItem in op fusion (apache#2914) [Relay, TOPI] Deformable conv2d (apache#2908) * [Relay, TOPI] Add deformable conv2d * Moved to op level2 * Fix lint * Moved to level2 & bug fix * Update comments * Disabled flaky test of conv2d TVM debugresult dump to Chrome Tracing (apache#2922) [Relay] add test for second order ad (apache#2754) * do second order * add comment * better name * use tvm assert all close * refire ci Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926) This reverts commit f5ca991. [Tutorial] Cache the test data in tutorial (apache#2923) [AUTOTVM] Refactor measure build func (apache#2927) Fix intersect of modular set (apache#2904) Fix comment bugs and code style [Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929) Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860) [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850) * [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. * * test cases * * ci error Outdated renaming for flatten in ONNX converter (apache#2843) [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864) * [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. * * review comments Fix vcvtph2ps codegen (apache#2925) Port changes More fixes save save Changes to schedules and mxnet importer save save save save save remove remove save save revert

hlu1 mentioned this pull request Mar 4, 2019

[RFC] Winograd NNPACK for the ARM_CPU backend #2692

Closed

merrymercy self-assigned this Mar 4, 2019

yidawang reviewed Mar 4, 2019

View reviewed changes

hlu1 force-pushed the winograd-nnpack-ARM branch from d922908 to 9e38ac9 Compare March 4, 2019 08:09

hlu1 changed the title ~~[arm_cpu] Winograd nnpack backend~~ [TOP][ARM] Winograd nnpack backend Mar 5, 2019

tqchen added status: need test case need test cases to cover the change status: need review and removed status: need test case need test cases to cover the change labels Mar 9, 2019

hlu1 force-pushed the winograd-nnpack-ARM branch 2 times, most recently from ce8a38f to 80f72d7 Compare March 11, 2019 02:55

hlu1 force-pushed the winograd-nnpack-ARM branch from 80f72d7 to 2c63b31 Compare March 12, 2019 01:32

ajtulloch suggested changes Mar 12, 2019

View reviewed changes

hlu1 commented Mar 12, 2019

View reviewed changes

hlu1 force-pushed the winograd-nnpack-ARM branch 2 times, most recently from 3f9924d to 4715625 Compare March 13, 2019 01:22

hlu1 force-pushed the winograd-nnpack-ARM branch 4 times, most recently from 94aec73 to 759caae Compare March 13, 2019 03:43

hlu1 force-pushed the winograd-nnpack-ARM branch 2 times, most recently from 614b25d to 447378f Compare March 13, 2019 20:24

hlu1 force-pushed the winograd-nnpack-ARM branch 6 times, most recently from 4de22c5 to 7b2a6df Compare March 23, 2019 21:54

ajtulloch approved these changes Mar 25, 2019

View reviewed changes

python/tvm/relay/op/nn/nn.py Outdated Show resolved Hide resolved

src/relay/op/nn/convolution.cc Outdated Show resolved Hide resolved

winograd_nnpack

84a1394

hlu1 force-pushed the winograd-nnpack-ARM branch from 7b2a6df to 84a1394 Compare March 26, 2019 01:03

yidawang approved these changes Mar 26, 2019

View reviewed changes

tqchen merged commit a02916b into apache:master Mar 26, 2019

tqchen added status: accepted and removed status: need review labels Mar 26, 2019

Laurawly pushed a commit to Laurawly/tvm-1 that referenced this pull request Mar 27, 2019

winograd_nnpack (apache#2721)

b45b68d

wweic pushed a commit to wweic/tvm that referenced this pull request Mar 29, 2019

winograd_nnpack (apache#2721)

9bb8900

wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 29, 2019

winograd_nnpack (apache#2721)

297ad34

MarisaKirisame pushed a commit to MarisaKirisame/tvm that referenced this pull request Apr 9, 2019

winograd_nnpack (apache#2721)

7c69629

MarisaKirisame mentioned this pull request Apr 9, 2019

Track MarisaKirisame/tvm#3

Closed

hlu1 deleted the winograd-nnpack-ARM branch April 17, 2019 06:53

tqchen unassigned merrymercy Nov 4, 2019

tqchen mentioned this pull request Nov 8, 2019

[RELEASE][DRAFT] TVM v0.6 Release candidate #4259

Closed

		@@ -522,6 +595,58 @@ def _callback(op):
		return s


		##### REGISTER TOPI COMPUTE / SCHEDULE FOR WINOGRAD NNPACK WITH WEIGHT TRANSFORM #####

		@@ -10,6 +10,7 @@
		#include "../../pass/alter_op_layout.h"
		#include "../layout.h"

		@@ -0,0 +1,34 @@
		# pylint: disable=invalid-name, unused-variable

		@@ -0,0 +1,45 @@
		# pylint: disable=invalid-name, unused-variable
		"""Schedule for pooling operators"""

[TOP][ARM] Winograd nnpack backend #2721

[TOP][ARM] Winograd nnpack backend #2721

Conversation

hlu1 commented Mar 4, 2019

yidawang left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hlu1 commented Mar 4, 2019

hlu1 commented Mar 5, 2019

FrozenGene commented Mar 7, 2019

tqchen commented Mar 9, 2019

hlu1 commented Mar 11, 2019

tqchen commented Mar 11, 2019

hlu1 commented Mar 12, 2019

ajtulloch left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hlu1 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hlu1 commented Mar 13, 2019

hlu1 commented Mar 13, 2019

tqchen commented Mar 24, 2019

ajtulloch left a comment

Choose a reason for hiding this comment

hlu1 commented Mar 26, 2019

yidawang left a comment

Choose a reason for hiding this comment