[microNPU] Add hardware constraints for binary elementwise #13772
Conversation
Thanks @alexey-yazev, looks good in general, some comments about the legalization tests :)
```diff
@@ -881,7 +888,7 @@ def verify(ext_func):
         ([1, 4, 4], [4, 1], False),
     ],
 )
-@pytest.mark.parametrize("activation_function", ["NONE", "RELU"])
+@pytest.mark.parametrize("activation_function", [None, tf.nn.relu, tf.nn.relu6, relu_n1_to_1])
```
Suggested change:

```diff
-@pytest.mark.parametrize("activation_function", [None, tf.nn.relu, tf.nn.relu6, relu_n1_to_1])
+@pytest.mark.parametrize("activation_function", [None, tf.nn.relu])
```
Since this test looks at the graph entirely at the Relay level, where all the RELUs are just Relay clip operations, I don't think we benefit much from ~70 extra tests that differ only in the min and max attributes of the clip.
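For context, a sketch of the point being made (an illustration, not code from this PR): at the Relay level each of these TFLite activations legalizes to the same clip operator, differing only in its attributes. The int8 ranges below are assumed for illustration; the real values depend on the quantization parameters.

```python
# Illustrative sketch: each TFLite activation lowers to relay.clip, with only
# the a_min/a_max attributes changing (values here are assumed).
from tvm import relay

x = relay.var("x", shape=(1, 2, 3, 4), dtype="int8")

as_relu = relay.clip(x, a_min=0, a_max=127)         # tf.nn.relu
as_relu6 = relay.clip(x, a_min=0, a_max=6)          # tf.nn.relu6 (scale-dependent in practice)
as_relu_n1_to_1 = relay.clip(x, a_min=-1, a_max=1)  # relu_n1_to_1 (scale-dependent in practice)
```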
I agree with this; I will add a separate test for the case with the MAX operation and relu_n1_to_1 activation, along the lines sketched below.
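A minimal sketch of what that separate test could look like (the helper names and structure are assumptions modeled on the surrounding tests, not the actual code added in this PR):

```python
# Hypothetical sketch of a dedicated MAX + relu_n1_to_1 test; the
# model-building and verification helpers from the suite are elided.
import tensorflow as tf
import pytest

def relu_n1_to_1(x):
    # Mirrors TFLite's RELU_N1_TO_1 activation: clamp values to [-1, 1].
    return tf.math.maximum(-1.0, tf.math.minimum(x, 1.0))

@pytest.mark.parametrize("ifm_shape,ifm2_shape", [([1, 2, 3, 4], [1, 2, 3, 4])])
def test_tflite_max_relu_n1_to_1_legalize(ifm_shape, ifm2_shape):
    @tf.function
    def max_func(x, y):
        return relu_n1_to_1(tf.math.maximum(x, y))
    # ...convert to TFLite, legalize, and verify that the requantize is kept
    # as a separate ethosu_identity, as in the assertions below.
```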
```python
if has_separate_requantize:
    # In the case where requantize cannot be fused with MIN/MAX + CLIP due to
    # hardware constraints, there should be default quantization values,
    # since requantize is a separate operation.
    assert float(op.attrs.ifm_scale) == 1.0
    assert int(op.attrs.ifm_zero_point) == 0
    assert float(op.attrs.ifm2_scale) == 1.0
    assert int(op.attrs.ifm2_zero_point) == 0
    assert float(op.attrs.ofm_scale) == 1.0
    assert int(op.attrs.ofm_zero_point) == 0
else:
    # MIN and MAX with an activation must have a requantize operation
    # baked into the output. To check that the extra requantize node was
    # picked up by the pattern, we can make sure the quantization
    # information is not default.
    assert float(op.attrs.ifm_scale) != 1.0
    assert int(op.attrs.ifm_zero_point) != 0
    assert float(op.attrs.ifm2_scale) != 1.0
    assert int(op.attrs.ifm2_zero_point) != 0
    assert float(op.attrs.ofm_scale) != 1.0
    assert int(op.attrs.ofm_zero_point) != 0
```
Do both of these blocks get run? It looks like we are using the same method of generating the representative dataset (which will determine the qnn params) for all the tests, so I suspect we will always create IFMs with differing qnn params and therefore test only one of the patterns here.
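For reference, the shared representative-dataset generation looks roughly like this (a sketch with assumed shapes and sample count; the actual helper in the test suite may differ):

```python
# Sketch: the TFLite converter calibrates quantization params from this
# dataset, so independent random draws for each input typically produce
# IFMs with differing scales and zero points.
import numpy as np

def representative_dataset():
    for _ in range(100):
        yield [
            np.random.rand(1, 2, 3, 4).astype(np.float32),
            np.random.rand(1, 2, 3, 4).astype(np.float32),
        ]
```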
Yes, both of these blocks get run. The first block is run for cases with the MAX operation and relu_n1_to_1 activation, for example test_tflite_binary_elemwise_legalize[relu_n1_to_1-ifm_shape0-ifm2_shape0-False-MAX]:
```
fn (%tvmgen_default_ethos_u_main_0_ifms: Tensor[(48), int8] /* ty=Tensor[(48), int8] */, Inline=1, Compiler="ethos-u", global_symbol="tvmgen_default_ethos_u_main_0", Primitive=1) -> Tensor[(1, 2, 3, 4), int8] {
  %0 = split(%tvmgen_default_ethos_u_main_0_ifms, indices_or_sections=[24]) /* ty=(Tensor[(24), int8], Tensor[(24), int8]) */;
  %1 = %0.0 /* ty=Tensor[(24), int8] */;
  %2 = %0.1 /* ty=Tensor[(24), int8] */;
  %3 = reshape(%1, newshape=[1, 2, 3, 4]) /* ty=Tensor[(1, 2, 3, 4), int8] */;
  %4 = reshape(%2, newshape=[1, 2, 3, 4]) /* ty=Tensor[(1, 2, 3, 4), int8] */;
  %5 = contrib.ethosu.binary_elementwise(%3, %4, meta[relay.Constant][0] /* ty=Tensor[(0), int8] */, operator_type="MAX", ifm_scale=1f, ifm_zero_point=0, ifm2_scale=1f, ifm2_zero_point=0, ofm_scale=1f, ofm_zero_point=0, ifm_channels=4, ifm2_channels=4, activation="CLIP", clip_min=-128, ofm_dtype="int8") /* ty=Tensor[(1, 2, 3, 4), int8] */;
  contrib.ethosu.identity(%5, meta[relay.Constant][1] /* ty=Tensor[(0), int8] */, ifm_scale=0.00783747f, ifm_zero_point=-128, ofm_scale=0.00392157f, ofm_zero_point=-128) /* ty=Tensor[(1, 2, 3, 4), int8] */
} /* ty=fn (Tensor[(48), int8]) -> Tensor[(1, 2, 3, 4), int8] */
```
In this case the scales are different; in the other cases they are the same.
Ok cool, thanks for clarifying :)
Force-pushed from d3e9229 to c07d019.
Thanks @alexey-yazev, LGTM!
Min and max operations are no longer fused with requantize when the scales differ, since this is not supported on the NPU: because of hardware constraints (see the NPU_SET_OFM_SCALE register description at https://developer.arm.com/documentation/102420/0200/Programmers-model/Command-stream/cmd1-commands-), a min or max operation cannot be fused with a requantize when the scales are different.
min/max operations with matching scales are offloaded to the NPU as ethosu_binary_elementwise
min/max operations with different scales are offloaded to the NPU as ethosu_binary_elementwise + ethosu_identity
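Conceptually, the constraint reduces to a check like the following (a simplified sketch, not the actual pattern-matcher code in TVM):

```python
# Simplified sketch: a MIN/MAX can only absorb the surrounding requantize
# when the input and output quantization parameters match.
def can_fuse_requantize(ifm_scale, ifm_zero_point, ofm_scale, ofm_zero_point):
    return ifm_scale == ofm_scale and ifm_zero_point == ofm_zero_point

# Matching scales  -> single ethosu_binary_elementwise (requantize fused)
# Differing scales -> ethosu_binary_elementwise + ethosu_identity, where the
#                     identity performs the requantization as a separate op
```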
cc @leandron, @ekalda, @lhutton1