
[TFLite] QNN support for TFLite 2.1.0 quantized models #5848

Merged: 9 commits into apache:master from anijain2305:tf2_qnn on Jul 3, 2020

Conversation

@anijain2305 (Contributor) commented Jun 18, 2020:

TFLite 2.1.0 changed the quantization spec from older TFLite versions. The spec is documented at https://www.tensorflow.org/lite/performance/quantization_spec

Key changes

  • No TFLite 2.0 hosted models; we need a way to create quantized models. This adds a dependency on the
    tensorflow_hub package (@tqchen, please advise here). A hedged converter sketch follows this list.
  • Per-axis (aka channel) quantization
  • Int8 datatypes
  • Extra restrictions per op regarding scale and zero points
  • Quantize TFLite op also works as Requantize
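
Such quantized models can be produced with the TensorFlow 2.x post-training quantization flow. A minimal sketch, assuming a hypothetical toy Keras model and random calibration data (this is not the script used by the PR's tests, and the input/output type lines may need adjusting per TF version):

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model; any float Keras model works here.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, input_shape=(16, 16, 3)),
    tf.keras.layers.ReLU(max_value=6),
])

def representative_dataset():
    # Calibration data drives the scale / zero-point estimation.
    for _ in range(8):
        yield [np.random.rand(1, 16, 16, 3).astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full int8 quantization per the TFLite 2.x quantization spec.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_quant_model = converter.convert()
```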

@anijain2305 anijain2305 force-pushed the tf2_qnn branch 3 times, most recently from 0bbef96 to bd1cc7c on June 24, 2020 09:10
@anijain2305 anijain2305 changed the title [TFLite] TFLite 2.x parser quantization support. [TFLite] QNN support for TFLite 2.1.0 quantized models Jun 24, 2020
@anijain2305 anijain2305 marked this pull request as ready for review June 24, 2020 18:00
@anijain2305 (Contributor, Author) commented:
Opening the PR for review. @siju-samuel @masahi @u99127 @tqchen

The CI fails because of a missing package - tensorflow_hub

@u99127 (Contributor) left a comment:

Very quick review before I disappear for the day.

I think this is on the right track but needs some cleanups and some clarifications.

Dmitriy @d-smirnov, could you also cast your eye over this, since you've been looking at it internally?

scale = tflite_scale
# Ensure that all zero points are identical
zero_point = tflite_zero_point
assert all(x == zero_point[0] for x in zero_point)
Review comment (Contributor):

Minor nit: can we use an error here instead of an assert, to show clearly the change that has happened? It also means we can provide a sensible diagnostic.
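
A minimal sketch of that suggestion, reusing the names from the snippet above (the specific exception class is an illustrative choice, not the code that was merged):

```python
# Sketch only: replace the bare assert with an explicit, diagnosable error.
if not all(x == zero_point[0] for x in zero_point):
    raise NotImplementedError(
        "TFLite per-axis quantization with non-identical zero points "
        "is not supported, got {}".format(zero_point))
```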

Comment on lines 246 to 250
# Params might be per-tensor or per-axis quantized. For per-tensor, scale and zero
# points are scalar. For per-axis, scale and zero points are tensors. But as per
# the TFLite quantization spec, the restrictions on ops suggest that for per-axis,
# even if the zero point is a tensor, all the zero points are identical. More
# information here - https://www.tensorflow.org/lite/performance/quantization_spec
Review comment (Contributor):

To be clear, we are interpreting this from the fact that Conv2d and Depthwise_conv2d have a zero_point of 0 listed in their restriction even though they have per-axis quantization.

I would make the comment more explicit.

For per-axis (per-channel) quantization, the scale and zero points for the weights are tensors (?)

Comment on lines 2613 to 2707
input_tensors = self.get_input_tensors(op)
assert len(input_tensors) == 1, "input tensors length should be 1"
input_tensor = input_tensors[0]
input_tensor_type_str = self.get_tensor_type_str(input_tensor.tensor.Type())
in_expr = self.get_expr(input_tensor.tensor_idx)

output_tensors = self.get_output_tensors(op)
assert len(output_tensors) == 1, "output tensors length should be 1"
output_tensor = output_tensors[0]
output_tensor_type_str = self.get_tensor_type_str(output_tensor.tensor.Type())

# The output must be quantized
assert output_tensor.qnn_params

# The TFLite Quantize op can also act as a Requantize op
if input_tensor_type_str == "float32":
    # Quantize the float32 input
    out = self.quantize(in_expr, output_tensor)
else:
    # Requantize the already-quantized input to the output qnn params
    out = _qnn.op.requantize(in_expr,
                             input_scale=input_tensor.qnn_params['scale'],
                             input_zero_point=input_tensor.qnn_params['zero_point'],
                             output_scale=output_tensor.qnn_params['scale'],
                             output_zero_point=output_tensor.qnn_params['zero_point'],
                             out_dtype=output_tensor_type_str)
return out
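
As background for the requantize branch: requantize re-expresses a quantized value under new affine parameters. Since real = s_in * (q_in - zp_in), the output is q_out = zp_out + round(s_in / s_out * (q_in - zp_in)), clamped to the output dtype's range. A tiny numeric sketch with made-up parameters:

```python
import numpy as np

# Assumed example parameters, not taken from any model.
s_in, zp_in = 0.05, -3     # input scale / zero point
s_out, zp_out = 0.1, 10    # output scale / zero point

q_in = np.int32(41)                        # quantized input value
real = s_in * (q_in - zp_in)               # 0.05 * 44 = 2.2
q_out = zp_out + np.round(real / s_out)    # 10 + 22 = 32
q_out = np.clip(q_out, -128, 127).astype(np.int8)
```
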
Review comment (Contributor):

To me this looks like it could go in on its own as a separate PR, but it needs a unit test change in tflite/test_forward.py.

Reply (Contributor, Author):

You are right. I will add a test case in this PR. This will also let us keep those 5 end-to-end tests.

Reply (Contributor, Author):

[screenshot of the added test case]

The above test case was added to force both types of quantize nodes.

shape = tensor_wrapper.tensor.ShapeAsNumpy()

# Set shape to 1 if the data is a scalar type
if data.shape == (1,) and isinstance(shape, int) and shape == 0:
Review comment (Contributor):

I'm scratching my head at this condition with shape. Can you elaborate on why we need it?

Reply (Contributor, Author):

I modified the comments. Can you take a look now? This is sort of a corner case when the TFLite buffer is a scalar.
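
For concreteness, a small standalone illustration of the corner case (hedged: the literal byte buffer is made up, but this is the shape-0 situation the condition guards against):

```python
import numpy as np

shape = 0                                      # tensor.ShapeAsNumpy() gives the int 0 for a scalar tensor
data = np.frombuffer(b"\x05", dtype=np.int8)   # decoded buffer of a scalar: one element, shape (1,)
if data.shape == (1,) and isinstance(shape, int) and shape == 0:
    shape = (1,)                               # normalize the scalar's shape
```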

@anijain2305 (Contributor, Author) commented:

Thanks @u99127 for the review. It all makes sense; I will add clarifications.

(Additional review threads on python/tvm/relay/frontend/tflite.py and tests/python/frontend/tflite/test_forward.py were marked resolved.)
@anijain2305 anijain2305 force-pushed the tf2_qnn branch 3 times, most recently from 5aa8eac to 05fad24 on June 25, 2020 21:32
@anijain2305 (Contributor, Author) commented:

@u99127 @siju-samuel CI passes now. Can you please take another look? There were more changes, mostly adding unit-level tests.

                    zero_point=zero_point_val,
                    dtype=output_tensor_type_str)
else:
    out = _op.clip(in_expr, a_min=0, a_max=6)
Review comment (Member):

This should be relu, not clip
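
A sketch of the suggested fix, reusing in_expr from the snippet above (assuming this is the float path of the plain ReLU converter):

```python
# Sketch only: the non-quantized ReLU path should be an unbounded relu,
# not a 0..6 clip.
out = _op.nn.relu(in_expr)
```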

# Define a dummy model
# Keras model to force TFLite converter to insert 2 TFLite quantize ops.
# First TFLite quantize op converts float32 tensor to int8 tensor - Qnn quantize.
# Second TLite quantize op converts int8 tensor to int8 tensor - Qnn requantize.
Review comment (Member):

TLite -> TFLite

@siju-samuel (Member) commented:

LGTM. Any more changes pending?

@anijain2305 (Contributor, Author) commented:

@u99127 ping

@u99127 (Contributor) left a comment:

Sorry about the delay - it's been a bit busy.

if data.size == 1 and isinstance(shape, int) and shape == 0:
    shape = (1,)

if tensor_wrapper.tensor.Type() == TensorType.INT8:
Review comment (Contributor):

Minor nit: this should really be credited to Dmitriy Smirnov (https://github.com/d-smirnov).

The condition here could well be pulled out into a helper function with a dictionary that maps TensorType to numpy type. That would make the code much cleaner and reduce duplication. I.e., something like:

def get_tensor_type_as_numpy(self, tensor_wrapper):
    """Returns np.dtype out of TensorType"""
    assert isinstance(tensor_wrapper, TensorWrapper)

    try:
        from tflite.TensorType import TensorType
        return {TensorType.UINT8:   np.uint8,
                TensorType.INT8:    np.int8,
                TensorType.FLOAT32: np.float32,
                TensorType.INT32:   np.int32,
                TensorType.INT64:   np.int64,
                TensorType.BOOL:    np.bool_}[tensor_wrapper.tensor.Type()]
    except ImportError:
        raise ImportError("The tflite package must be installed")
    except KeyError:
        raise NotImplementedError("Tensor type '{}' currently not supported"
                                  .format(tensor_wrapper.tensor.Type()))

def get_tensor_value(self, tensor_wrapper):
    """Get tensor buffer value from given tensor wrapper"""
    assert isinstance(tensor_wrapper, TensorWrapper)

    value_type = self.get_tensor_type_as_numpy(tensor_wrapper)
    return np.frombuffer(tensor_wrapper.buffer.DataAsNumpy(), dtype=value_type).reshape(
        tensor_wrapper.tensor.ShapeAsNumpy()
        if 0 != tensor_wrapper.tensor.ShapeLength()
        else [])

Review comment (Contributor):

Please consider the following change as well:

def has_same_qnn_params(self, lhs_tensor, rhs_tensor):
    lhs_scale = lhs_tensor.qnn_params['scale']
    rhs_scale = rhs_tensor.qnn_params['scale']
    lhs_zero_point = lhs_tensor.qnn_params['zero_point']
    rhs_zero_point = rhs_tensor.qnn_params['zero_point']
    return np.allclose(lhs_scale.data.asnumpy(), rhs_scale.data.asnumpy(),
                       rtol=1e-5, atol=1e-5) and \
           np.allclose(lhs_zero_point.data.asnumpy(), rhs_zero_point.data.asnumpy(),
                       rtol=1e-5, atol=1e-5)


Reply (Contributor, Author):

For the first comment, thanks, let me take a look.

For the second suggestion, has_same_qnn_params, I think we do not need it. All the ops where we have to check that the params are the same have scalar scale and zero point, because per-axis quantization is limited to weights, and thus to the conv2d and dense ops, where this check is not needed.

Comment on lines +704 to +708
try:
    from tflite.ActivationFunctionType import ActivationFunctionType
except ImportError:
    raise ImportError("The tflite package must be installed")

Review comment (Contributor):

I think this is unnecessary given the import of ActivationFunctionType in the constructor here

Reply (Contributor, Author):

I tried this, but it failed: the scope of imports is limited to the functions in which they are imported.
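
A standalone illustration of that scoping point, using a hypothetical class (not the parser's actual structure):

```python
class Converter:
    def __init__(self):
        # This name is local to __init__ and invisible to other methods.
        from tflite.ActivationFunctionType import ActivationFunctionType

    def convert_relu6(self):
        # Referencing ActivationFunctionType here without re-importing
        # raises NameError, hence the per-function imports.
        from tflite.ActivationFunctionType import ActivationFunctionType
        return ActivationFunctionType.RELU6
```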

@@ -692,6 +773,11 @@ def _hard_swish(data):

def convert_relu6(self, op):
    """Convert TFLite ReLU6"""
    try:
        from tflite.ActivationFunctionType import ActivationFunctionType
Review comment (Contributor):

Same as relu: I think this is unnecessary given the import of ActivationFunctionType in the constructor here.

Reply (Contributor, Author):

Same as before

_test_convolution([4, 17, 17, 124], [1, 1, 124, 1], [1, 1], [1, 1], 'SAME', 'NHWC', True, quantized=quantized)
_test_convolution([4, 17, 17, 12], [3, 3, 12, 1], [1, 1], [2, 2], 'VALID', 'NHWC', True, quantized=quantized)
_test_convolution([4, 17, 17, 12], [3, 3, 12, 2], [1, 1], [2, 2], 'VALID', 'NHWC', True, quantized=quantized)
# dephtwise convolution with single input channel
Review comment (Contributor):

dephtwise / depthwise.

@anijain2305 (Contributor, Author) commented:

@u99127 can you please take a look again?

@u99127 (Contributor) commented Jul 2, 2020:

> @u99127 can you please take a look again?

LGTM.

@u99127 (Contributor) left a comment:

LGTM and thanks for your patience.

@anijain2305 (Contributor, Author) commented:

@siju-samuel Please approve and merge if it looks good :)

@siju-samuel (Member) left a comment:

LGTM.

@siju-samuel siju-samuel merged commit 575a383 into apache:master Jul 3, 2020
@siju-samuel (Member) commented:

Thanks @anijain2305 @u99127 @tqchen. This is now merged.

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jul 14, 2020
* [TFLite] TFLite 2.x parser quantization support.

* Address comments. Fix a bug for depthwise conv

* Added tests for relu, conv, quantize. Address comments.

* Using web-data. Minor refactoring.

* Removing TF hub package

* Trigger CI.

* Handle TFLite input layer naming.

* Addressing reviews.

* Retrigger CI.
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jul 14, 2020 (same commit list as above)