
Mxnet parser for Qnn dialect #4714

Merged: 23 commits, Feb 5, 2020

Conversation

shoubhik (Contributor):

In this parser, I am including the changes needed for the QNN dialect for MXNet. It has been tested with most of the open-source MXNet models.

@shoubhik (Contributor, Author):

@anijain2305, @jackwish, can you take a look at this?

@liangfu (Member) left a comment:

Thanks for bringing in this favorable change.

Please add test scripts following test_qnn_ops_utils.py.

@@ -14,12 +14,14 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, import-self, len-as-condition, no-else-return
# pylint: disable=invalid-name, import-self, len-as-condition, no-else-return, too-many-lines
Member:

Avoid disabling the linter at the head of a file. Instead, you can disable it in specific functions or classes.

Contributor Author:

This particular one applies to the whole file: pylint complains that there are too many lines in the file.
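
For reference, a minimal sketch of the two suppression scopes (assuming standard pylint behavior; too-many-lines is a module-level check, so it can only be disabled at file scope):

# pylint: disable=too-many-lines  # module-level check; only effective at file scope

def _mx_fully_connected(inputs, attrs):
    # Function-scoped suppression: applies only inside this function.
    import mxnet as mx  # pylint: disable=import-outside-toplevel
    ...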


zero_centered_uint8_quantized_range = np.float32(255)
zero_centered_int8_quantized_range = np.float32(127)
zero_centered_uint8_quantized_range = np.float32(255.5)
Member:

Can you explain why we are adding an extra 0.5 here?

Contributor Author:

Added reference.

Contributor:

Can you explain why we are adding an extra 0.5 here?

I guess this is related to the rounding policy and the quantization arithmetic?
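
For illustration, a hedged sketch of how this constant could enter the scale computation (inferred from the surrounding discussion, not the PR's exact code; the extra 0.5 presumably compensates for MKLDNN's rounding at the ends of the range):

import numpy as np

zero_centered_uint8_quantized_range = np.float32(255.5)

def get_mkldnn_uint8_scale(range_min, range_max):
    # Map the calibrated real range onto the zero-centered uint8 range;
    # the widened denominator (255.5 instead of 255) accounts for rounding.
    real_range = np.maximum(np.abs(np.float32(range_min)),
                            np.abs(np.float32(range_max)))
    return np.divide(real_range, zero_centered_uint8_quantized_range)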

src/relay/qnn/op/convolution.cc (outdated, resolved)
data_layout = attrs.get_str("layout", "NCHW")
if len(kernel_size) != 2:
raise tvm.error.OpAttributeInvalid(
'Non 1D or 2D kernels are not supported for operator Convolution')
Member:

Only 2D kernels are supported.

Contributor Author:

Fixed the message.

Member:

Actually, we do have support for 1D and 3D conv in TOPI now.

@shoubhik (Contributor, Author) commented Jan 17, 2020:

This condition seems to be from an older version of MXNet. It makes more sense to put this change in a separate PR after more extensive testing.
The latest MXNet (https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/convolution.cc#L364-L368) does support 1-, 2-, and 3-dimensional kernels.
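
For reference, the corrected check presumably ends up along these lines (a sketch based on the exchange above; the PR keeps the 2D-only restriction and only fixes the message):

if len(kernel_size) != 2:
    raise tvm.error.OpAttributeInvalid(
        'Only 2D kernels are supported for operator Convolution')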

@@ -1075,6 +1098,377 @@ def _mx_cond(inputs, attrs, subgraphs):
return ret


def _qnn_mx_contrib_quantize(inputs, attrs):
Member:

Can we put the functions with the _qnn_mx prefix into the mxnet_qnn_op_utils.py file as well?

Contributor Author:

Renamed the ops so that they no longer have the mx prefix.

- Removing redundant commented code.
- Removing unused methods.
@masahi (Member) commented Jan 17, 2020:

It would be great if you could add a test case for end-to-end network translation, like in our TFLite test case.

# Conflicts:
#	src/relay/qnn/op/convolution.cc
@zhenhuaw-me (Contributor) left a comment:

I looked into some of the code; however, as I am not familiar with the MXNet code and do not fully understand the naming (maybe a codebase issue), I have only these comments so far... Hoping someone else can check further :)

def _mx_fully_connected(inputs, attrs):
import mxnet as mx
import mxnet as mx #pylint: disable=import-outside-toplevel
Contributor:

Why do we need this change?

Contributor Author:

Running pylint locally, I was getting this warning.

@@ -1033,6 +1054,7 @@ def _mx_contrib_fifo_buffer(inputs, attrs):
new_attrs['axis'] = attrs.get_int('axis')
return _op.nn.fifo_buffer(*inputs, **new_attrs)


Contributor:

not needed?

out_dtype = 'int8'
else:
out_dtype = out_type
if out_dtype not in {'int8', 'uint8'}:
Contributor:

It seems that this check actually validates out_type? If that is the case, can we move it to where we obtain the out_type value?

Contributor Author:

out_dtype can be 'auto' too, in which case we infer it based on the min and max values.
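
A minimal sketch of the 'auto' handling described here (assuming MXNet's quantize_v2 convention that a non-negative calibrated range maps to uint8; names mirror the snippet above, and min_calib_range is assumed to have been parsed earlier):

out_type = attrs.get_str('out_type', 'int8')
if out_type == 'auto':
    # A calibrated range that never goes negative fits in uint8;
    # otherwise fall back to int8.
    out_dtype = 'uint8' if min_calib_range >= 0.0 else 'int8'
else:
    out_dtype = out_type
if out_dtype not in {'int8', 'uint8'}:
    raise ValueError('Unsupported out_dtype %s for quantize' % out_dtype)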

Comment on lines +1113 to +1114
min_calib_range = attrs.get_float('min_calib_range', 0.0)
max_calib_range = attrs.get_float('max_calib_range', 0.0)
Contributor:

That's interesting: if no min/max_calib_range is provided, we are expecting the range to be [0.0, 0.0)? I am not very familiar with the MXNet calibration method, but it appears to me that the default min/max could be derived from out_dtype?

Contributor Author:

You are correct, the zeros should not be there. In fact, there are min and max operators. I'll fix this.
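
A sketch of the fix described here (hypothetical control flow; derives the range from the data with Relay's min/max reduction ops when the calibration attributes are absent):

min_calib_range = attrs.get_float('min_calib_range', None)
max_calib_range = attrs.get_float('max_calib_range', None)
if min_calib_range is None or max_calib_range is None:
    # No calibrated range supplied: fall back to computing it
    # from the input tensor itself.
    min_calib_range = _op.min(inputs[0])
    max_calib_range = _op.max(inputs[0])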


zero_centered_uint8_quantized_range = np.float32(255)
zero_centered_int8_quantized_range = np.float32(127)
zero_centered_uint8_quantized_range = np.float32(255.5)
Contributor:

Can you explain why we are adding an extra 0.5 here?

I guess this is related to the rounding policy and the quantization arithmetic?

@shoubhik (Contributor, Author):

It would be great if you could add a test case for end-to-end network translation, like in our TFLite test case.

I have created a post on Discuss about how we can do this: https://discuss.tvm.ai/t/use-mxnet-mkldnn-distribution-in-ci-instead-of-stock-mxnet-distribution/5474

@anijain2305 (Contributor) commented Jan 21, 2020:

With the latest commit, we have good accuracy for all MxNet-MKLDNN quantized models.

@tmoreau89 @yzhliu @tqchen @jackwish @FrozenGene @liangfu @vinx13

| Model | Mxnet-Top1 | Mxnet-Top5 | TVM-Top1 | TVM-Top5 | Degradation-Top1 | Degradation-Top5 |
| --- | --- | --- | --- | --- | --- | --- |
| Resnet18_v1 | 69.76 | 89.02 | 69.85 | 89.09 | -0.09 | -0.07 |
| Resnet50_v1 | 76.13 | 92.6 | 75.9 | 92.66 | 0.23 | -0.06 |
| Resnet50_v1b | 76.66 | 92.6 | 76.45 | 92.57 | 0.21 | 0.03 |
| Resnet101_v1 | 77.13 | 93.06 | 77 | 93.06 | 0.13 | 0 |
| Resnet152_v2 | 75.99 | 92.52 | 75.32 | 92.26 | 0.67 | 0.26 |
| Inception-V3 | 77.84 | 93.52 | 77.28 | 93.32 | 0.56 | 0.2 |
| Inception-BN | 71.96 | 90.38 | 71.79 | 90.25 | 0.17 | 0.13 |
| MobileNetV1 | 71.27 | 90.09 | 71.13 | 90.16 | 0.14 | -0.07 |
| MobileNetV2 | 70.35 | 89.45 | 70.19 | 89.5 | 0.16 | -0.05 |

Quantized models are generated using this: https://github.com/apache/incubator-mxnet/tree/master/example/quantization

Accuracy is collected over 10,000 inputs with target = 'llvm'.

@masahi self-assigned this Jan 26, 2020
@shoubhik (Contributor, Author) commented Feb 3, 2020:

@tmoreau89 @yzhliu @tqchen @jackwish @FrozenGene @liangfu @vinx13 @anijain2305 I want to discuss this PR from a testability point of view. MxNet suggests using MKLDNN as the backend for quantization. The operators for MKLDNN quantization in MxNet are different from the stock MxNet quantization operators, and for some ops the implementations differ as well.
Due to PRs #4753 and #4764, it is not possible to merge the MxNet-MKLDNN installation into the TVM docker image. One potential solution is to test the stock quantization operators, but that would be extra work that may not be too useful in the future. All the testing we have done and the edge cases we have covered are for the MKLDNN operators.
My suggestion is that at this point we check in the parser code along with the test scripts we have for benchmarking the QNN networks. When the MKLDNN issues are fixed and we can safely upgrade, we can add test cases at that point.

@anijain2305 (Contributor):

Given the constraints, I am ok with the MxNet-parser code w/o any MxNet-MKLDNN CI testing, as long as we have a test script linked in the PR that somebody can use locally to test the changes.

Once the MxNet-MKLDNN CI/accuracy issues are resolved, we can rewrite that script as a test.

Is that ok? @tqchen @icemelon9

@FrozenGene (Member):

@tmoreau89 @yzhliu @tqchen @jackwish @FrozenGene @liangfu @vinx13 @anijain2305 I want to discuss this PR from a testability point of view. MxNet suggests using MKLDNN as the backend for quantization. The operators for MKLDNN quantization in MxNet are different from the stock MxNet quantization operators, and for some ops the implementations differ as well.
Due to PRs #4753 and #4764, it is not possible to merge the MxNet-MKLDNN installation into the TVM docker image. One potential solution is to test the stock quantization operators, but that would be extra work that may not be too useful in the future. All the testing we have done and the edge cases we have covered are for the MKLDNN operators.
My suggestion is that at this point we check in the parser code along with the test scripts we have for benchmarking the QNN networks. When the MKLDNN issues are fixed and we can safely upgrade, we can add test cases at that point.

I am OK with this.

@shoubhik (Contributor, Author) commented Feb 4, 2020:

@tqchen @yzhliu I think the code can be merged now.

_, arg_params, aux_params = get_mxnet_output(mx_symbol, x)
quantize_api = quantize_with_old_api if use_old_mxnet_quantization_api() else quantized_with_new_api


Member:

Remove this whitespace.
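
For illustration, such a switch could be implemented with a version check (a sketch; the helper name comes from the diff above, while the version cutoff is an assumption):

def use_old_mxnet_quantization_api():
    # Pick the quantization entry point based on the installed MXNet
    # version; the 1.6 cutoff here is illustrative, not authoritative.
    import mxnet as mx
    major, minor = (int(v) for v in mx.__version__.split('.')[:2])
    return (major, minor) < (1, 6)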

@masahi (Member) commented Feb 4, 2020:

@shoubhik I'll have a look and merge. I'm interested in this PR.

@masahi merged commit 7d263c3 into apache:master Feb 5, 2020
@masahi (Member) commented Feb 5, 2020:

Thanks @shoubhik @anijain2305 @jackwish @FrozenGene @liangfu, this is merged.

@masahi (Member) commented Feb 9, 2020:

My suggestion is that at this point we check in the parser code along with the test scripts we have for benchmarking the QNN networks.

Where are those test scripts? I want to run them. @shoubhik @anijain2305

alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 26, 2020
* - Additional util methods needed for mxnet frontend for qnn dialect.

* - Fixing call to quantize.

* [QNN] MxNet-MKLDNN parser support for QNN

* [QNN] Relax conv check.

* - Merge from origin

* [QNN] Channel wise changes

* [QNN] Dense changes

* Dense fix for QNN ops.

* - Removed non-mkl code from utils.

- Small refactoring

- Remove "with_sum" from conv

- Simplified code

* - Fixing ring buffer name.

* - Fixing pylint issues.

* - Fixing lint
- Removing redundant commented code.

* - Adding test cases
- Removing unused methods.

* [WIP] end to end test case for mxnet qnn parser

* Changes to parse large CV models.

* Pylint issues.

* Fix Conv2D with sum and quantized pooling.

* Reverting the changes made for mxnet-mkldnn test cases. Because of apache#4753, mxnet could not be updated to mxnet-mkldnn.

Co-authored-by: Animesh Jain <anijain@umich.edu>
alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 28, 2020
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2020
shoubhik added a commit to shoubhik/incubator-tvm that referenced this pull request May 12, 2020