Mxnet parser for Qnn dialect #4714
Conversation
…to qnn-mxnet-parser # Conflicts: # python/tvm/relay/frontend/mxnet_qnn_op_utils.py - Working code.
- Small refactoring - Remove "with_sum" from conv - Simplified code
@anijain2305, @jackwish can you take a look at this?
Thanks for bringing in this favorable change.
Please add test scripts following test_qnn_ops_utils.py.
```
@@ -14,12 +14,14 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
-# pylint: disable=invalid-name, import-self, len-as-condition, no-else-return
+# pylint: disable=invalid-name, import-self, len-as-condition, no-else-return, too-many-lines
```
Avoid disabling the linter at the head of a file. Instead, you can disable the checks in specific functions or classes.
This particular one applies to the file as a whole: pylint complains that there are too many lines in the file.
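For checks that are function-scoped (unlike `too-many-lines`, which only applies at module level), the disable can indeed be placed inside the offending function; a minimal illustration with a hypothetical helper:

```python
def _parse_attrs(attrs):
    # Scope the disable to this function instead of the whole file.
    # pylint: disable=len-as-condition
    if len(attrs):
        return dict(attrs)
    return {}
```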
```
-zero_centered_uint8_quantized_range = np.float32(255)
 zero_centered_int8_quantized_range = np.float32(127)
+zero_centered_uint8_quantized_range = np.float32(255.5)
```
Can you explain why we are adding an extra 0.5 here?
Added a reference.
> Can you explain why we are adding an extra 0.5 here?

I guess this is related to the rounding policy and the quantization arithmetic?
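A minimal sketch of the arithmetic under discussion, assuming the zero-centered scheme where the scale is derived as real range over quantized range (the half-step reading of the rounding behaviour is my interpretation of this thread, not a confirmed statement of MKLDNN's policy):

```python
import numpy as np

zero_centered_uint8_quantized_range = np.float32(255.5)

def get_uint8_scale(range_min, range_max):
    # Zero-centered: the scale must cover the larger of |min| and |max|.
    real_range = np.maximum(np.abs(np.float32(range_min)),
                            np.abs(np.float32(range_max)))
    # Dividing by 255.5 instead of 255 leaves half a quantization step
    # of headroom, so values that round to the extreme levels still
    # fall inside the representable range.
    return np.divide(real_range, zero_centered_uint8_quantized_range)
```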
python/tvm/relay/frontend/mxnet.py (Outdated)
```
data_layout = attrs.get_str("layout", "NCHW")
if len(kernel_size) != 2:
    raise tvm.error.OpAttributeInvalid(
        'Non 1D or 2D kernels are not supported for operator Convolution')
```
Only 2D kernels are supported.
Fixed the message.
Actually, we do have support for 1D and 3D conv in TOPI now.
This condition seems to be from an older version of MxNet. It makes more sense to put this change in a separate PR after more extensive testing.
The latest MxNet (https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/convolution.cc#L364-L368) does support 1-, 2-, and 3-dimensional kernels.
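If the check were relaxed in a follow-up, the dispatch could look something like the sketch below (a hypothetical helper, not code from this PR; it assumes Relay exposes conv1d/conv2d/conv3d for the dtype in question):

```python
import tvm
from tvm.relay import op as _op

def _conv_for_kernel(data, weight, kernel_size, new_attrs):
    # Pick the convolution operator matching the kernel rank.
    convs = {1: _op.nn.conv1d, 2: _op.nn.conv2d, 3: _op.nn.conv3d}
    ndim = len(kernel_size)
    if ndim not in convs:
        raise tvm.error.OpAttributeInvalid(
            'Only 1D, 2D, and 3D kernels are supported for operator Convolution')
    return convs[ndim](data, weight, **new_attrs)
```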
python/tvm/relay/frontend/mxnet.py (Outdated)
```
@@ -1075,6 +1098,377 @@ def _mx_cond(inputs, attrs, subgraphs):
     return ret


+def _qnn_mx_contrib_quantize(inputs, attrs):
```
Can we put the functions with the `_qnn_mx` prefix into the mxnet_qnn_op_utils.py file as well?
Renamed the ops so that they no longer have the `mx` prefix in them.
- Removing redundant commented code.
- Removing unused methods.
It would be great if you could add a test case for end-to-end network translation, like in our TFLite test cases.
# Conflicts: # src/relay/qnn/op/convolution.cc
I looked into some of the code; however, as I am not familiar with the MxNet code and don't fully understand the naming (maybe a codebase issue), I have only these comments so far... Hoping someone else can check further :)
```
 def _mx_fully_connected(inputs, attrs):
-    import mxnet as mx
+    import mxnet as mx  # pylint: disable=import-outside-toplevel
```
Why do we need this change?
Running pylint locally, I was getting this warning.
```
@@ -1033,6 +1054,7 @@ def _mx_contrib_fifo_buffer(inputs, attrs):
     new_attrs['axis'] = attrs.get_int('axis')
     return _op.nn.fifo_buffer(*inputs, **new_attrs)
+
```
Not needed?
```
    out_dtype = 'int8'
else:
    out_dtype = out_type
if out_dtype not in {'int8', 'uint8'}:
```
It seems that this assert actually checks `out_type`? If that is the case, can we move it to where we obtained the `out_type` value?
`out_dtype` can be 'auto' too, in which case we infer it based on the min and max values.
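A hedged sketch of that inference (illustrative only; the exact rule the parser applies may differ):

```python
def _infer_out_dtype(out_type, min_calib_range, max_calib_range):
    # Explicit dtypes pass through unchanged.
    if out_type in ('int8', 'uint8'):
        return out_type
    # 'auto': a non-negative calibrated range fits in uint8, a signed
    # range needs int8 (assumed rule, see the caveat above).
    return 'uint8' if min_calib_range >= 0.0 else 'int8'
```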
```
min_calib_range = attrs.get_float('min_calib_range', 0.0)
max_calib_range = attrs.get_float('max_calib_range', 0.0)
```
That's interesting: if no `min_calib_range`/`max_calib_range` is provided, we are expecting the range to be [0.0, 0.0)? I am not very familiar with the MxNet calibration method, but it appears to me that the default min/max could be derived from `out_dtype`?
You are correct. The zeros should not be there. In fact, there are min and max operators. I'll fix this.
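A minimal sketch of the fix described, assuming the range is computed at runtime with Relay's min/max reductions when no calibrated range is stored:

```python
from tvm import relay

def _runtime_calib_range(data):
    # Fall back to the tensor's own extrema instead of (0.0, 0.0).
    min_range = relay.op.min(data)
    max_range = relay.op.max(data)
    return min_range, max_range
```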
I have created a post on Discuss on how we can do this: https://discuss.tvm.ai/t/use-mxnet-mkldnn-distribution-in-ci-instead-of-stock-mxnet-distribution/5474
With the latest commit, we have good accuracy for all MxNet-MKLDNN quantized models. @tmoreau89 @yzhliu @tqchen @jackwish @FrozenGene @liangfu @vinx13
Quantized models are generated using this: https://github.com/apache/incubator-mxnet/tree/master/example/quantization. Accuracy is collected over 10,000 inputs with target = 'llvm'.
@tmoreau89 @yzhliu @tqchen @jackwish @FrozenGene @liangfu @vinx13 @anijain2305 I want to discuss this PR from a testability point of view. MxNet suggests using MKLDNN as the backend for quantization. The operators for MKLDNN quantization in MxNet are different from the stock MxNet quantization operators, and for some ops the implementations differ as well.
Given the constraints, I am OK with the MxNet-parser code without any MxNet-MKLDNN CI testing, as long as we have a test script linked in the PR that somebody can use locally to test the changes. Once the MxNet-MKLDNN CI/accuracy issues are resolved, we can rewrite that script as a test. Is that OK, @tqchen @icemelon9?
I am OK with this.
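For reference, a local end-to-end check along those lines might look like the sketch below (illustrative only, not the linked script; it assumes a quantized symbol/params pair from MxNet's example/quantization tooling and the graph-runtime API of TVM at the time of this PR):

```python
import tvm
from tvm import relay
from tvm.contrib import graph_runtime

def evaluate_quantized(mx_sym, arg_params, aux_params, data, target='llvm'):
    # Translate the quantized MxNet graph through the QNN-aware frontend.
    mod, params = relay.frontend.from_mxnet(
        mx_sym, {'data': data.shape},
        arg_params=arg_params, aux_params=aux_params)
    graph, lib, params = relay.build(mod, target, params=params)
    # Run one batch and return the raw output for accuracy bookkeeping.
    rt = graph_runtime.create(graph, lib, tvm.cpu())
    rt.set_input(**params)
    rt.set_input('data', data)
    rt.run()
    return rt.get_output(0).asnumpy()
```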
```
_, arg_params, aux_params = get_mxnet_output(mx_symbol, x)
quantize_api = quantize_with_old_api if use_old_mxnet_quantization_api() else quantized_with_new_api


```
Remove these whitespace lines.
@shoubhik I'll have a look and merge. I'm interested in this PR.
…ache#4753, mxnet could not be updated to mxnet-mkldnn.
Thanks @shoubhik @anijain2305 @jackwish @FrozenGene @liangfu, this is merged.
Where are those test scripts? I want to run them. @shoubhik @anijain2305
* Additional util methods needed for mxnet frontend for qnn dialect.
* Fixing call to quantize.
* [QNN] MxNet-MKLDNN parser support for QNN
* [QNN] Relax conv check.
* Merge from origin
* [QNN] Channel wise changes
* [QNN] Dense changes
* Dense fix for QNN ops.
* Removed non-mkl code from utils. Small refactoring. Remove "with_sum" from conv. Simplified code.
* Fixing ring buffer name.
* Fixing pylint issues.
* Fixing lint. Removing redundant commented code.
* Adding test cases. Removing unused methods.
* [WIP] end to end test case for mxnet qnn parser
* Changes to parse large CV models.
* Pylint issues.
* Fix Conv2D with sum and quantized pooling.
* Reverting the changes made for mxnet-mkldnn test cases. Because of apache#4753, mxnet could not be updated to mxnet-mkldnn.

Co-authored-by: Animesh Jain <anijain@umich.edu>
In this parser, I am including the changes needed for the QNN dialect for MxNet. It has been tested with most of the open-source MxNet models.