MKL-DNN QuantizedFullyConnectedOp Error #14467

Closed

Soonhwan-Kwon opened this issue Mar 19, 2019 · 33 comments
Labels
MKLDNN Quantization Issues/Feature Requests related to Quantization

Comments

@Soonhwan-Kwon
Contributor

Soonhwan-Kwon commented Mar 19, 2019

Description

When using FusedRNNCell with the MKL-DNN backend's graph optimization and quantization (experimental), quantization fails with the QuantizedFullyConnectedOp error below:

MXNetError: Error in operator quantized_fusedrnn_t134_i2h: [11:40:16] src/operator/quantization/quantized_fully_connected.cc:41: Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given

Below is pseudocode for the network architecture:

stack = mx.rnn.FusedRNNCell(1760, num_layers=num_layers,
                            mode=fused_rnn_mode, prefix='',
                            bidirectional=bidirectional).unfuse()
net, _ = stack.unroll(length=seq_lengths_references[-1],
                      inputs=net,
                      merge_outputs=False,
                      layout='TNC')

Quantization Code
net = net.get_backend_symbol('MKLDNN')

qnet, qarg_params, qaux_params = quantize_model(sym=net, arg_params={}, aux_params={},ctx=mx.cpu(0), calib_mode='none', quantized_dtype='int8')

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e95ea) [0x7faf97f2b5ea]
[bt] (1) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e9c11) [0x7faf97f2bc11]
[bt] (2) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x9e351c) [0x7faf9852551c]
[bt] (3) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x2deed5a) [0x7faf9a930d5a]
[bt] (4) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x2df1704) [0x7faf9a933704]
[bt] (5) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(MXSymbolInferShape+0x15ba) [0x7faf9a89e40a]
[bt] (6) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fafff188ec0]
[bt] (7) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7fafff18887d]
[bt] (8) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fafff39f8de]
[bt] (9) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(+0x9b31) [0x7fafff395b31]

commit head
mxnet-cu90mkl 1.4.0.post0

@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Build

@TaoLv
Member

TaoLv commented Mar 19, 2019

@Soonhwan-Kwon Could you provide a simple reproducer?

Also, have you ever tried sym = sym.get_backend_symbol('MKLDNN_FC') and used quantized_dtype='uint8' when calling quantize_model?
https://github.com/apache/incubator-mxnet/blob/master/example/quantization/imagenet_gen_qsym_mkldnn.py#L183
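A minimal sketch of that flow (the checkpoint prefix 'model' and epoch 0 are placeholders, and it assumes a build that registers the MKLDNN_FC subgraph property):

import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# Placeholder checkpoint prefix/epoch; replace with your own model files.
sym, arg_params, aux_params = mx.model.load_checkpoint('model', 0)

sym = sym.get_backend_symbol('MKLDNN')     # fuse generic MKL-DNN subgraphs
sym = sym.get_backend_symbol('MKLDNN_FC')  # fuse FullyConnected subgraphs

qsym, qarg_params, qaux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=mx.cpu(0), calib_mode='none', quantized_dtype='uint8')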

@Soonhwan-Kwon
Contributor Author

Soonhwan-Kwon commented Mar 19, 2019

@TaoLv We tried your suggestion before,

$ echo $MXNET_SUBGRAPH_BACKEND
MKLDNN

sym = sym.get_backend_symbol('MKLDNN')
sym = sym.get_backend_symbol('MKLDNN_FC')

and it produces the error below:
MXNetError Traceback (most recent call last)
in ()
1 sym = sym.get_backend_symbol('MKLDNN')
----> 2 sym = sym.get_backend_symbol('MKLDNN_FC')

/home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/symbol/symbol.pyc in get_backend_symbol(self, backend)
2454 """
2455 out = SymbolHandle()
-> 2456 check_call(_LIB.MXGenBackendSubgraph(self.handle, c_str(backend), ctypes.byref(out)))
2457 return Symbol(out)
2458

/home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/base.pyc in check_call(ret)
250 """
251 if ret != 0:
--> 252 raise MXNetError(py_str(_LIB.MXGetLastError()))
253
254

MXNetError: [16:35:10] src/c_api/../operator/subgraph/subgraph_property.h:165: Check failed: it != prop_fn_map_.end() SubgraphProperty MKLDNN_FC is not found in SubgraphPropertyRegistry

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e95ea) [0x7fc8dad265ea]
[bt] (1) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e9c11) [0x7fc8dad26c11]
[bt] (2) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(MXGenBackendSubgraph+0x40b) [0x7fc8dd6911fb]
[bt] (3) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fc94220bec0]
[bt] (4) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7fc94220b87d]
[bt] (5) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fc9424228de]
[bt] (6) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(+0x9b31) [0x7fc942418b31]
[bt] (7) /home/ubuntu/anaconda2/envs/mxnet_1_4/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x43) [0x7fc944fe2973]
[bt] (8) /home/ubuntu/anaconda2/envs/mxnet_1_4/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3bb9) [0x7fc945078d49]
[bt] (9) /home/ubuntu/anaconda2/envs/mxnet_1_4/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7e9) [0x7fc94507e6c9]

We are working on a simple reproducer now and will post the code as soon as possible.

@Soonhwan-Kwon
Contributor Author

@TaoLv Also, quantized_dtype='uint8' produces the same error message as before:

MXNetError: Error in operator quantized_fusedrnn_t134_i2h: [11:40:16] src/operator/quantization/quantized_fully_connected.cc:41: Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given

@TaoLv
Member

TaoLv commented Mar 19, 2019

May I know which version of MXNet you are using? MKL-DNN quantized FullyConnected (QFC) was merged into master recently. PR here: #14128

@Soonhwan-Kwon
Contributor Author

@TaoLv We tried version 1.4.0.post0, which predates that commit. We'll try the latest version you mentioned right away, thank you.

@pengzhao-intel
Contributor

@Soonhwan-Kwon Thanks for reporting the issue.

@Amagong

Amagong commented Mar 19, 2019

Hi.
I hit the same problem using mxnet-cu90mkl 1.5.0b20190314.

First, I converted and saved a trained fused-rnn model.

import argparse
import os
import logging
import mxnet as mx
import gluoncv
from mxnet import gluon, nd, image
from gluoncv import utils
from gluoncv.model_zoo import get_model
from mxnet.contrib.quantization import *
from mxnet.base import SymbolHandle, check_call, _LIB, mx_uint, c_str_array
import ctypes


def save_symbol(fname, sym, logger=None):
    if logger is not None:
        logger.info('Saving symbol into file at %s' % fname)
    sym.save(fname)


def save_params(fname, arg_params, aux_params, logger=None):
    if logger is not None:
        logger.info('Saving params into file at %s' % fname)
    save_dict = {('arg:%s' % k): v.as_in_context(mx.cpu()) for k, v in arg_params.items()}
    save_dict.update({('aux:%s' % k): v.as_in_context(mx.cpu()) for k, v in aux_params.items()})
    mx.nd.save(fname, save_dict)
	
	
logging.basicConfig()
logger = logging.getLogger('logger')
logger.setLevel(logging.INFO)


prefix = 'fused_rnn'
dir_path = './checkpoints/'
prefix = os.path.join(dir_path, prefix)
epoch = 173
batch_size = 900

ctx = mx.cpu(0)

# load and convert
sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)

sym = sym.get_backend_symbol('MKLDNN')
sym = sym.get_backend_symbol('MKLDNN_FC')

excluded_sym_names = []
excluded_sym_names += ['conv0']


logger.info('Quantizing FP32 model %s' % prefix)
qsym, qarg_params, aux_params = quantize_model(sym=sym, arg_params=arg_params, aux_params=aux_params, excluded_sym_names=excluded_sym_names, 
                                                ctx=ctx, calib_mode='none', quantized_dtype='uint8', logger=logger)
												
qsym = qsym.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qsym = qsym.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

sym_name = '%s-symbol.json' % (prefix + '-quantized')
param_name = '%s-%04d.params' % (prefix + '-quantized', epoch)

save_symbol(sym_name, qsym, logger)
save_params(param_name, qarg_params, aux_params, logger)

Then I loaded the converted symbol and the params file.

import numpy as np
import mxnet as mx
import os

q_prefix = 'fused_rnn-quantized'
dir_path = './checkpoints/'
q_prefix = os.path.join(dir_path, q_prefix)
epoch = 173
batch_size = 900

contexts = [mx.context.Context('cpu')]

q_symbol_file = q_prefix + '-symbol.json'

q_symbol = mx.sym.load(q_symbol_file)

q_symbol.simple_bind(ctx=mx.cpu(), data=(900, 137, 9), category=(900, 2))

When I tried simple_bind, it led to the error below:

RuntimeError: simple_bind error. Arguments:
category: (900, 2)
data: (900, 137, 9)
[20:25:39] src/executor/../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x421cd2) [0x7ff6c90dfcd2]
[bt] (1) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x4222b8) [0x7ff6c90e02b8]
[bt] (2) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x31a10f1) [0x7ff6cbe5f0f1]
[bt] (3) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::Init(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::lessstd::string, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocatormxnet::Context > const&, std::vector<mxnet::Context, std::allocatormxnet::Context > const&, std::vector<mxnet::Context, std::allocatormxnet::Context > const&, std::unordered_map<std::string, mxnet::TShape, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&, std::unordered_set<std::string, std::hashstd::string, std::equal_tostd::string, std::allocatorstd::string > const&, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray >, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray >, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray >, std::unordered_map<std::string, mxnet::NDArray, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor*, std::unordered_map<nnvm::NodeEntry, mxnet::NDArray, nnvm::NodeEntryHash, nnvm::NodeEntryEqual, std::allocator<std::pair<nnvm::NodeEntry const, mxnet::NDArray> > > const&)+0x481) [0x7ff6cbe83101]
[bt] (4) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(mxnet::Executor::SimpleBind(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::lessstd::string, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocatormxnet::Context > const&, std::vector<mxnet::Context, std::allocatormxnet::Context > const&, std::vector<mxnet::Context, std::allocatormxnet::Context > const&, std::unordered_map<std::string, mxnet::TShape, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&, std::unordered_set<std::string, std::hashstd::string, std::equal_tostd::string, std::allocatorstd::string > const&, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray >, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray >, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray >, std::unordered_map<std::string, mxnet::NDArray, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor*)+0x1d5) [0x7ff6cbe85835]
[bt] (5) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(MXExecutorSimpleBind+0x2260) [0x7ff6cbdd42f0]
[bt] (6) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7ff736c98ec0]
[bt] (7) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7ff736c9887d]
[bt] (8) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7ff736eaf8de]
[bt] (9) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(+0x9b31) [0x7ff736ea5b31]

@pengzhao-intel
Contributor

@mxnet-label-bot add [MKLDNN, Quantization]

@marcoabreu marcoabreu added MKLDNN Quantization Issues/Feature Requests related to Quantization labels Mar 19, 2019
@ciyongch
Contributor

@Soonhwan-Kwon Is there any 0 dimension in the shape of the input data? Quantized FullyConnected requires all dimensions of the input data to be given.

@pengzhao-intel
Contributor

pengzhao-intel commented Mar 19, 2019

@Soonhwan-Kwon @Amagong Could you provide a minimal reproducible case so that we can help resolve the issue?

You may also need to patch #14466 after excluding the 0-dim layers.

Check failed: !shape_is_none(in_shape->at(0))

@pengzhao-intel
Contributor

The PR #14466 is merged. Please sync up the latest MXNet and build again.

@Soonhwan-Kwon
Contributor Author

Soonhwan-Kwon commented Mar 20, 2019

@pengzhao-intel Thank you for the update. I'm rebuilding MXNet now; @Amagong and I are working on the same project. @ciyongch We excluded the embedding layer (which seems to have a 0 dimension), but it had no effect.

@ciyongch
Contributor

@Soonhwan-Kwon Are you still facing the error "Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given" or "InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:"?
Can you provide a reproducer so we can take a look? :)

@Amagong

Amagong commented Mar 20, 2019

@ciyongch Thank you for your quick response. I'll check with the newly built version.
And I'll prepare a simple reproducer.

@Amagong

Amagong commented Mar 20, 2019

There is currently an "InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:" error.

@ciyongch
Contributor

@Amagong Still the same problem as the original one. Some layer has an input with a 0 dimension in its shape, which is currently not supported by the quantized FullyConnected operator. Please check your current model and exclude all such layers; I guess they all come from the first time step. We're going to enhance the error message to make it easier to tell which operator reports this error.

@anirudh2290
Member

anirudh2290 commented Mar 20, 2019

Can you try

q_symbol.infer_shape_partial(data=(900, 137, 9), category=(900, 2))

The first list corresponds to q_symbol.list_arguments(), the second to q_symbol.list_outputs(), and the third to q_symbol.list_auxiliary_states().
This should indicate which shape is missing.
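For example, a small sketch of mapping the partially inferred shapes back to names (q_symbol and the data/category shapes are taken from the earlier snippet; in MXNet 1.x an unknown dimension shows up as 0):

arg_shapes, out_shapes, aux_shapes = q_symbol.infer_shape_partial(
    data=(900, 137, 9), category=(900, 2))

# Print every argument / aux state whose shape still contains an unknown (0) dim.
for name, shape in zip(q_symbol.list_arguments(), arg_shapes):
    if shape is None or 0 in shape:
        print('argument with unknown dims:', name, shape)
for name, shape in zip(q_symbol.list_auxiliary_states(), aux_shapes):
    if shape is None or 0 in shape:
        print('aux state with unknown dims:', name, shape)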

@Amagong

Amagong commented Mar 21, 2019

@ciyongch @anirudh2290 Thank you for your reply.
@anirudh2290 infer_shape_partial works without error, but binding still fails.

The error above was due to the use of fused-rnn.
Below is a simple reproducer.

import math
import mxnet as mx
from mxnet.contrib.quantization import *

channel_num = 10
conv_layer_filter_dims = [2, 3]
conv_layer_strides = [1,1]
dimension = 5

data_len = 10

data = mx.sym.Variable('data')
label = mx.sym.Variable('label')

# layer stacking
net = mx.sym.Reshape(data=data, shape=(-4, -1, 1, 0, 0))

net = mx.sym.Convolution(data=net,
                         num_filter=channel_num,
                         kernel=tuple(conv_layer_filter_dims),
                         stride=tuple(conv_layer_strides),
                         weight=None,
                         bias=None,
                         no_bias=True,
                         cudnn_tune="fastest",
                         name="conv0")

net = mx.sym.BatchNorm(data=net,
                       eps=0.001,
                       momentum=0.9,
                       fix_gamma=False,
                       use_global_stats=False,
                       output_mean_var=False,
                       name="conv0_batchnorm"
                       )

data_lengths_references = int(math.floor((data_len - conv_layer_filter_dims[0]) / conv_layer_strides[0])) + 1

net = mx.sym.transpose(data=net, axes=(2, 0, 1, 3))
net = mx.sym.Reshape(data=net, shape=(0, 0, -3))



# Fused rnn :
stack = mx.rnn.FusedRNNCell(1024, num_layers=2, mode='rnn_relu', prefix='%s_l0' % ('gru'), bidirectional=False).unfuse()

# lstm : 
'''
stack = mx.rnn.SequentialRNNCell()
cell = mx.rnn.LSTMCell(num_hidden=1760, prefix='%s_l0l0_' % ('gru'))
stack.add(cell)
'''

# gru :
'''
stack = mx.rnn.SequentialRNNCell()
cell = mx.rnn.GRUCell(num_hidden=1760, prefix='%s_l0l0_' % ('gru'))
stack.add(cell)
'''


net, _ = stack.unroll(length=data_lengths_references,
                      inputs=net,
                      merge_outputs=False,
                      layout='TNC'
                     )

net = net[data_lengths_references-1]

net = mx.sym.FullyConnected(data=net, num_hidden=10, no_bias=False, name="classification_fc_layer")

net = mx.sym.SoftmaxOutput(data=net, label=label)

mod = net.simple_bind(ctx=mx.cpu(0), data=(75, data_len, dimension))

# convert to quantize model
net = net.get_backend_symbol('MKLDNN')
net = net.get_backend_symbol('MKLDNN_FC')

excluded_sym_names = []
excluded_sym_names += ['conv0']
 
arg_dict = mod.arg_dict
aux_dict = mod.aux_dict

arg_params = {}
aux_params = {}

for k, v in arg_dict.items():
        arg_params[k] = v
for k, v in aux_dict.items():
        aux_params[k] = v
    
qnet, qarg_params, qaux_params = quantize_model(sym=net, arg_params=arg_params, aux_params=aux_params,
         excluded_sym_names=excluded_sym_names, ctx=mx.cpu(0), calib_mode='none', quantized_dtype='uint8')

qnet = qnet.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qnet = qnet.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

print(qnet.infer_shape(data=(75, data_len, dimension)))
qnet.simple_bind(ctx=mx.cpu(0), data=(75, data_len, dimension))

When the fused-rnn block is used, or when the lstm block is uncommented, the following UserWarning occurs:

UserWarning: Cannot decide shape for the following arguments (0s in shape means unknown dimensions). Consider providing them as input:

And at bind time, the following error occurs:

InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:

@Amagong

Amagong commented Mar 21, 2019

The same error occurs when using RNNCell.
There is no error when using GRUCell.
Is there a problem with how I'm using it?

@stu1130
Contributor

stu1130 commented Mar 22, 2019

@anirudh2290 and I tried to debug the issue. The shape of gru_l0l0_begin_state_0 in the graph is (0, 1024) and it is followed by quantized_fully_connected; the zero dimension of gru_l0l0 is not being inferred, so we need to dive deeper.
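One way to see that from Python is partial shape inference over the graph internals (a sketch; qnet, data_len and dimension are the names from the reproducer above):

# List every internal output whose inferred shape still contains a 0 dimension.
internals = qnet.get_internals()
_, out_shapes, _ = internals.infer_shape_partial(data=(75, data_len, dimension))
for name, shape in zip(internals.list_outputs(), out_shapes):
    if shape and 0 in shape:
        print('output with unknown dim:', name, shape)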

@ciyongch
Contributor

@Amagong Excluding the layers with a 0-dimension input will resolve this error. In your sample, the input to h2h at the first timestep (t0) of every layer contains a 0 in its shape, so just exclude these layers as below:
For Fused-rnn block:

 excluded_sym_names += ['conv0']
+excluded_sym_names += [
+  'gru_l0l0_t0_h2h',
+  'gru_l0l1_t0_h2h',
+  ]

For lstm block:

 excluded_sym_names += ['conv0']
+excluded_sym_names += [
+  'gru_l0l0_t0_h2h',
+  ]

For the gru block, I noticed that there's an extra '_' in the gru naming:

 excluded_sym_names += ['conv0']
+excluded_sym_names += [
+  'gru_l0l0_t0__h2h',
+  ]

Besides that, please change simple_bind() to bind(), since the quantized symbol requires quantized params (int8), while simple_bind() would allocate default params, which are in fp32.

-qnet.simple_bind(ctx=mx.cpu(0), data=(75, data_len, dimension))
+mod = mx.mod.Module(symbol=qnet, context=mx.cpu(0), label_names=None)
+mod.bind(data_shapes=[('data', (75, data_len, dimension))], grad_req='null')
+mod.set_params(qarg_params, qaux_params)

Hope this helps you enable your case :)
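Put together, the tail of the reproducer would look roughly like this (a sketch based on the diffs above, not verified beyond this fused-rnn sample):

excluded_sym_names = ['conv0', 'gru_l0l0_t0_h2h', 'gru_l0l1_t0_h2h']

qnet, qarg_params, qaux_params = quantize_model(
    sym=net, arg_params=arg_params, aux_params=aux_params,
    excluded_sym_names=excluded_sym_names, ctx=mx.cpu(0),
    calib_mode='none', quantized_dtype='uint8')

qnet = qnet.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qnet = qnet.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

# bind() with the quantized params instead of simple_bind(), which would
# allocate fresh fp32 parameters.
mod = mx.mod.Module(symbol=qnet, context=mx.cpu(0), label_names=None)
mod.bind(data_shapes=[('data', (75, data_len, dimension))],
         for_training=False, grad_req='null')
mod.set_params(qarg_params, qaux_params)

# Run one forward pass on random input to sanity-check the bound module.
batch = mx.io.DataBatch(data=[mx.nd.random.uniform(shape=(75, data_len, dimension))])
mod.forward(batch, is_train=False)
print(mod.get_outputs()[0].shape)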

@anirudh2290
Member

Thanks @ciyongch. Can you please let me know why quantized_fully_connected doesn't handle inferring a 0 data dimension from the output shape? For example, the following runs fine in fp32:

import mxnet as mx
qdtype="float32"
num_hidden=100
no_bias=False
flatten=True
x = mx.sym.var("x", dtype=qdtype)
qdata = mx.sym.Variable(name='qdata')#, shape=data_shape, dtype=qdtype)
qbias = mx.sym.Variable(name='qbias')#, shape=(10, 100), dtype=qdtype)
y = mx.sym.exp(x)
fc_fp32 = mx.sym.FullyConnected(data=qdata, num_hidden=num_hidden, no_bias=no_bias, flatten=flatten)
sum_first = mx.sym.elemwise_add(y, fc_fp32)
sum_first_1 = mx.sym.Group([sum_first, x, y])
ex = sum_first_1.simple_bind(mx.cpu(), qdata=(0, 1024), fullyconnected0_weight=(100, 1024), fullyconnected0_bias=(100,), x=(10, 100))
print(ex.arg_dict["qdata"].shape)

The expectation is that it should also run fine after quantization, but it fails at this check.
Is there any reason why we can't remove the check here:
https://github.com/apache/incubator-mxnet/blob/master/src/operator/quantization/quantized_fully_connected.cc#L50
and add inference from output to input like in the non-quantized FullyConnected here: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/fully_connected.cc#L78

@ciyongch
Contributor

@anirudh2290 The behavior hasn't changed since the initial version; it looks like it will throw many errors in the RNN domain. We'll figure out the reason and see how to improve this :)

@Amagong

Amagong commented Mar 25, 2019

Thanks @ciyongch. I followed your guide and no more errors occur.
But there are still some problems in successfully applying quantization to my code.
I'll try various ways to apply it to my code.
Thank you.

@ciyongch
Contributor

@Amagong Glad to hear you're able to run quantization on the sample code. Please let us know if you meet other errors/failures in your real case. We're working on an enhancement for this limitation.

@Amagong

Amagong commented Mar 25, 2019

@ciyongch In my case, inference is slow when using quantization.
Originally it took 2 minutes 40 seconds; after quantization it takes 24 minutes.

I generate a network like the sample code above and use the quantize_model function.

# generate symbol
net = gen_sym(data_len)

net = net.get_backend_symbol('MKLDNN')
net = net.get_backend_symbol('MKLDNN_FC')

excluded_sym_names = []

excluded_sym_names += ['conv0']
excluded_sym_names += ['gru_l0l0_t0_h2h']
excluded_sym_names += ['gru_l0l1_t0_h2h']

save_dict = mx.nd.load('original_model.params')

arg_params = {}
aux_params = {}

for k, v in save_dict.items():
	tp, name = k.split(':', 1)
	if tp == 'arg':
		arg_params[name] = v
	if tp == 'aux':
		aux_params[name] = v

qnet, qarg_params, qaux_params = quantize_model(sym=net, arg_params=arg_params, aux_params=aux_params,
 excluded_sym_names=excluded_sym_names, ctx=mx.cpu(0), calib_mode='none', quantized_dtype='uint8')

qnet = qnet.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qnet = qnet.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

return qnet

And I set the parameters as below:

_, arg_params, aux_params = mx.model.load_checkpoint('quantized model path', model_epoch_num)
model.set_params(arg_params, aux_params)

I use this structure because the input data length is variable.

When I run the inference code above, it runs without any problem, but it is too slow.
I'm looking for the problem in my code.
Can I get some advice?

@Amagong

Amagong commented Mar 25, 2019

I'm using 'FusedRNNCell'

@ZhennanQin
Contributor

@Amagong The main reason is that you're using the quantized model without calibration information.
This results in online calibration, which slows down performance dramatically.
To get the full speed of the quantized model, we suggest adopting one of the calib_mode options (naive or entropy).
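For example, a naive-calibration run could look roughly like this (a sketch: the random NDArrayIter is only a stand-in for a real calibration dataset, and net, arg_params, aux_params, excluded_sym_names, batch_size, data_len and dimension follow the earlier snippets):

# Calibration data: replace the random placeholder with representative samples.
calib_data = mx.io.NDArrayIter(
    data=mx.nd.random.uniform(shape=(batch_size * 10, data_len, dimension)),
    batch_size=batch_size)

qnet, qarg_params, qaux_params = quantize_model(
    sym=net, arg_params=arg_params, aux_params=aux_params,
    excluded_sym_names=excluded_sym_names, ctx=mx.cpu(0),
    calib_mode='naive', calib_data=calib_data,
    num_calib_examples=batch_size * 10, quantized_dtype='uint8')

qnet = qnet.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qnet = qnet.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')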

@Amagong

Amagong commented Mar 25, 2019

@ZhennanQin Thank you for your advice! I'll try the way you told me.

@pengzhao-intel
Contributor

@Amagong @Soonhwan-Kwon Did you get the expected results?
We'd like to hear your feedback so we can continuously improve the INT8 flow and quality :)

@pengzhao-intel
Contributor

pengzhao-intel commented May 22, 2019

PR #15031 will fix this issue

@pengzhao-intel
Contributor

Closing the issue since the PR is merged. Feel free to reopen if you see the issue again.
