Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MXNet model-r50-am-lfw] MXNet to TensorFlow: a variable (named id) generated from no where #135

Closed
tengerye opened this issue Apr 3, 2018 · 7 comments
Assignees
Labels

Comments

@tengerye
Copy link
Contributor

tengerye commented Apr 3, 2018

Platform: ubuntu 16.04

Python version: Python 3.6.4 :: Anaconda

Source framework with version: MXNet 1.1.0 with GPU

Destination framework with version: Tensorflow 1.1.0 with GPU

Pre-trained model path (webpath or webdisk path): https://drive.google.com/open?id=1x0-EiYX9jMUKiq-n1Bd9OCK4fVB3a54v

Running scripts:

  1. python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d resnet50 --inputShape 3 112 112

  2. python -m mmdnn.conversion._script.IRToCode -f tensorflow --IRModelPath resnet50.pb --IRWeightPath resnet50.npy --dstModelPath tf_resnet50.py

  3. python -m mmdnn.conversion.examples.tensorflow.imagenet_test -s tensorflow -p resnet -n tf_resnet50 -w resnet50.npy

@tengerye
Copy link
Contributor Author

tengerye commented Apr 3, 2018

With the first script, I got:

[15:56:05] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.12.1. Attempting to upgrade...
[15:56:05] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
/home/yetengqi/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py:53: UserWarning: You created Module with Module(..., label_names=['softmax_label']) but input with name 'softmax_label' is not found in symbol.list_arguments(). Did you mean one of:
	data
  warnings.warn(msg)
Warning: MXNet Parser has not supported operator null with name data.
Warning: convert the null operator with name [data] into input layer.
IR network structure is saved as [resnet50.json].
IR network structure is saved as [resnet50.pb].
IR weights are saved as [resnet50.npy].

With the second script, I got:

Parse file [resnet50.pb] with binary format successfully.
TensorflowEmitter has not supported operator [_copy].
id
Target network code snippet is saved as [tf_resnet50.py].

With the third script, I got:

Traceback (most recent call last):
  File "/home/yetengqi/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/yetengqi/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/yetengqi/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 71, in <module>
    tester = TestTF()
  File "/home/yetengqi/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 20, in __init__
    self.input, self.model = self.MainModel.KitModel(self.args.w)
  File "/media/yetengqi/9e162b12-a483-4c2e-9967-0c36a024d616/home/tengqi/projects/faceRecognition/baseline/insightface/model-r50-am-lfw/tf_resnet50.py", line 26, in KitModel
    _minusscalar0_second = tf.constant(__weights_dict['_minusscalar0_second']['value'], name='_minusscalar0_second')
  File "/home/yetengqi/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 106, in constant
    attrs={"value": tensor_value, "dtype": dtype_value}, name=name).outputs[0]
  File "/home/yetengqi/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/yetengqi/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1184, in __init__
    raise ValueError("'%s' is not a valid node name" % node_def.name)
ValueError: '_minusscalar0_second' is not a valid node name

Although I can try to correct that error, you will see another error which is caused by a variable on the line 28 named id.

@kitstar
Copy link
Contributor

kitstar commented Apr 3, 2018

Hi @tengerye,
it is because the pre-process of mxnet is not supported in Tensorflow. We will try to fix it. Thanks.

@kitstar
Copy link
Contributor

kitstar commented Apr 6, 2018

Ref issue #85. The error is fixed. But seems some accuracy problem. Ping you if there is any progress.

@kitstar kitstar added the bug label Apr 6, 2018
@kitstar kitstar changed the title MXNet to TensorFlow: a variable (named id) generated from no where [MXNet model-r50-am-lfw] MXNet to TensorFlow: a variable (named id) generated from no where Apr 6, 2018
@tengerye
Copy link
Contributor Author

tengerye commented Apr 8, 2018

Well, after updating from the source. I encounter a new error after executing python -m mmdnn.convers ion.examples.tensorflow.imagenet_test -s tensorflow -p resnet -n tf_resnet50 -w resnet50.npy and changing the variable name:

lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 71, in <module>
    tester = TestTF()
  File "lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 20, in __init__
    self.input, self.model = self.MainModel.KitModel(self.args.w)
  File "insightface/model-r50-am-lfw/tf_resnet50.py", line 259, in KitModel
    pre_fc1         = tf.layers.dense(bn1, 512, kernel_initializer = tf.constant_initializer(__weights_dict['pre_fc1']['weights']), bias_initializer = tf.constant_initializer(__weights_dict['pre_fc1']['bias']), use_bias = True)
  File "lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 248, in dense
    return layer.apply(inputs)
  File "lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 809, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 680, in __call__
    self.build(input_shapes)
  File "lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 134, in build
    trainable=True)
  File "lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 533, in add_variable
    partitioner=partitioner)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1297, in get_variable
    constraint=constraint)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1093, in get_variable
    constraint=constraint)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 439, in get_variable
    constraint=constraint)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 408, in _true_getter
    use_resource=use_resource, constraint=constraint)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 800, in _get_single_variable
    use_resource=use_resource)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2157, in variable
    use_resource=use_resource)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2147, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2130, in default_variable_creator
    constraint=constraint)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 233, in __init__
    constraint=constraint)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 327, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 784, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
  File "lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py", line 217, in __call__
    self.value, dtype=dtype, shape=shape, verify_shape=verify_shape)
  File "lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 488, in make_tensor_proto
    (shape_size, nparray.size))
ValueError: Too many elements provided. Needed at most 262144, but received 12845056

May I ask if there still exists error?

Besides, the original error "ValueError: '_minusscalar0_second' is not a valid node name" still exists. Thank you for your kind reply.

@tengerye
Copy link
Contributor Author

tengerye commented May 9, 2018

The problem is solved according to #85 .
The following are steps to repeat in case you are still confused.

  1. Make sure your codes are up-to-date:
    pip install -U git+https://github.com/Microsoft/MMdnn.git@master

  2. Convert the model to intermediate representation format:
    python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d resnet50 --inputShape 3 112 112

  3. Convert to tensorflow code:
    python -m mmdnn.conversion._script.IRToCode -f tensorflow --IRModelPath resnet50.pb --IRWeightPath resnet50.npy --dstModelPath tf_resnet50.py

  4. Save the following codes to a file tensorflow_inference.py in your current directory:

# inference with tensorflow

from __future__ import absolute_import
import argparse
import numpy as np
from six import text_type as _text_type
from tensorflow.contrib.keras.api.keras.preprocessing import image
import tensorflow as tf


parser = argparse.ArgumentParser()

parser.add_argument('-n', type=_text_type, default='kit_imagenet',
                    help='Network structure file name.')


parser.add_argument('-w', type=_text_type, required=True,
                    help='Network weights file name')

parser.add_argument('--image', '-i',
                    type=_text_type, help='Test image path.',
                    default="mmdnn/conversion/examples/data/seagull.jpg"
)


args = parser.parse_args()
if args.n.endswith('.py'):
    args.n = args.n[:-3]

# import converted model
model_converted = __import__(args.n).KitModel(args.w)
input_tf, model_tf = model_converted

# load img with BGRTranspose=True
img = image.load_img(args.image, target_size = (112, 112))
img = image.img_to_array(img)
img = img[..., ::-1]

input_data = np.expand_dims(img, 0)

# inference with tensorflow
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    predict = sess.run(model_tf, feed_dict = {input_tf : input_data})
print(predict)
  1. Run the following shell scripts:
    python tensorflow_inference.py -n tf_resnet50.py -w resnet50.npy -i ~/Downloads/download.jpg
    Note the last string (~/Downloads/download.jpg) indicates the image to predict.

@tengerye tengerye closed this as completed May 9, 2018
@ShoufaChen
Copy link

The second step now should be:

--inputShape 3,112,112

not

 --inputShape 3 112 112

as pointed by here

@Akhp888
Copy link

Akhp888 commented Jun 8, 2020

I get "NotImplementedError" when i try aws sagemaker mxnet params

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants