
block.export bug #17981

Open
chinakook opened this issue Apr 6, 2020 · 5 comments

Comments

@chinakook
Contributor

chinakook commented Apr 6, 2020

Description

net.hybridize may optimize out some ops. These ops are still alive in the nn.Block (also nn.HybridBlock), but their names are not contained in the symbol's arg_names list. export should therefore ignore the parameters of these ops, except those whose names end with 'running_mean' or 'running_var'.
To fix this, please refer to #17970
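
For illustration, here is a minimal sketch of the skip-unused-parameters idea described above, modeled on the existing loop in gluon/block.py's export. This is not the actual #17970 diff; the names sym, arg_dict, and param._reduce() follow the current export code.

# Sketch only: skip parameters that hybridize optimized out of the traced
# graph instead of asserting that every remaining parameter is an aux state.
arg_names = set(sym.list_arguments())
aux_names = set(sym.list_auxiliary_states())
arg_dict = {}
for name, param in self.collect_params().items():
    if name in arg_names:
        arg_dict['arg:%s' % name] = param._reduce()
    elif name in aux_names:
        arg_dict['aux:%s' % name] = param._reduce()
    elif name.endswith('running_mean') or name.endswith('running_var'):
        # BatchNorm running statistics are still saved as auxiliary states
        arg_dict['aux:%s' % name] = param._reduce()
    # any other parameter was optimized out by hybridize and is skipped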

Error Message

/home/xxxxx/dev/mx/python/mxnet/gluon/block.py:698: UserWarning: Parameter conv3_weight, conv3_bias is not used by any computation. Is this intended?
  out = self.forward(*args)
Traceback (most recent call last):
  File "/home/xxxxx/dev/U-Net/linux_scripts/little_test.py", line 39, in <module>
    net.export('bar')
  File "/home/xxxxx/dev/mx/python/mxnet/gluon/block.py", line 1274, in export
    assert name in aux_names
AssertionError 

To Reproduce

import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn

class Foo(nn.HybridBlock):
    def __init__(self):
        super(Foo, self).__init__()
        self.conv0 = nn.Conv2D(4, 1)
        self.conv1 = nn.Conv2D(6, 1)
    
    def hybrid_forward(self, F, x):
        x = self.conv0(x)
        y = self.conv1(x)
        return tuple([x,y])

foo = Foo()
foo.collect_params().initialize()

x = mx.nd.random.uniform(shape=(1,3,64,64))
y = foo(x)
foo.save_parameters('foo.params')

class Bar(nn.HybridBlock):
    def __init__(self):
        super(Bar, self).__init__()
        self.foo = Foo()
        self.foo.load_parameters('foo.params')
    
    def hybrid_forward(self, F, x):
        return self.foo(x)[0]

net = Bar()
net.hybridize()
x = mx.nd.random.uniform(shape=(1,3,64,64))
y = net(x)
net.export('bar')

@chinakook chinakook added the Bug label Apr 6, 2020
@sxjscience
Member

I can reproduce the error.

@sxjscience
Member

@chinakook Thanks for reporting this. I just simplified the example as follows. The root cause is that conv1 is not used in the computation of Bar but still appears in .collect_params(). So the symbolic saving breaks here:
https://github.com/apache/incubator-mxnet/blob/e3493e7b47ddcaa6974280ee432c82eb89d0f756/python/mxnet/gluon/block.py#L1281-L1287

import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
mx.npx.set_np()

class Foo(nn.HybridBlock):
    def __init__(self):
        super(Foo, self).__init__()
        self.conv0 = nn.Conv2D(4, 1, in_channels=3)
        self.conv1 = nn.Conv2D(6, 1, in_channels=4)
    
    def hybrid_forward(self, F, x):
        x = self.conv0(x)
        y = self.conv1(x)
        return tuple([x,y])

class Bar(nn.HybridBlock):
    def __init__(self):
        super(Bar, self).__init__()
        with self.name_scope():
            self.foo = Foo()
    
    def hybrid_forward(self, F, x):
        return self.foo(x)[0]

net = Bar()
net.hybridize()
net.initialize()
x = mx.np.random.uniform(0, 1, (1,3,64,64))
y = net(x)
print(y)
net.export('bar')

Error:

~/.local/lib/python3.6/site-packages/mxnet/gluon/block.py in export(self, path, epoch, remove_amp_cast)
   1284                 arg_dict['arg:%s'%name] = param._reduce()
   1285             else:
-> 1286                 assert name in aux_names
   1287                 arg_dict['aux:%s'%name] = param._reduce()
   1288         save_fn = _mx_npx.save if is_np_array() else ndarray.save

AssertionError: 
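
To confirm the diagnosis against the simplified example above, one can compare the parameters Gluon tracks with the arguments the traced symbol actually uses. This is only a sketch: it relies on the internal _cached_graph attribute, which is populated after the first hybridized forward call.

# run after y = net(x); _cached_graph is an internal (inputs, output) tuple
_, sym = net._cached_graph
used = set(sym.list_arguments()) | set(sym.list_auxiliary_states())
tracked = set(net.collect_params().keys())
print(tracked - used)   # expected: the weight and bias of the unused conv1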

@sxjscience
Copy link
Member

@leezu @yzhliu I'm tagging it as gluon, numpy and 2.0 because it's a bug in Gluon that also appears in the numpy interface and should be solved in 2.0.

@chinakook
Copy link
Contributor Author

@sxjscience I made a fix in #17970. It may be ugly, but it works fine for me. It would be appreciated if you have a better solution.

@yzhliu yzhliu added the WIP label Jun 11, 2020
@leezu leezu mentioned this issue Aug 14, 2020
@leezu
Contributor

leezu commented Aug 31, 2020
