Update tutorials (apache#18609)

Update docs according to new Block APIs (apache#18413)

acphile authored and chinakook committed Nov 19, 2020
1 parent 3ead83b commit 4d614bd
Showing 23 changed files with 194 additions and 287 deletions.
7 changes: 3 additions & 4 deletions docs/python_docs/python/api/gluon/index.rst
@@ -33,10 +33,9 @@ one input layer, one hidden layer, and one output layer.
# When instantiated, Sequential stores a chain of neural network layers.
# Once presented with data, Sequential executes each layer in turn, using
# the output of one layer as the input for the next
with net.name_scope():
net.add(gluon.nn.Dense(256, activation="relu")) # 1st layer (256 nodes)
net.add(gluon.nn.Dense(256, activation="relu")) # 2nd hidden layer
net.add(gluon.nn.Dense(num_outputs))
net.add(gluon.nn.Dense(256, activation="relu")) # 1st layer (256 nodes)
net.add(gluon.nn.Dense(256, activation="relu")) # 2nd hidden layer
net.add(gluon.nn.Dense(num_outputs))
.. automodule:: mxnet.gluon
28 changes: 12 additions & 16 deletions docs/python_docs/python/tutorials/extend/custom_layer.md
@@ -111,10 +111,9 @@ Below is an example of how to create a simple neural network with a custom layer

```python
net = gluon.nn.HybridSequential() # Define a Neural Network as a sequence of hybrid blocks
with net.name_scope(): # Used to disambiguate saving and loading net parameters
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer()) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer()) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron


net.initialize(mx.init.Xavier(magnitude=2.24)) # Initialize parameters of all layers
@@ -148,12 +147,11 @@ class NormalizationHybridLayer(gluon.HybridBlock):
def __init__(self, hidden_units, scales):
super(NormalizationHybridLayer, self).__init__()

with self.name_scope():
self.weights = self.params.get('weights',
shape=(hidden_units, 0),
allow_deferred_init=True)
self.weights = gluon.Parameter('weights',
shape=(hidden_units, 0),
allow_deferred_init=True)

self.scales = self.params.get('scales',
self.scales = gluon.Parameter('scales',
shape=scales.shape,
init=mx.init.Constant(scales.asnumpy().tolist()), # Convert to regular list to make this object serializable
differentiable=False)
@@ -170,14 +168,13 @@ In the example above 2 sets of parameters are defined:
1. Parameter `scale` is a constant that doesn't change. Its shape is defined during construction.

Notice a few aspects of this code:
* `name_scope()` method is used to add a prefix to parameter names during saving and loading
* Shape is not provided when creating `weights`. Instead, it is going to be inferred from the shape of the input
* `Scales` parameter is initialized and marked as `differentiable=False`.
* `F` backend is used for all calculations
* The dot product is calculated with the `F.FullyConnected()` method instead of `F.dot()`. The former was chosen because it supports automatic shape inference for its inputs while the latter doesn't, which matters if one doesn't want to hard-code all the shapes. At the moment, the best way to learn which operators support automatic inference of input shapes is to browse the C++ implementation of an operator and check whether it calls `SHAPE_ASSIGN_CHECK(*in_shape, fullc::kWeight, Shape2(param.num_hidden, num_input));`
* `hybrid_forward()` method signature has changed. It accepts two new arguments: `weights` and `scales`.

The last peculiarity comes from `HybridBlock`'s support for both imperative and symbolic programming. During the training phase, parameters are passed to the layer by the Apache MXNet framework as additional arguments to the method, because they might need to be converted to a `Symbol` depending on whether the layer was hybridized. One shouldn't use `self.weights` and `self.scales` or `self.params.get` in `hybrid_forward` except to get the shapes of parameters.
The last peculiarity comes from `HybridBlock`'s support for both imperative and symbolic programming. During the training phase, parameters are passed to the layer by the Apache MXNet framework as additional arguments to the method, because they might need to be converted to a `Symbol` depending on whether the layer was hybridized. One shouldn't use `self.weights` and `self.scales` in `hybrid_forward` except to get the shapes of parameters.
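
To make the mechanics concrete, here is a minimal sketch (not part of the tutorial; `TinyDense` and its names are illustrative only) that mirrors the pattern above: the parameter is declared in `__init__` and arrives in `hybrid_forward` as an extra argument.

```python
import mxnet as mx
from mxnet import gluon


class TinyDense(gluon.HybridBlock):
    def __init__(self, hidden_units):
        super(TinyDense, self).__init__()
        self._hidden_units = hidden_units
        # input dimension is left as 0 and inferred on the first forward pass
        self.weights = gluon.Parameter('weights',
                                       shape=(hidden_units, 0),
                                       allow_deferred_init=True)

    def hybrid_forward(self, F, x, weights):
        # `weights` is injected by the framework: an NDArray when running
        # imperatively, a Symbol after hybridize(); don't call self.weights.data() here
        return F.FullyConnected(x, weights, num_hidden=self._hidden_units, no_bias=True)
```

With this sketch, `net = TinyDense(4); net.initialize(); net(mx.nd.ones((2, 8)))` would shape the weight as `(4, 8)` on the first pass.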

Running a forward pass on this network is very similar to the previous example, so instead of doing just one forward pass, let's run the whole training for a few epochs to show that the `scales` parameter doesn't change during training while the `weights` parameter does.

@@ -194,11 +191,10 @@ def print_params(title, net):
print('{} = {}\n'.format(key, value.data()))

net = gluon.nn.HybridSequential() # Define a Neural Network as a sequence of hybrid blocks
with net.name_scope(): # Used to disambiguate saving and loading net parameters
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer(hidden_units=5,
scales = nd.array([2]))) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer(hidden_units=5,
scales = nd.array([2]))) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron


net.initialize(mx.init.Xavier(magnitude=2.24)) # Initialize parameters of all layers
2 changes: 1 addition & 1 deletion docs/python_docs/python/tutorials/extend/customop.md
@@ -197,7 +197,7 @@ class DenseBlock(mx.gluon.Block):
def __init__(self, in_channels, channels, bias, **kwargs):
super(DenseBlock, self).__init__(**kwargs)
self._bias = bias
self.weight = self.params.get('weight', shape=(channels, in_channels))
self.weight = gluon.Parameter('weight', shape=(channels, in_channels))

def forward(self, x):
ctx = x.context
@@ -82,7 +82,7 @@ net.add(nn.Conv2D(channels=6, kernel_size=5, activation='relu'),
nn.Dense(10))
```

And then load the saved parameters into GPU 0 directly, or use `net.collect_params().reset_ctx` to change the device.
And then load the saved parameters into GPU 0 directly, or use `net.reset_ctx` to change the device.
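
If one takes the `reset_ctx` route instead, a minimal sketch (an assumption, not part of this diff) could look like this: load on the default CPU context first, then move all parameters over.

```python
# Load on the default (CPU) context, then move every parameter to GPU 0.
net.load_parameters('net.params')
net.reset_ctx(gpu(0))
```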

```{.python .input n=20}
net.load_parameters('net.params', ctx=gpu(0))
@@ -120,7 +120,7 @@ The training loop is quite similar to what we introduced before. The major diffe
# Diff 1: Use two GPUs for training.
devices = [gpu(0), gpu(1)]
# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.collect_params().initialize(force_reinit=True, ctx=devices)
net.initialize(force_reinit=True, ctx=devices)
# Loss and trainer are the same as before
softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
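# Sketch of the loop body (an assumption based on the surrounding tutorial,
# not part of this diff): each batch is split across the two devices, e.g.
#   X_parts = gluon.utils.split_and_load(X, devices)
#   y_parts = gluon.utils.split_and_load(y, devices)
# so that the forward/backward pass runs on every GPU in parallel.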
@@ -170,8 +170,7 @@ Before we go to training, one unique Gluon feature you should be aware of is hyb
finetune_net = resnet50_v2(pretrained=True, ctx=ctx)

# change the last softmax layer since the number of classes is different
with finetune_net.name_scope():
finetune_net.output = nn.Dense(classes)
finetune_net.output = nn.Dense(classes)
finetune_net.output.initialize(init.Xavier(), ctx=ctx)
# hybridize for better performance
finetune_net.hybridize()
@@ -80,11 +80,10 @@ Below, we define a model which has an input layer of 10 neurons, a couple of inn
```python
net = nn.HybridSequential()

with net.name_scope():
net.add(nn.Dense(units=10, activation='relu')) # input layer
net.add(nn.Dense(units=10, activation='relu')) # inner layer 1
net.add(nn.Dense(units=10, activation='relu')) # inner layer 2
net.add(nn.Dense(units=1)) # output layer: notice, it must have only 1 neuron
net.add(nn.Dense(units=10, activation='relu')) # input layer
net.add(nn.Dense(units=10, activation='relu')) # inner layer 1
net.add(nn.Dense(units=10, activation='relu')) # inner layer 2
net.add(nn.Dense(units=1)) # output layer: notice, it must have only 1 neuron

net.initialize(mx.init.Xavier())
```
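
As a quick sanity check (an illustration, not part of the tutorial), the model above can be hybridized and run on a dummy batch; the input width is inferred on the first call:

```python
net.hybridize()                              # compile the graph for speed
x = mx.nd.random.uniform(shape=(4, 10))      # a dummy batch of 4 samples
print(net(x).shape)                          # (4, 1): one output per sample
```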
@@ -342,13 +342,12 @@ Apache MXNet uses lazy evaluation to achieve superior performance. The Python th

## PyTorch module and Gluon blocks

### For new block definition, gluon needs name_scope
### For new block definition, gluon is similar to PyTorch

`name_scope` coerces Gluon to give each parameter an appropriate name, indicating which model it belongs to.

| Function | PyTorch | MXNet Gluon |
|------------------------|-----------------------------------|----------------------------------------------------------------------------|
| New block definition | `class Net(torch.nn.Module):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def __init__(self, D_in, D_out):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`super(Net, self).__init__()`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`self.linear = torch.nn.Linear(D_in, D_out)`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def forward(self, x):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`return self.linear(x)` | `class Net(mx.gluon.Block):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def __init__(self, D_in, D_out):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`super(Net, self).__init__()`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`with self.name_scope():`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`self.dense=mx.gluon.nn.Dense(D_out, in_units=D_in)`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def forward(self, x):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`return self.dense(x)` |
| New block definition | `class Net(torch.nn.Module):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def __init__(self, D_in, D_out):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`super(Net, self).__init__()`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`self.linear = torch.nn.Linear(D_in, D_out)`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def forward(self, x):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`return self.linear(x)` | `class Net(mx.gluon.Block):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def __init__(self, D_in, D_out):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`super(Net, self).__init__()`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`self.dense=mx.gluon.nn.Dense(D_out, in_units=D_in)`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`def forward(self, x):`<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`return self.dense(x)` |
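
Written out in full, the Gluon column of the table corresponds to something like the following sketch (illustrative, using the new Block API adopted throughout this commit):

```python
import mxnet as mx


class Net(mx.gluon.Block):
    def __init__(self, D_in, D_out):
        super(Net, self).__init__()
        # no name_scope() is needed any more: assigning the child block to an
        # attribute registers it (and its parameters) automatically
        self.dense = mx.gluon.nn.Dense(D_out, in_units=D_in)

    def forward(self, x):
        return self.dense(x)
```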

### Parameter and Initializer

@@ -374,15 +373,15 @@ Instead of explicitly declaring the number of inputs to a layer, we can simply s

| Function | PyTorch | MXNet Gluon |
|------------------------|-----------------------------------|----------------------------------------------------------------------------|
| partial-shape <br/> hybridized | Not Available | `net = mx.gluon.nn.HybridSequential()`<br/>`with net.name_scope():`<br/>&nbsp;&nbsp;&nbsp;&nbsp;`net.add(mx.gluon.nn.Dense(10))`<br/>`net.hybridize()` |
| partial-shape <br/> hybridized | Not Available | `net = mx.gluon.nn.HybridSequential()`<br/>`net.add(mx.gluon.nn.Dense(10))`<br/>`net.hybridize()` |
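
A short sketch (illustrative) of the partial-shape behaviour the table refers to: the input width is unknown until the first batch arrives, and the weights are shaped then.

```python
import mxnet as mx

net = mx.gluon.nn.HybridSequential()
net.add(mx.gluon.nn.Dense(10))        # in_units not given: shape is deferred
net.hybridize()
net.initialize()

x = mx.nd.ones((2, 20))               # the input width (20) is only known here
print(net(x).shape)                   # (2, 10); weights get shaped on first call
```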

### SymbolBlock

SymbolBlock can construct a block from a symbol. This is useful for using pre-trained models as feature extractors.

| Function | PyTorch | MXNet Gluon |
|------------------------|-----------------------------------|----------------------------------------------------------------------------|
| SymbolBlock | Not Available | `alexnet = mx.gluon.model_zoo.vision.alexnet(pretrained=True, prefix='model_')`<br/>`out = alexnet(inputs)`<br/>`internals = out.get_internals()`<br/>`outputs = [internals['model_dense0_relu_fwd_output']]`<br/>`feat_model = gluon.SymbolBlock(outputs, inputs, params=alexnet.collect_params())` |
| SymbolBlock | Not Available | `alexnet = mx.gluon.model_zoo.vision.alexnet(pretrained=True)`<br/>`out = alexnet(inputs)`<br/>`internals = out.get_internals()`<br/>`outputs = [internals['model_dense0_relu_fwd_output']]`<br/>`feat_model = gluon.SymbolBlock(outputs, inputs, params=alexnet.collect_params())` |
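
Expanded into a runnable sketch (the intermediate output name below is illustrative only; actual internal names depend on the model and version, so list them first):

```python
import mxnet as mx
from mxnet import gluon

alexnet = mx.gluon.model_zoo.vision.alexnet(pretrained=True)
inputs = mx.sym.var('data')
out = alexnet(inputs)
internals = out.get_internals()
print(internals.list_outputs())        # inspect the available internal outputs

# pick an intermediate output by name (the exact name varies between versions)
outputs = [internals['alexnet0_dense0_relu_fwd_output']]
feat_model = gluon.SymbolBlock(outputs, inputs, params=alexnet.collect_params())
features = feat_model(mx.nd.ones((1, 3, 224, 224)))
```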

## PyTorch optimizer vs Gluon Trainer
### For Gluon API calling zero_grad is not necessary most of the time
@@ -91,8 +91,8 @@ class MyDense(nn.Block):
# in_units: the number of inputs in this layer
super(MyDense, self).__init__(**kwargs)
self.weight = self.params.get('weight', shape=(in_units, units))
self.bias = self.params.get('bias', shape=(units,))
self.weight = gluon.Parameter('weight', shape=(in_units, units))
self.bias = gluon.Parameter('bias', shape=(units,))
def forward(self, x):
linear = nd.dot(x, self.weight.data()) + self.bias.data()
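# Usage sketch (illustrative, not part of this diff): parameters are created by
# assigning gluon.Parameter objects to attributes, then initialized as usual:
#   layer = MyDense(units=3, in_units=5)
#   layer.initialize()
#   layer(nd.random.uniform(shape=(2, 5)))   # -> output of shape (2, 3)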
@@ -102,10 +102,9 @@ Below is an example of how to create a simple neural network with a custom layer

```python
net = gluon.nn.HybridSequential() # Define a Neural Network as a sequence of hybrid blocks
with net.name_scope(): # Used to disambiguate saving and loading net parameters
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer()) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer()) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron


net.initialize(mx.init.Xavier(magnitude=2.24)) # Initialize parameters of all layers
@@ -134,12 +133,11 @@ class NormalizationHybridLayer(gluon.HybridBlock):
def __init__(self, hidden_units, scales):
super(NormalizationHybridLayer, self).__init__()

with self.name_scope():
self.weights = self.params.get('weights',
shape=(hidden_units, 0),
allow_deferred_init=True)
self.weights = gluon.Parameter('weights',
shape=(hidden_units, 0),
allow_deferred_init=True)

self.scales = self.params.get('scales',
self.scales = gluon.Parameter('scales',
shape=scales.shape,
init=mx.init.Constant(scales.asnumpy()),
differentiable=False)
@@ -157,14 +155,13 @@ In the example above 2 sets of parameters are defined:

Notice a few aspects of this code:

+ `name_scope()` method is used to add a prefix to parameter names during saving and loading
+ Shape is not provided when creating `weights`. Instead, it is going to be inferred from the shape of the input
+ `Scales` parameter is initialized and marked as `differentiable=False`.
+ `F` backend is used for all calculations
+ The dot product is calculated with the `F.FullyConnected()` method instead of `F.dot()`. The former was chosen because it supports automatic shape inference for its inputs while the latter doesn't, which matters if one doesn't want to hard-code all the shapes. At the moment, the best way to learn which operators support automatic inference of input shapes is to browse the C++ implementation of an operator and check whether it calls `SHAPE_ASSIGN_CHECK(*in_shape, fullc::kWeight, Shape2(param.num_hidden, num_input));`
+ `hybrid_forward()` method signature has changed. It accepts two new arguments: `weights` and `scales`.

The last peculiarity comes from `HybridBlock`'s support for both imperative and symbolic programming. During the training phase, parameters are passed to the layer by the Apache MXNet framework as additional arguments to the method, because they might need to be converted to a `Symbol` depending on whether the layer was hybridized. One shouldn't use `self.weights` and `self.scales` or `self.params.get` in `hybrid_forward` except to get the shapes of parameters.
The last peculiarity comes from `HybridBlock`'s support for both imperative and symbolic programming. During the training phase, parameters are passed to the layer by the Apache MXNet framework as additional arguments to the method, because they might need to be converted to a `Symbol` depending on whether the layer was hybridized. One shouldn't use `self.weights` and `self.scales` in `hybrid_forward` except to get the shapes of parameters.

Running a forward pass on this network is very similar to the previous example, so instead of doing just one forward pass, let's run the whole training for a few epochs to show that the `scales` parameter doesn't change during training while the `weights` parameter does.

@@ -180,11 +177,10 @@ def print_params(title, net):
print('{} = {}\n'.format(key, value.data()))

net = gluon.nn.HybridSequential() # Define a Neural Network as a sequence of hybrid blocks
with net.name_scope(): # Used to disambiguate saving and loading net parameters
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer(hidden_units=5,
scales = nd.array([2]))) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron
net.add(Dense(5)) # Add Dense layer with 5 neurons
net.add(NormalizationHybridLayer(hidden_units=5,
scales = nd.array([2]))) # Add our custom layer
net.add(Dense(1)) # Add Dense layer with 1 neuron


net.initialize(mx.init.Xavier(magnitude=2.24)) # Initialize parameters of all layers