This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

add naming tutorial #10511

Merged
merged 9 commits into from
Apr 13, 2018

Conversation

piiswrong
Contributor

Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@piiswrong piiswrong requested a review from szha as a code owner April 11, 2018 19:50
dense0_


When you create more Blocks of the same kind, they will be named differetly to avoid collision:
Member

differently

Member

Would it make sense to mention that the number appended to the name is incremented to avoid collisions?
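As a rough illustration of the incrementing suffix asked about here, a toy sketch in plain Python. It is not Gluon's actual implementation; `NameCounter` and `next_prefix` are hypothetical names used only for this example:

```python
from collections import defaultdict


class NameCounter:
    """Toy stand-in for a naming scope; hypothetical, not part of Gluon."""

    def __init__(self):
        # One counter per hint, e.g. "dense" -> how many names were handed out.
        self._counts = defaultdict(int)

    def next_prefix(self, hint):
        # Append the current count, then bump it so the next call differs.
        prefix = "{}{}_".format(hint, self._counts[hint])
        self._counts[hint] += 1
        return prefix


counter = NameCounter()
prefixes = [counter.next_prefix("dense") for _ in range(3)]
print(prefixes)  # ['dense0_', 'dense1_', 'dense2_']
```

Each hint keeps its own counter, so a first request for a different kind of block would start again at 0.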



```python
model0 = Model()
Member

Should we use names like zeroth_model and first_model to get the point across?

Contributor Author

I think model0 is fine. It matches the prefix.

```

Dense(None -> 100, linear) alexnet0_dense3_

Contributor

Can you please add <!-- INSERT SOURCE DOWNLOAD BUTTONS --> to enable the notebook download of your tutorial?


For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.

To see how to do this, we first load an pretrained alexnet.
Contributor

a pretrained*

Contributor

Maybe `alexnet` if you want to refer to it now as an object, as opposed to the model from the zoo, where it would be AlexNet.


```python
with alexnet.name_scope():
    alexnet.output = gluon.nn.Dense(100)
Contributor

Maybe it is worth noting here that this works only because the models in the model zoo inherit from HybridBlock and are built so that their final output layer is exposed through the .output property.

I can see people building their own network using HybridSequential, for example, and trying to update the output layer by calling output on their network.

Contributor

ThomasDelteil commented Apr 12, 2018

Great tutorial, it was much needed and clarifies what is going on under the hood with the naming scopes 👍

  • " the PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes) " 😄

Contributor

aaronmarkham left a comment

I have some minor changes requested and some suggestions for clarity.


To manage the names of nested Blocks, each Block has a `name_scope` attached to it. All Blocks created within a name scope will have its parent Block's prefix prepended to its name.

Let's demonstrate this by first define a simple neural net:
Contributor

defining

model0 = Model()
model0.initialize()
model0(mx.nd.zeros((1, 20)))
print(model0.prefix, model0.dense0.prefix, model0.dense1.prefix, model0.mydense.prefix)
Contributor

This print all runs together, and... maybe I'm a little dense ;) but I'm not quite catching the point. Plus I have a hard time breaking it up when the Jupyter font uses an 'l' that looks like a '1', so I see mode10, not model0, and run together it's incomprehensible.

model1_ model1_dense0_ model1_dense1_ model1_mydense_


**It is recommended that you manually specify prefix for the top level Block (i.e. `model = Model(prefix='mymodel_')`) to avoid potential confusions in naming**
Contributor

specify a prefix

Contributor

add a period at the end

Contributor

And why not show that code sample if it's so important?


The same principle also applies to container blocks like Sequantial. `name_scope` can be used inside `__init__` as well as out side of `__init__`:
Contributor

spelling... Sequential



```python
mydense = gluon.nn.Dense(100, prefix='mydense_')
Contributor

You can also choose to not use an underscore and name it what you want. Maybe explain why you want to use an underscore.
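To make the underscore question concrete: a parameter's full name is the prefix followed directly by the parameter's own name, so the trailing underscore only serves as a readable separator. A minimal sketch in plain Python, where `param_name` is a hypothetical helper that assumes plain concatenation:

```python
def param_name(prefix, name):
    # Assumption for this sketch: the full name is prefix + name, nothing more.
    return prefix + name


with_underscore = param_name("mydense_", "weight")
without_underscore = param_name("mydense", "weight")

print(with_underscore)     # mydense_weight
print(without_underscore)  # mydenseweight
```

Both prefixes work; the second one just produces names that are harder to read and to split back into block and parameter.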

)


As a result if you try to save parameters from model0 and load it with model1, you'll get an error due to unmatching names:
Contributor

result,

model0.collect_params().save('model.params')
try:
    model1.collect_params().load('model.params', mx.cpu())
except Exception, e:
Contributor

Not Python 3 compatible syntax: `,` --> `as`
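For reference, a small self-contained sketch of the `as` form catching a name-mismatch error. It uses plain Python with a hypothetical `load_params` function standing in for the real loader, and the parameter names are illustrative:

```python
def load_params(saved, expected_names):
    """Hypothetical loader: fail if any expected parameter is absent."""
    missing = [name for name in expected_names if name not in saved]
    if missing:
        raise KeyError("missing parameters: " + ", ".join(missing))
    return {name: saved[name] for name in expected_names}


# What a model with prefix "model0_" would have written to disk.
saved = {"model0_dense0_weight": [0.0]}

message = None
try:
    # A model with prefix "model1_" looks for differently named parameters.
    load_params(saved, ["model1_dense0_weight"])
except Exception as e:  # Python 3 form; the comma form is a SyntaxError there
    message = str(e)
    print(message)
```

The `as` spelling is also accepted by Python 2.6 and later, so it is safe for code that has to run on both.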


Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.

For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.
Contributor

the AlexNet model from the model zoo (add link?)

1,000


To change the output to 100 dimension, we replace it with a new block.

- Note that it's important to do this in alexnet's name_scope, otherwise you will have unmatching names when you try to save and load your model.
Contributor

I ran into similar issues and just blanked out the prefix. Here's an example that I used.
https://gist.github.com/aaronmarkham/1017664fe683596c614961112a867145#gistcomment-2336981
Can you explain a bit more? This is all really helpful, and I wonder how you'd explain what I ran into and better ways to fix it.

Like how I used ignore_extra to get around the errors but was still worried that I hadn't really loaded the model properly.
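One way to address the worry that `ignore_extra` hides a bad load is to compare parameter names before loading. A minimal sketch in plain Python; `compare_names` is a hypothetical helper, and the names below are illustrative, modeled on the prefixes quoted in this thread:

```python
def compare_names(saved_names, model_names):
    """Hypothetical helper: report names that would not line up on load."""
    saved, model = set(saved_names), set(model_names)
    missing = sorted(model - saved)  # the model expects these, the file lacks them
    extra = sorted(saved - model)    # the file has these, the model has no use for them
    return missing, extra


# Illustrative names only: a file saved from the original network versus a
# model whose replacement layer was created outside the parent's name scope.
saved_names = ["alexnet0_dense2_weight", "alexnet0_dense3_weight"]
model_names = ["alexnet0_dense2_weight", "dense0_weight"]

missing, extra = compare_names(saved_names, model_names)
print("missing:", missing)
print("extra:", extra)
```

If both lists come back empty, every parameter in the file has a matching parameter in the model; a non-empty `missing` list means some layers would keep their fresh initialization.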

Contributor Author

make a fix so that it won't be a problem anymore

name=name, type1=type(existing), type2=type(value)))

if isinstance(value, Block):
    self.register_child(value, name)
Member

Removing the block update logic can leave stale blocks in `_children`.

Contributor Author

`_children` is an OrderedDict now.

Member

Just noticed. This means it's finally possible to print the model repr in order.
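The behavior being relied on can be seen with the standard library alone. This sketch assumes children are keyed by attribute name, as the `register_child(value, name)` call in the diff suggests; the string values are placeholders for real blocks:

```python
from collections import OrderedDict

children = OrderedDict()
children["features"] = "features block"
children["output"] = "Dense(1000)"

# Registering again under the same name overwrites the old entry in place,
# so no stale block is left behind and the original position is kept.
children["output"] = "Dense(100)"

print(list(children.items()))
# [('features', 'features block'), ('output', 'Dense(100)')]
```

Because insertion order is preserved, iterating over the children also yields them in the order they were first registered.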

@piiswrong piiswrong merged commit b929892 into apache:master Apr 13, 2018
@zhanghang1989
Contributor

Still have a problem with save and load. #10544

rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
* add naming tutorial

* fix

* Update naming.md

* Update index.md

* fix save load

* fix

* fix

* fix

* fix
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
6 participants