[MXNET-1209] Tutorial transpose reshape #13208
@mxnet-label-bot add [pr-awaiting-review]
As we can see width and height changed, by rotating pixel values by 90 degrees. Transpose does the following:

<img src="https://raw.githubusercontent.com/NRauschmayr/web-data/tutorial_transpose_reshape/mxnet/doc/tutorials/basic/transpose_reshape/transpose.png" style="width:700px;height:300px;">
Can you please re-verify this image? Transposing
[[ 1. 2. 3. 4.]
 [ 5. 6. 7. 8.]
 [ 9. 10. 11. 12.]]
returns
[[ 1. 5. 9.]
 [ 2. 6. 10.]
 [ 3. 7. 11.]
 [ 4. 8. 12.]]
but your diagram does not reflect that.
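The expected result quoted above can be checked without any array library. The following is a minimal sketch in plain Python (not the tutorial's MXNet code); the variable names are chosen here for illustration only.

```python
# Transpose a 3x4 matrix with plain Python lists (no array library needed).
matrix = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]

# zip(*matrix) groups the i-th element of every row, i.e. it yields the columns.
transposed = [list(column) for column in zip(*matrix)]

for row in transposed:
    print(row)
```

The first row of the input, `1 2 3 4`, ends up as the first column of the output, which is what the diagram should depict.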
batch_size = 100
input_data = mx.random.uniform(shape=(20,100,batch_size))
reshaped = input_data.reshape(-1,batch_size)
print(input_data.shape, reshaped.shape)
Can you show the result of the print statement?
reshaped = input_data.reshape(-1,batch_size)
print(input_data.shape, reshaped.shape)
```
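For reference, the shapes that this print statement should report can be reproduced with a small sketch. It uses NumPy as a stand-in for MXNet's NDArray, on the assumption that `reshape` with `-1` behaves the same way in both for this case.

```python
import numpy as np

batch_size = 100
# Same shape as in the snippet under review; the values themselves do not matter here.
input_data = np.random.uniform(size=(20, 100, batch_size))

# -1 asks reshape to infer that dimension: 20 * 100 * 100 / 100 = 2000.
reshaped = input_data.reshape(-1, batch_size)

print(input_data.shape, reshaped.shape)  # (20, 100, 100) (2000, 100)
```

The inferred dimension is the total number of elements divided by the product of the dimensions that were given explicitly.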
The reshape function of [MXNet's NDArray API](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html?highlight=reshape#mxnet.ndarray.NDArray.reshape) allows even more advanced transformations. For instance, with -2 you copy all/remainder of the input dimensions to the output shape. With -3 reshape will use the product of two consecutive dimensions of the input shape as the output dim.
The documentation does not describe what the values -2 and -3 do. Can this tutorial describe where and how those values are used with reshape? For example, is it with video data where we have 5-D arrays, etc.?
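To make the question concrete, here is a rough sketch of how the special values could be modelled as a pure shape computation. `infer_reshape` is a hypothetical helper written for this discussion, not an MXNet function. It assumes the semantics quoted in the tutorial text (-1 infers a dimension, -2 copies the remaining input dimensions, -3 merges two consecutive input dimensions) and ignores the other special values.

```python
from math import prod


def infer_reshape(in_shape, spec):
    """Hypothetical model of the output shape for a reshape spec with -1, -2, -3."""
    out = []
    i = 0            # index of the next input dimension not yet consumed
    infer_at = None  # position in `out` of a -1 entry, if there is one
    for s in spec:
        if s == -1:
            infer_at = len(out)
            out.append(1)  # placeholder, filled in after the loop
            i += 1
        elif s == -2:
            out.extend(in_shape[i:])  # copy all remaining input dimensions
            i = len(in_shape)
        elif s == -3:
            out.append(in_shape[i] * in_shape[i + 1])  # merge two consecutive dims
            i += 2
        else:
            out.append(s)  # an explicit size
            i += 1
    if infer_at is not None:
        # The inferred size makes the total element count match the input.
        out[infer_at] = prod(in_shape) // prod(out)
    return tuple(out)


print(infer_reshape((2, 3, 4), (-3, 4)))
print(infer_reshape((2, 3, 4), (2, -2)))
print(infer_reshape((20, 100, 100), (-1, 100)))
```

Under these assumptions, `-3` is a way to fold, say, a batch axis and a time axis into one, and `-2` is a way to say "leave the rest alone", which may be the kind of use case the tutorial could spell out.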
This was just a toy example. But such transformations are for instance done in image superresolution where you increase width and height of the input image and ```x``` would be the output of a CNN that computes an upscale feature vector.

#### Check out the MXNet documentation for more details
http://mxnet.incubator.apache.org/test/api/python/ndarray.html#mxnet.ndarray.NDArray.reshape
This shows as just text; it should be a clickable link.
But a bigger point is that I don't think we need this section at all. Can these two documentation links be added as hyperlinks on the first occurrence of the terms "Reshape" and "Transpose" at the beginning of the tutorial, and this section removed?
You are saying "go to the documentation for more details", but I think this tutorial contains more details than the documentation. :)
Can you also please update your PR description with this link: http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-13208/3/tutorials/basic/reshape_transpose.html. It will help reviewers, especially when reviewing tutorials.
There are a bunch of links pointing to a personal repo instead of dmlc/web-data. Please replace those. The rest LGTM.
It would be helpful for users to include some discussion of the common errors that they see, e.g. when there's a tensor shape mismatch between layers of a NN. If we included the actual error message, people would find this tutorial when they Googled for it. This would segue nicely to the example where you show that you can't just perform these operations to make the error go away; you have to actually know what you're doing.

For users who already know what tensor reshaping is, the information they need (the "how" rather than the "why") is in a few non-obvious places. Maybe add something near the top with pointers to the ndarray docs for more comprehensive documentation: https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html

Some of the cases of users asking about mismatch errors on the discussion forum might give good background on what confuses people who are new to such operations.
I added more examples, e.g. common pitfalls and errors.
@@ -0,0 +1,190 @@

## Difference between reshape and transpose operators
Modyfing the shape of tensors is a very common operation in Deep Learning. For instance, when using pretrained neural networks it is often required to adjust input data dimensions to correspond to what the network has been trained on, e.g. tensors of shape `[batch_size, channels, width, height]`. This notebook discusses briefly the difference between the operators [Reshape](http://mxnet.incubator.apache.org/test/api/python/ndarray.html#mxnet.ndarray.NDArray.reshape) and [Transpose](http://mxnet.incubator.apache.org/test/api/python/ndarray.html#mxnet.ndarray.transpose). Both allow to change the shape, however they are not the same and are commonly mistaken.
Modyfing -> Modifying
often required to adjust input data dimension --> often necessary to adjust the input data dimension
Both allow to --> Both allow you to
* height: 200 pixels
* colors: 3 (RGB)

Now lets reshape the image in order to exchange width and height dimension.
width and height dimensions
![png](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/basic/transpose_reshape/reshaped_image.png) <!--notebook-skip-line-->

As we can see the first and second dimensions have changed. However the image can't be identified as cat anylonger. In order to understand what happened, let's have a look at the image below.
any longer
<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/basic/transpose_reshape/transpose.png" style="width:700px;height:300px;">

As shown in the diagram, the axis have been flipped: pixel values that have been in the first row are now in the first column.
the axis have --> the axes have
that have been --> that were
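The difference the tutorial is describing here can be shown on a tiny array. This is a minimal sketch that uses NumPy in place of MXNet's NDArray (an assumption made so it runs without MXNet installed); both calls produce the same shape but different contents.

```python
import numpy as np

x = np.arange(1, 13).reshape(3, 4)  # rows: [1 2 3 4], [5 6 7 8], [9 10 11 12]

reshaped = x.reshape(4, 3)   # keeps the row-major element order, only regroups it
transposed = x.transpose()   # swaps the axes: rows become columns

print(reshaped)
print(transposed)
```

After `reshape`, the first row is `1 2 3`, so the original rows are cut apart. After `transpose`, the first column is `1 2 3 4`, i.e. the original first row, which matches the sentence under review.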
## When to transpose/reshape with MXNet
In this chapter we discuss when transpose and reshape is used in MXNet.
#### Channel first for images
Images are usually stored in the format height, width, channel. When working with [convolutional](https://mxnet.incubator.apache.org/api/python/gluon/nn.html#mxnet.gluon.nn.Conv1D) layers, MXNet expects the layout to be `NCHW` (batch, channel, height, width). MXNet uses this layout because of performance reasons on the GPU. Consequently, images need to be transposed to have the right format. For instance, you may have a function like the following:
worth calling out that this channel ordering is different from TF?
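To illustrate the layout point: TensorFlow, for instance, defaults to channels-last (`NHWC`), so data prepared for it needs the same axis move. Below is a small sketch of the conversion, using NumPy rather than MXNet and an arbitrary 200 x 300 dummy image (both are assumptions for illustration).

```python
import numpy as np

# A dummy image in height, width, channel (HWC) order: 200 x 300 pixels, RGB.
image_hwc = np.zeros((200, 300, 3), dtype=np.float32)

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
image_chw = image_hwc.transpose(2, 0, 1)

# Add a leading batch axis to obtain NCHW.
batch_nchw = np.expand_dims(image_chw, axis=0)

print(batch_nchw.shape)  # (1, 3, 200, 300)
```

A `reshape` to `(1, 3, 200, 300)` would yield the same shape but would scramble the pixels, which is why the axis permutation has to be done with `transpose`.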
```
(1, 999, 128) <!--notebook-skip-line-->
#### Advanced reshaping with MXNet ndarrays
It is sometimes useful to automatically infer the shape of tensors. Especially when you deal with very deep neural networks, it may not always be clear what the shape of a tensor is after a specific layer. For instance you may want the tensor to be two-dimensional where one dimension is the known batch_size. With ```reshape(-1, batch_size)``` the first dimension will be automatically inferred. Here a simplified example:
Here a simplified --> Here is a simplified
[...]

```
This is happening when your data does not have the shape ```[batch_size, channel, width, height]```, e.g. your data may be a one-dimensional vector, or the color channel may be the last dimension instead of the second one.
This is happening --> This happens
LGTM. @NRauschmayr Could you trigger the CI again? @simoncorstonoliver @ThomasDelteil for another round of review
@@ -1,7 +1,6 @@
## Difference between reshape and transpose operators

What does it mean if MXNet gives you an error like this?
Hi @NRauschmayr, there's also another way to re-trigger the CI: pushing an empty commit.
You can try `git commit --allow-empty -m "<commit-message>"`
:)
Can we merge?
@NRauschmayr checking with @marcoabreu why the build status was not propagated back to the PR status check, which is necessary for merging
Any update on this?
Description
Adding a tutorial that explains the difference between reshape and transpose operators
@ThomasDelteil @Ishitori can you please have a look? Thanks!
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-13208/3/tutorials/basic/reshape_transpose.html
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.