[Docs] Convert Layout pass. #4664
Conversation
I gave this a read-through and think it's pretty good overall. I only have a little experience with layout passes, and I suspect I'll write a few for ops eventually, so I think I'm a reasonable representation of the target audience for this tutorial. I left a few questions in Section 3 that, if addressed, would improve its clarity.
docs/dev/convert_layout.rst
Outdated
if data_layout == 'NHWC' and kernel_layout == 'HWIO':
    # Convert (NHWC, HWIO) to (NCHW, OIHW)
    return relay.nn.conv2d(data, weight, **new_attrs)
Is the actual conversion missing? Both cases return the same thing.
I agree it looks confusing. I think it would be clearer if the if statements were removed and a comment on new_attrs['data_layout'] = desired_layout, new_attrs['kernel_layout'] = 'OIHW' was added stating that the new layout will be detected by the pass and transforms will be inserted automatically.
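A minimal sketch of the callback written that way (hedged: the registration decorator and signature follow the tutorial's conv2d example, and desired_layout is assumed to be the target layout supplied by the pass):

from tvm import relay
from tvm.relay.op import op as reg

@reg.register_convert_op_layout("nn.conv2d")
def convert_conv2d(attrs, inputs, tinfos, desired_layout):
    data, weight = inputs
    new_attrs = dict(attrs)
    # Setting the new layouts here is enough: the pass detects them
    # and inserts the required layout transforms on data and weight
    # automatically.
    new_attrs['data_layout'] = desired_layout
    new_attrs['kernel_layout'] = 'OIHW'
    return relay.nn.conv2d(data, weight, **new_attrs)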
docs/dev/convert_layout.rst
Outdated
**Automatic insertion of layout transforms** - Depending on inferred layouts, this component automatically inserts layout transforms at the input expr of the operator. This happens for *layout-agnostic* operators.
Seems like there should be a code snippet for this component.
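In case a concrete illustration helps: the operator this component inserts is relay.layout_transform. A minimal sketch (hypothetical shape; this is not the pass's internal code):

from tvm import relay

x = relay.var("x", shape=(1, 56, 56, 64))  # NHWC input
# What the pass inserts automatically when a consumer expects NCHW:
y = relay.layout_transform(x, src_layout="NHWC", dst_layout="NCHW")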
// BN has 5 inputs, 3 outputs. The last 4 inputs and last 2 outputs have "C" layout.
Layout c_layout = Layout("C");

// Return the pair {inferred input layouts}, {inferred output layouts}:
// data keeps ret; gamma/beta/mean/var and the two output moments use "C".
return Array<Array<Layout>>{{ret, c_layout, c_layout, c_layout, c_layout},
                            {ret, c_layout, c_layout}};
There should be a description of what this return value represents.
  ret = old_in_layouts[0];
}
// BN has 5 inputs, 3 outputs. The last 4 inputs and last 2 outputs have "C" layout.
Layout c_layout = Layout("C");
What is a C layout?
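As background for this question: "C" names a rank-1 layout containing only the channel axis. batch_norm's gamma, beta, moving_mean, and moving_var are 1-D per-channel tensors, so "C" is the natural layout for them. A tiny sketch, assuming the Python Layout bindings in tvm.tir:

import tvm

c_layout = tvm.tir.layout("C")  # rank-1 layout: a single channel axis
print(c_layout)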
  ret = new_in_layouts[0];
} else if (old_in_layouts.defined()) {
  ret = old_in_layouts[0];
}
If old_in_layouts isn't defined, ret is just left undefined. Will that cause an error or does it somehow work itself out?
if (new_in_layouts.defined() && old_in_layouts.defined()) {
  // Get the new C axis. Extract the dim in old layout. Find the index of that dim in next layout.
  const auto& bn_dim = old_in_layouts[0][axis];
  auto new_index = new_in_layouts[0].IndexOf(bn_dim);
This code might be clearer if the tutorial discussed a little how Layout objects work. In this line, are we searching a layout string to find which axis matches the corresponding old layout dim?
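For readers following along, a hedged sketch of the lookup being discussed, using the Python Layout bindings (assuming tvm.tir.layout as in recent TVM; the C++ Layout::IndexOf behaves the same way):

import tvm

old = tvm.tir.layout("NCHW")
new = tvm.tir.layout("NHWC")

bn_dim = old[1]                   # axis name at the old BN axis, here 'C'
new_index = new.index_of(bn_dim)  # position of 'C' in the new layout: 3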
docs/dev/convert_layout.rst
Outdated
**Layout inference** - Relay op has an attribute - *FInferCorrectLayout* - that developers can implement to handle data layouts. Currently, this attribute is only exposed in C++. This function takes original input layouts and the new input layouts (passed from the previous operator or from the python callback for layout alteration). A TVM developer can use this function to infer the final data layout and also modify the op attributes if needed.

This component is used for *lightly-layout sensitive* operators. We try to accept the new input layout, and modify the current operator attributes (like axis for concatenate, pad_width for pad) to adapt to the new data layout. By accepting the new input data layout, we prevent the insertion of a layout transform. In absence of this function, Layout rewrite might have to insert a layout transform, if the previous operator has a different output data layout than the original one. One example to adapt to NCHW data layout is presented here for Batch Norm operator.
Do heavily-sensitive operators like conv2d also require this component? If so, maybe both implementations could be shown and discussed.
More precisely, modifying attributes is used for lightly-layout sensitive operators. Layout inference is required by every operator.
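To make the lightly-layout sensitive case concrete, a hedged illustration with hypothetical tensors (not code from the pass): when the surrounding graph switches from NCHW to NHWC, layout inference lets concatenate adapt by rewriting its axis attribute instead of wrapping it in layout transforms.

from tvm import relay

a = relay.var("a", shape=(1, 32, 8, 8))    # NCHW
b = relay.var("b", shape=(1, 32, 8, 8))
out = relay.concatenate([a, b], axis=1)    # channel axis in NCHW

a2 = relay.var("a2", shape=(1, 8, 8, 32))  # NHWC
b2 = relay.var("b2", shape=(1, 8, 8, 32))
out2 = relay.concatenate([a2, b2], axis=3) # same op, adapted axis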
Thanks for the quick review. Yes, your comments make sense. I will rewrite section 3 to get those ideas in.
docs/dev/convert_layout.rst
Outdated
2. Motivation
*************

Lets look at a simple scenario to understand the complications that arise due to different layouts - Suppose we want to compile a Tensorflow NHWC graph for an ARM edge device. But, suppose we currently support only NCHW schedules in TOPI for ARM. So, there is a mismatch between framework layout and TOPI-supported layout. One way to deal with this mismatch is to insert layout transforms before each and after convolution, such that resulting convolution has NCHW input data layout and can use TOPI schedules. However, this can lead to performance degradation because of the presence of too many layout transforms.
Lets -> Let's
docs/dev/convert_layout.rst
Outdated
- No way to run TFLite graphs on Nvidia GPUs. TOPI has NCHW-only schedules for GPUs.
- Ever-complicating logic in AlterOpLayout for convolution to support different pairs of layout transformations.
- Sub-optimal performance for TF graphs due to extra layout transforms.
- Complication in third-party codegen integrations like TRT that prefers data layout to be in one format.
Suggested change:
Before: - Complication in third-party codegen integrations like TRT that prefers data layout to be in one format.
After: - Complication in third-party codegen integrations like TensorRT that prefers data layout to be in one format.
docs/dev/convert_layout.rst
Outdated
- Sub-optimal performance for TF graphs due to extra layout transforms.
- Complication in third-party codegen integrations like TRT that prefers data layout to be in one format.

To solve these problems, we introduced *ConvertLayout* pass that sets up the infrastructure to change the data layout of the whole graph with minimal number of data layout transforms. In ideal cases, we will have only 2 layout transforms, one at the start and one at the end. An example to show the transformation is below
Suggested change:
Before: To solve these problems, we introduced *ConvertLayout* pass that sets up the infrastructure to change the data layout of the whole graph with minimal number of data layout transforms. In ideal cases, we will have only 2 layout transforms, one at the start and one at the end. An example to show the transformation is below
After: To solve these problems, we introduced *ConvertLayout* pass that sets up the infrastructure to change the data layout of the whole graph with minimal number of data layout transforms. In ideal cases, we will have only 2 layout transforms for data, one at the start and one at the end. An example to show the transformation is below
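For readers who want to try the pass end-to-end, a minimal usage sketch (hedged: shapes are hypothetical, and API details vary across TVM versions; this PR's version of ConvertLayout took a single layout string, later versions take a per-op dict instead):

import tvm
from tvm import relay

# A tiny NHWC graph with one convolution.
data = relay.var("data", shape=(1, 56, 56, 64))
weight = relay.var("weight", shape=(3, 3, 64, 64))
conv = relay.nn.conv2d(data, weight, channels=64, kernel_size=(3, 3),
                       padding=(1, 1), data_layout="NHWC",
                       kernel_layout="HWIO")
mod = tvm.IRModule.from_expr(relay.Function([data, weight], conv))

# Ideally the converted module contains just two data layout_transforms:
# one at the start and one at the end.
with tvm.transform.PassContext(opt_level=3):
    mod = relay.transform.ConvertLayout("NCHW")(mod)
print(mod)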
docs/dev/convert_layout.rst
Outdated
3. Design
*********

ConvertLayout pass is heavily built upon Relay layout rewriter infrastructure. To understand the design, lets break the operators into 3 categories
Suggested change:
Before: ConvertLayout pass is heavily built upon Relay layout rewriter infrastructure. To understand the design, lets break the operators into 3 categories
After: ConvertLayout pass is heavily built upon Relay layout rewriter infrastructure. To understand the design, let's break the operators into 3 categories
Section is much clearer now, thanks for making the improvements! LGTM.
docs/dev/convert_layout.rst
Outdated
These steps happen for each operator in sequence, where ConvertLayout pass keeps on passing the new layouts to the next operator properties, finally resulting in modifying the whole graph operator-by-operator. Now, let's look at a couple of examples of how to define the two properties.

**FTVMConvertLayout - Python callback for layout alteration** - This is used for *heavily-layout sensitive* operators. For example, one can return a new convolution operator with new data and kernel layout. The other 2 components will infer layout and insert layout transforms if needed. One example for convolution operator is follows where we converting to NCHW layout.
typo: "One example for the convolution operator follows where we are converting to NCHW layout."
docs/dev/convert_layout.rst
Outdated
- Run FTVMConvertLayout property - This allows the developers to transform the original Relay expr into a new Relay expr with new layouts, allowing user-defined layout alteration. There is a python callback for developer's ease. This is used only for heavily-layout sensitive operators.
- Run FTVMInferCorretLayout property - We can view this as layout inference. It looks at the original input layout and the new input layouts, which are either coming from previous operator or from the FTVMConvertLayout modified expr (if it was used). This can be used by lightly-layout sensitive operators to adapt its attributes to new data layouts. Layout inference happens for each operator.
- Automatic insertion of layout transforms - The previos step - layout inference - sets the new layout for the input exprs. If these layouts are different from the original layouts, then this component automatically inserts a layout transform. Therefore, a developer does not need to do anything for this component.
previos -> previous
Looks good to me.
docs/dev/convert_layout.rst
Outdated
}

# After ConvertLayout - For data, there is a transform at the start and at the end.
# For weights, there are transforms to adapt to NCHW layout. These will be removed with FoldConstant pass.
Suggested change:
Before: # For weights, there are transforms to adapt to NCHW layout. These will be removed with FoldConstant pass.
After: # For weights, there are transforms to adapt to NCHW layout. These will be removed by FoldConstant pass.
Thanks @anijain2305 @jwfromm
* [Docs] Convert Layout pass. * Address comments. Section 3 massaging. * Address comments.
@yzhliu @zhiics @tqchen @trevor-m