Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RELAY] Add MergeCompilerRegions pass #5134

Merged
merged 18 commits into from
Mar 30, 2020
Merged

Conversation

mbaret
Copy link
Contributor

@mbaret mbaret commented Mar 23, 2020

This is part of the infrastructure necessary to support multiple outputs in the BYOC graph partitioning framework. It corresponds to step 3 of the flow outlined here and implements the algorithm described in this RFC. In summary, it merges together consecutive operators supported by the same target while ensuring that no data dependency issues are created.

This PR depends upon the utility functions provided by the AnnotationRegionSet PR and the Improved graph partitioner.

@mbaret
Copy link
Contributor Author

mbaret commented Mar 23, 2020

cc @zhiics @comaniac @masahi @tqchen

src/relay/transforms/merge_compiler_regions.cc Outdated Show resolved Hide resolved
src/relay/transforms/merge_compiler_regions.cc Outdated Show resolved Hide resolved
src/relay/transforms/annotate_target.cc Outdated Show resolved Hide resolved
@masahi
Copy link
Member

masahi commented Mar 24, 2020

@mbaret Sorry I haven't been following the development of a new partitioning algorithm. Is this related to annotator support for composite?

@mbaret
Copy link
Contributor Author

mbaret commented Mar 25, 2020

It isn't (although I hope to handle that next). This is to help support the case where we partition into a region that has multiple outputs.

Copy link
Member

@zhiics zhiics left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@comaniac please review the algorithm part as you also worked on it before.

src/relay/transforms/annotate_target.cc Outdated Show resolved Hide resolved
src/relay/transforms/annotate_target.cc Outdated Show resolved Hide resolved
src/relay/transforms/merge_compiler_regions.cc Outdated Show resolved Hide resolved
src/relay/transforms/merge_compiler_regions.cc Outdated Show resolved Hide resolved
src/relay/transforms/merge_compiler_regions.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm behavior seems fine to me.

src/relay/transforms/merge_compiler_regions.cc Outdated Show resolved Hide resolved
// By now, all the nodes have some sort of annotation.
// Region merger is ExprVisitor that will update the
// AnnotatedRegionSet.
RegionMerger merger(regions);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is confusing. RegionMerger seems like a mutator but it is actually a visitor. Maybe something like MergableRegionAnalyzer. Or it's better to combine this one with MergeAnnotation. The logic of MergeAnnotation is pretty simple and it can be done in this pass as well, making this pass a real "merger".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I can easily combine the two passes because of the visit order. I agree the name is slightly confusing, but it does merge 'Regions' in that you pass it an AnnotatedRegionSet and it returns a merged version of that set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regions it merges is not a Relay graph but a separate data structure. It looks weird for a "merger" to only refer the given Relay graph and manipulate a separate data structure.

Another way I can think of to make it more reasonable is moving RegionMerger to the AnnotateRegionSet and make it optional, like:

AnnotatedRegionSet regions = AnnotatedRegionSet::Create(
  expr_all_annotated, compiler_begin_op, compiler_end_op, true);

where the last argument indicates if the created regions should be merged.

Anyway for the time being, we could revisit this part in the future.

@trevor-m
Copy link
Contributor

trevor-m commented Mar 27, 2020

Hello, I am trying to use this (in combination with #5143) but I am getting a segfault every time I call MergeCompilerRegions. Here is a small repro script. I expect the entire model to be partitioned into a single subgraph.

from tvm import relay
import mxnet as mx
from mxnet.gluon.model_zoo.vision import get_model
from tvm.relay import op as reg

def _register_external_op_helper(op_name, supported=True):
    @reg.register(op_name, "target.test")
    def _func_wrapper(attrs, args):
        return supported
    return _func_wrapper

_register_external_op_helper("nn.conv2d")
_register_external_op_helper("nn.dense")
_register_external_op_helper("nn.relu")
_register_external_op_helper("add")
_register_external_op_helper("multiply")
_register_external_op_helper("nn.bias_add")
_register_external_op_helper("nn.batch_flatten")
_register_external_op_helper("nn.max_pool2d")
_register_external_op_helper("nn.dropout")
_register_external_op_helper("nn.batch_norm")
_register_external_op_helper("nn.global_avg_pool2d")

def test_resnet():
    block = get_model('resnet18_v1', pretrained=True)
    mod, params = relay.frontend.from_mxnet(block, shape={'data': (1, 3, 224, 224)}, dtype='float32')
    mod = relay.transform.AnnotateTarget("test")(mod)
    mod = relay.transform.MergeCompilerRegions()(mod)
    mod = relay.transform.PartitionGraph()(mod)
    print(mod)

test_resnet()

@comaniac
Copy link
Contributor

@mbaret could you reproduce the failure and try to resolve it? It crashed before partition graph so it should not be the issue of multiple output support.

@mbaret
Copy link
Contributor Author

mbaret commented Mar 27, 2020

@mbaret could you reproduce the failure and try to resolve it? It crashed before partition graph so it should not be the issue of multiple output support.

I've found the issue (was to do with changes I made to the merging during code review on the AnnotatedRegionSet patch). I've made the necessary change in this PR, but we'll also need this fix PR. I've also added in a more extensive test (based on the one in your PR) to verify the issue is resolved.

@zhiics
Copy link
Member

zhiics commented Mar 27, 2020

Can you just incorporate that change as it is really short?

@zhiics
Copy link
Member

zhiics commented Mar 27, 2020

also please fix @comaniac's comments

@mbaret
Copy link
Contributor Author

mbaret commented Mar 27, 2020

I've incorporated that fix into this PR now.

mbaret and others added 15 commits March 27, 2020 16:29
This pass is part of the flow to support creating compiler
regions with multiple outputs. It should be called after
AnnotateTarget and will merge together regions that share
the same target to create larger compiler regions that can
be off-loaded to external codegens.

This pass implements an algorithm to ensure that during the
merging, no data dependency issues are created. See the tests
for an example of this case.

Co-authored-by: Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
Co-authored-by: Manupa Karunaratne    <manupa.karunaratne@arm.com>

Change-Id: Ibd99083564608d888482f57c5080109f3eefec88
This alters the behaviour of the AnnotateTarget
pass to enforce the property that all compiler
annotations exist along a single data flow edge.
Specifically, this means they should have exactly
one parent and one child.

Change-Id: I0e74803a77767f4f377d17755a13a74a30909797
Deleting items from a list while iterating it seems to
result in undefined behaviour which sometimes segfaults.
This makes sure all the item deletion happens separately.
@trevor-m
Copy link
Contributor

trevor-m commented Mar 27, 2020

@mbaret Thanks for investigating and making the fixes! However, I'm still getting the segfault on call to MergeCompilerRegions in that example I posted earlier using your fixes. Are you able to run that example successfully?

@mbaret
Copy link
Contributor Author

mbaret commented Mar 27, 2020

There's still another issue to resolve with this (in AnnotateRestDefault). I'll take a look at it on Monday.

@tqchen tqchen assigned tqchen and unassigned tqchen Mar 30, 2020
@tqchen tqchen merged commit 0212138 into apache:master Mar 30, 2020
@tqchen
Copy link
Member

tqchen commented Mar 30, 2020

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020
* [RELAY] Add MergeCompilerRegions pass

This pass is part of the flow to support creating compiler
regions with multiple outputs. It should be called after
AnnotateTarget and will merge together regions that share
the same target to create larger compiler regions that can
be off-loaded to external codegens.

This pass implements an algorithm to ensure that during the
merging, no data dependency issues are created. See the tests
for an example of this case.

Co-authored-by: Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
Co-authored-by: Manupa Karunaratne    <manupa.karunaratne@arm.com>

Change-Id: Ibd99083564608d888482f57c5080109f3eefec88

* [RELAY] Annotate compiler_ends on each edge

This alters the behaviour of the AnnotateTarget
pass to enforce the property that all compiler
annotations exist along a single data flow edge.
Specifically, this means they should have exactly
one parent and one child.

Change-Id: I0e74803a77767f4f377d17755a13a74a30909797

* Fix comment

* Rebase *Node::make

* Moved block outside for loop

* Code style

* Update make API

* Remove comment

* Remove redundant 'else's

* Make one line

* Fix comment

* RefWrite

* Fix merge ordering

* Add the RFC example as a test

* [FIX] Fixed merging behaviour in AnnotateRegionSet

Deleting items from a list while iterating it seems to
result in undefined behaviour which sometimes segfaults.
This makes sure all the item deletion happens separately.

* Added checks

* Move comment

* Update comments
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020
* [RELAY] Add MergeCompilerRegions pass

This pass is part of the flow to support creating compiler
regions with multiple outputs. It should be called after
AnnotateTarget and will merge together regions that share
the same target to create larger compiler regions that can
be off-loaded to external codegens.

This pass implements an algorithm to ensure that during the
merging, no data dependency issues are created. See the tests
for an example of this case.

Co-authored-by: Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
Co-authored-by: Manupa Karunaratne    <manupa.karunaratne@arm.com>

Change-Id: Ibd99083564608d888482f57c5080109f3eefec88

* [RELAY] Annotate compiler_ends on each edge

This alters the behaviour of the AnnotateTarget
pass to enforce the property that all compiler
annotations exist along a single data flow edge.
Specifically, this means they should have exactly
one parent and one child.

Change-Id: I0e74803a77767f4f377d17755a13a74a30909797

* Fix comment

* Rebase *Node::make

* Moved block outside for loop

* Code style

* Update make API

* Remove comment

* Remove redundant 'else's

* Make one line

* Fix comment

* RefWrite

* Fix merge ordering

* Add the RFC example as a test

* [FIX] Fixed merging behaviour in AnnotateRegionSet

Deleting items from a list while iterating it seems to
result in undefined behaviour which sometimes segfaults.
This makes sure all the item deletion happens separately.

* Added checks

* Move comment

* Update comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants