Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Relay] Added Merge Composite pass #4771

Merged
merged 23 commits into from
Feb 10, 2020
Merged

Conversation

mbaret
Copy link
Contributor

@mbaret mbaret commented Jan 23, 2020

This adds a pass, MergeComposite, which merges patterns (expressed as Relay expressions). The detected patterns are wrapped in a function call which is marked with a 'Composite' attribute that names the pattern.

This is primarily for use with the external codegen infrastructure in the case where a combination of Relay ops map to a single external codegen op. It is not the same as fusion.

Further discussion on this PR can be found at https://discuss.tvm.ai/t/rfc-external-codegen-defining-composite-relay-operators/5470.

@mbaret mbaret requested review from zhiics and masahi January 23, 2020 11:50
@masahi
Copy link
Member

masahi commented Jan 24, 2020

ok reading your original proposal carefully I understand that I can look up 'Composite' attribute to know if a particular function has a pattern I'm looking for. Of course I still need to traverse the arguments, but I can remove the "detection" logic from "traversal" logic in DetectFusedConv2DBiasReLU.

It seems I have to make only minimal change to make use of this feature in my PR. Also it enables removing my custom annotator. Sounds great!

@masahi
Copy link
Member

masahi commented Jan 24, 2020

ok I already have a working test case for detecting conv + bias + relu (Note the "Composite" attribute). Now I really want to hook this up with "Compiler" and "ExternalSymbol" attributes to enable codegen.

def @main(%data: Tensor[(1, 3, 224, 224), float32]) -> Tensor[(1, 16, 224, 224), float32] {
  %2 = fn (%data1: Tensor[(1, 3, 224, 224), float32], %weight: Tensor[(16, 3, 3, 3), float32], %bias: Tensor[(16, 1, 1), float32], Primitive=1, Composite="dnnl.conv_bias_relu") -> Tensor[(1, 16, 224, 224), float32] {
    %0 = nn.conv2d(%data1, %weight, padding=[1, 1], channels=16, kernel_size=[3, 3]) /* ty=Tensor[(1, 16, 224, 224), float32] */;
    %1 = add(%0, %bias) /* ty=Tensor[(1, 16, 224, 224), float32] */;
    nn.relu(%1) /* ty=Tensor[(1, 16, 224, 224), float32] */
  };
  %3 = %2(%data, meta[relay.Constant][0] /* ty=Tensor[(16, 3, 3, 3), float32] */ /* ty=Tensor[(16, 3, 3, 3), float32] */, meta[relay.Constant][1] /* ty=Tensor[(16, 1, 1), float32] */ /* ty=Tensor[(16, 1, 1), float32] */) /* ty=Tensor[(1, 16, 224, 224), float32] */;
  %6 = fn (%data2: Tensor[(1, 16, 224, 224), float32], %weight1: Tensor[(16, 16, 3, 3), float32], %bias1: Tensor[(16, 1, 1), float32], Primitive=1, Composite="dnnl.conv_bias_relu") -> Tensor[(1, 16, 224, 224), float32] {
    %4 = nn.conv2d(%data2, %weight1, padding=[1, 1], channels=16, kernel_size=[3, 3]) /* ty=Tensor[(1, 16, 224, 224), float32] */;
    %5 = add(%4, %bias1) /* ty=Tensor[(1, 16, 224, 224), float32] */;
    nn.relu(%5) /* ty=Tensor[(1, 16, 224, 224), float32] */
  };
  %6(%3, meta[relay.Constant][2] /* ty=Tensor[(16, 16, 3, 3), float32] */ /* ty=Tensor[(16, 16, 3, 3), float32] */, meta[relay.Constant][3] /* ty=Tensor[(16, 1, 1), float32] */ /* ty=Tensor[(16, 1, 1), float32] */) /* ty=Tensor[(1, 16, 224, 224), float32] */
}

@mbaret
Copy link
Contributor Author

mbaret commented Jan 24, 2020

To hook it up end-to-end will, for now, require writing a custom annotation pass that can recognise composite functions and understand that they need to be treated in the same way as operators. I've got a generic pass that does this, but it builds on top of @zhiics annotation PR which has since been withdrawn. If we agree that this merge_composite pass is useful then I can discuss with zhiics how we want to move forward with the annotator.

@masahi
Copy link
Member

masahi commented Jan 24, 2020

If there are multiple patterns to detect, can a composite function detected by one pattern be used as a part of a match for another patterns?

Not sure if this is useful though. Maybe it enables breaking up a big pattern into chunks or pattern reuse.

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. We may start from this PR and gradually figure out useful patterns for op-level annotation.

One suggestion is that could we limit this pass to only work for external functions? From the unit tests I suppose this pass works for an entire Relay program, but in practice we want it to work on the subgraph in compiler_begin and compiler_end.

src/relay/pass/merge_composite.cc Outdated Show resolved Hide resolved
@mbaret
Copy link
Contributor Author

mbaret commented Jan 27, 2020

It's the intention that this can be called on the entire Relay graph so that it can be used to help implement a generic annotation pass (one that is aware of composite functions). That way we can define functions similar to the 'Is Supported?' mechanism in the original annotation PR (since taken down) where you could declare in Python whether an operator was supported. That could be extended to say whether a composite function is supported without having to add pattern matching code to the annotator.

The problem there is the case where composite functions do not end up in an external function after partitioning. My thinking is to have some legalize pass after the partitioning that removes the composite functions from sections of the graph not marked as external.

@comaniac
Copy link
Contributor

So you want the flow to be:

CompositeMerge -> Annotation -> Partitioning. 

I agree that this would make annotation generic and straightforward, although it seems like we don't need annotation anymore if we specify all patterns including single ops. While there are lots of approaches to do so, maybe we could accept this solution first and consider the further steps. What you do think? @zhiics

Also, please help clarify the question asked by @masahi and me about multiple matching. Thanks.

@mbaret
Copy link
Contributor Author

mbaret commented Jan 28, 2020

I think we probably can get rid of the annotation pass. If we expose the logic that determines whether an operator is supported by a given external compiler to the partitioning pass, then this will also mean we don't need to define a pattern for every single operator. Just define patterns for composite operators and have a common way of expressing whether or not a given operator (composite or otherwise) is supported by an external compiler.

That would lose some of the control that could have been gained by writing a custom annotation pass, but I'm not sure what use cases would leverage that.

@comaniac
Copy link
Contributor

That's exactly what I was thinking -- we can debate if we should merge annotation to this pass, or just get rid of the annotation and make this pass general enough. Anyway can be discussed in the follow-up steps.

@masahi
Copy link
Member

masahi commented Jan 28, 2020

do we still need partitioning if we get rid of the annotation pass? At least for simple patterns like conv + bias + relu, I can get the same composite (or partitioning, whatever) without writing a clunky custom annotator and doing the partitioning pass.

@mbaret
Copy link
Contributor Author

mbaret commented Jan 28, 2020

It's important to emphasise the difference between partitioned subgraphs (which correspond to 'External' functions), and the merged patterns here (which correspond to 'Composite' functions).

Composite functions are a way to group together relay patterns which correspond to a particular external codegen function (eg. qnn_conv2d + bias_add + requantize maps to a single convolution function in Arm Compute Library). External functions are a way to represent a subgraph that should not go via TVM codegen but should instead go through an external compiler (eg. DNNL).

For libraries like ACL or DNNL, it happens to be the case that you can just put every supported operator/composite function in it's own external function (partitioned subgraph) and because the library calls are standalone that works well enough.

A more complex external compiler though may want to do some advanced fusion or manipulation which means it needs to see the full supported subgraph, not just a single operator/composite function at a time. It will produce some binary artifact that computes the entire subgraph rather than a particular operator within it. All that complexity should be handled by the external compiler's codegen pass, the responsibility of the partitioner is to ensure that pass receives a valid subgraph that is as large as possible while still being fully supported and introducing no dataflow problems.

In summary, I don't think this pass replaces partitioning which is essential for the 'complex compiler' case, but I think it may replace the annotation pass.

@masahi
Copy link
Member

masahi commented Jan 28, 2020

thanks, such "complex compiler" use cases sounds quite interesting and the difference in Composite/External there makes a lot of sense. I also understand your earlier comments better now.

@comaniac
Copy link
Contributor

While @mbarrett97 has pointed out the difference between composite/annotation and partition, I just provide one more clarification: The main task for the partition pass is transforming compiler annotations in a Relay graph to external functions and it is compiler agnostic. It means we may even consider the possibility of partitioning a Relay graph to more than one external compilers.

@mbaret
Copy link
Contributor Author

mbaret commented Feb 3, 2020

I've updated the PR to account for the priority ordering of patterns. You no longer specify a pattern map, but a list of tuples (pattern_name, pattern) where the order or the tuples corresponds to the order of the matching. To get this to work, I had to pass strings between Python/C++ requiring this change: #4806.

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the priority mechanism for multiple patterns is sufficient for now. Overall LGTM.
Just one minor question left in comments.

python/tvm/relay/transform.py Outdated Show resolved Hide resolved
@mbaret
Copy link
Contributor Author

mbaret commented Feb 4, 2020

I've done a big clean up + simplified the code a bit and added some comments.

Change-Id: I43a93995bbf10530399900c992aa99dd4ae4575f
Change-Id: I0faef77a66c55f83f09e6e47c561ffaea63dedfa
Change-Id: Ifdd02c12087a7e1a0a9b54825669bc0de8f13c3d
This is not necessary now that ExtractPattern
can fulfill the same purpose.

Change-Id: I14dc020afa8e50f2df4c0a2efb88a011987f8196
Change-Id: I8b50f0c9069aa1bcaccbe68eb421031f01a64842
Change-Id: Ib1959a35c856e7ea5639de2e4ef314a54f44caf5
Change-Id: I2b7f273db275964ec0e9820560663f0808adee79
Change-Id: I4eeea3ce723d3ba337d110dcc690377daebe8626
Change-Id: I07f5392c0e95cfe3cfa5c333703cc6f82d6034fb
Change-Id: I5c5d62d3cd57f72508b30b926f72091ae6f0d1cc
Change-Id: I23a7897ca15a7cd076db5039dc653a4b8c27e803
Change-Id: I377f0a1c1ac70f3b8d7584b0c49bddc8c6c134ef
Change-Id: I78e36d805e8ed6b55e61d490212a967c857554a4
Change-Id: Ib1d800409fca4c1834c7fe0cab5a26ab99a26820
@zhiics
Copy link
Member

zhiics commented Feb 10, 2020

@mbarrett97 Considering that this is a relatively standalone pass and it helps fusion with external functions, I think it is okay to take it in.

For the pattern matching cases, one possible way I am thinking of is automatically generating some possible pattern given some metadata.

@comaniac and I will start an RFC to discuss how we can add the whitelist based annotation back. It probably will leverage the pattern matching here.

@masahi @anijain2305 PTAL and share your comments if there is any. Thanks.

@anijain2305
Copy link
Contributor

My only request is to add more test cases. In my experience, things get very ugly as the networks get bigger and the things that seems corner cases become very common. But, I am ok with the scope of this PR and delaying aggressive testing to later PR. I think it helps in quick flush of e2e pileline.

@anijain2305
Copy link
Contributor

Oh just saw. The tests are already added. Thanks, its good from my side :)

@masahi
Copy link
Member

masahi commented Feb 10, 2020

@zhiics @comaniac It is worth discussing if we can use composite and partitioning passes to remove the annotation pass, as mentioned by @mbarrett97

@masahi masahi merged commit 0dbe70c into apache:master Feb 10, 2020
@masahi
Copy link
Member

masahi commented Feb 10, 2020

Thanks @mbarrett97 @comaniac @zhiics @anijain2305 this is merged.

@mbaret
Copy link
Contributor Author

mbaret commented Feb 10, 2020

@masahi @comaniac @zhiics Thanks for the reviews. An RFC on alternative annotation mechanisms would be great.

alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 26, 2020
* [Relay] Added MergeComposite pass

This pass allows for patterns to be wrapped
in a function marked with 'Composite' and a
composite function name. This is intended to be
used with the external codegen for the cases where
an external operator maps to multiple Relay
operators. In that case, the mapping can be expressed
as a pattern and assigned a name.

For more information on this pass and its motivation,
see the RFC:
https://discuss.tvm.ai/t/rfc-external-codegen-defining-composite-relay-operators/5470

Change-Id: Icb1b803a9f0ac57c529143200228f3bb5793afc0

* [Relay] Merge composite tests

Added tests for the merge_composite pass.

Change-Id: I1728b4a05b0c1c36140a40f1afe028fde62185dd

* Merge composite additional test

Change-Id: I9bc7d6053c575e9468ac5abc31214c6ad8507e46

* Support priority order in merge_composite

The order in which the patterns are matched
was currently random as an unordered_map was
used to store the pattern table. This uses
arrays instead so that a distinct priority
order of matching can be defined. Additional
tests have also been added to verify this
behaviour.

Change-Id: Ief347df4262639138d5d9d7c8cee7ef233af7b56

* Improved merge composite docs

Change-Id: Ie3a72045ecc3f13ad3c302fbdf192b7296a306a8

* Removed unused variable

Change-Id: I7814d5fde368ffaf1b3d6d806060c774c7720364

* Remove unnecessary op check

Change-Id: I38e78d2acd5b86cb8e837be72ff9d72cd10bcf33

* Improve styling on composite function creation

Change-Id: I37add1c3134e0b5d5085fe1eb9daf8e06890fa8c

* Comment reword

Change-Id: Ie05872dcbbe0c3e1190b0597083b9a64e6b66c66

* Stylistic changes to avoid std::move

Change-Id: I43a93995bbf10530399900c992aa99dd4ae4575f

* Relax a check in ExtractPattern

Change-Id: I0faef77a66c55f83f09e6e47c561ffaea63dedfa

* Remove new line

Change-Id: Ifdd02c12087a7e1a0a9b54825669bc0de8f13c3d

* Removed MatchPattern from MergeComposite

This is not necessary now that ExtractPattern
can fulfill the same purpose.

Change-Id: I14dc020afa8e50f2df4c0a2efb88a011987f8196

* Removed a new line

Change-Id: I8b50f0c9069aa1bcaccbe68eb421031f01a64842

* Improved docs for merge composite

Change-Id: Ib1959a35c856e7ea5639de2e4ef314a54f44caf5

* Fixed free vars in test

Change-Id: I2b7f273db275964ec0e9820560663f0808adee79

* Handle case where root arg might not be a call

Change-Id: I4eeea3ce723d3ba337d110dcc690377daebe8626

* Removed blank line

Change-Id: I07f5392c0e95cfe3cfa5c333703cc6f82d6034fb

* Change to CHECK_EQ

Change-Id: I5c5d62d3cd57f72508b30b926f72091ae6f0d1cc

* Revised a conditional

Change-Id: I23a7897ca15a7cd076db5039dc653a4b8c27e803

* Improved doc styling

Change-Id: I377f0a1c1ac70f3b8d7584b0c49bddc8c6c134ef

* Fail extraction if vars conflict

Change-Id: I78e36d805e8ed6b55e61d490212a967c857554a4

* Added further merge composite tests

Change-Id: Ib1d800409fca4c1834c7fe0cab5a26ab99a26820

Co-authored-by: lhutton1 <35535092+lhutton1@users.noreply.github.com>
alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 28, 2020
* [Relay] Added MergeComposite pass

This pass allows for patterns to be wrapped
in a function marked with 'Composite' and a
composite function name. This is intended to be
used with the external codegen for the cases where
an external operator maps to multiple Relay
operators. In that case, the mapping can be expressed
as a pattern and assigned a name.

For more information on this pass and its motivation,
see the RFC:
https://discuss.tvm.ai/t/rfc-external-codegen-defining-composite-relay-operators/5470

Change-Id: Icb1b803a9f0ac57c529143200228f3bb5793afc0

* [Relay] Merge composite tests

Added tests for the merge_composite pass.

Change-Id: I1728b4a05b0c1c36140a40f1afe028fde62185dd

* Merge composite additional test

Change-Id: I9bc7d6053c575e9468ac5abc31214c6ad8507e46

* Support priority order in merge_composite

The order in which the patterns are matched
was currently random as an unordered_map was
used to store the pattern table. This uses
arrays instead so that a distinct priority
order of matching can be defined. Additional
tests have also been added to verify this
behaviour.

Change-Id: Ief347df4262639138d5d9d7c8cee7ef233af7b56

* Improved merge composite docs

Change-Id: Ie3a72045ecc3f13ad3c302fbdf192b7296a306a8

* Removed unused variable

Change-Id: I7814d5fde368ffaf1b3d6d806060c774c7720364

* Remove unnecessary op check

Change-Id: I38e78d2acd5b86cb8e837be72ff9d72cd10bcf33

* Improve styling on composite function creation

Change-Id: I37add1c3134e0b5d5085fe1eb9daf8e06890fa8c

* Comment reword

Change-Id: Ie05872dcbbe0c3e1190b0597083b9a64e6b66c66

* Stylistic changes to avoid std::move

Change-Id: I43a93995bbf10530399900c992aa99dd4ae4575f

* Relax a check in ExtractPattern

Change-Id: I0faef77a66c55f83f09e6e47c561ffaea63dedfa

* Remove new line

Change-Id: Ifdd02c12087a7e1a0a9b54825669bc0de8f13c3d

* Removed MatchPattern from MergeComposite

This is not necessary now that ExtractPattern
can fulfill the same purpose.

Change-Id: I14dc020afa8e50f2df4c0a2efb88a011987f8196

* Removed a new line

Change-Id: I8b50f0c9069aa1bcaccbe68eb421031f01a64842

* Improved docs for merge composite

Change-Id: Ib1959a35c856e7ea5639de2e4ef314a54f44caf5

* Fixed free vars in test

Change-Id: I2b7f273db275964ec0e9820560663f0808adee79

* Handle case where root arg might not be a call

Change-Id: I4eeea3ce723d3ba337d110dcc690377daebe8626

* Removed blank line

Change-Id: I07f5392c0e95cfe3cfa5c333703cc6f82d6034fb

* Change to CHECK_EQ

Change-Id: I5c5d62d3cd57f72508b30b926f72091ae6f0d1cc

* Revised a conditional

Change-Id: I23a7897ca15a7cd076db5039dc653a4b8c27e803

* Improved doc styling

Change-Id: I377f0a1c1ac70f3b8d7584b0c49bddc8c6c134ef

* Fail extraction if vars conflict

Change-Id: I78e36d805e8ed6b55e61d490212a967c857554a4

* Added further merge composite tests

Change-Id: Ib1d800409fca4c1834c7fe0cab5a26ab99a26820

Co-authored-by: lhutton1 <35535092+lhutton1@users.noreply.github.com>
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2020
* [Relay] Added MergeComposite pass

This pass allows for patterns to be wrapped
in a function marked with 'Composite' and a
composite function name. This is intended to be
used with the external codegen for the cases where
an external operator maps to multiple Relay
operators. In that case, the mapping can be expressed
as a pattern and assigned a name.

For more information on this pass and its motivation,
see the RFC:
https://discuss.tvm.ai/t/rfc-external-codegen-defining-composite-relay-operators/5470

Change-Id: Icb1b803a9f0ac57c529143200228f3bb5793afc0

* [Relay] Merge composite tests

Added tests for the merge_composite pass.

Change-Id: I1728b4a05b0c1c36140a40f1afe028fde62185dd

* Merge composite additional test

Change-Id: I9bc7d6053c575e9468ac5abc31214c6ad8507e46

* Support priority order in merge_composite

The order in which the patterns are matched
was currently random as an unordered_map was
used to store the pattern table. This uses
arrays instead so that a distinct priority
order of matching can be defined. Additional
tests have also been added to verify this
behaviour.

Change-Id: Ief347df4262639138d5d9d7c8cee7ef233af7b56

* Improved merge composite docs

Change-Id: Ie3a72045ecc3f13ad3c302fbdf192b7296a306a8

* Removed unused variable

Change-Id: I7814d5fde368ffaf1b3d6d806060c774c7720364

* Remove unnecessary op check

Change-Id: I38e78d2acd5b86cb8e837be72ff9d72cd10bcf33

* Improve styling on composite function creation

Change-Id: I37add1c3134e0b5d5085fe1eb9daf8e06890fa8c

* Comment reword

Change-Id: Ie05872dcbbe0c3e1190b0597083b9a64e6b66c66

* Stylistic changes to avoid std::move

Change-Id: I43a93995bbf10530399900c992aa99dd4ae4575f

* Relax a check in ExtractPattern

Change-Id: I0faef77a66c55f83f09e6e47c561ffaea63dedfa

* Remove new line

Change-Id: Ifdd02c12087a7e1a0a9b54825669bc0de8f13c3d

* Removed MatchPattern from MergeComposite

This is not necessary now that ExtractPattern
can fulfill the same purpose.

Change-Id: I14dc020afa8e50f2df4c0a2efb88a011987f8196

* Removed a new line

Change-Id: I8b50f0c9069aa1bcaccbe68eb421031f01a64842

* Improved docs for merge composite

Change-Id: Ib1959a35c856e7ea5639de2e4ef314a54f44caf5

* Fixed free vars in test

Change-Id: I2b7f273db275964ec0e9820560663f0808adee79

* Handle case where root arg might not be a call

Change-Id: I4eeea3ce723d3ba337d110dcc690377daebe8626

* Removed blank line

Change-Id: I07f5392c0e95cfe3cfa5c333703cc6f82d6034fb

* Change to CHECK_EQ

Change-Id: I5c5d62d3cd57f72508b30b926f72091ae6f0d1cc

* Revised a conditional

Change-Id: I23a7897ca15a7cd076db5039dc653a4b8c27e803

* Improved doc styling

Change-Id: I377f0a1c1ac70f3b8d7584b0c49bddc8c6c134ef

* Fail extraction if vars conflict

Change-Id: I78e36d805e8ed6b55e61d490212a967c857554a4

* Added further merge composite tests

Change-Id: Ib1d800409fca4c1834c7fe0cab5a26ab99a26820

Co-authored-by: lhutton1 <35535092+lhutton1@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants