
[TVMC][microNPU] tvmc option for printing which operators are offloaded to Ethos-U #13212

Conversation

@sergio-grovety (Contributor) commented Oct 27, 2022

Added an option to tvmc and the Ethos-U backend for printing, to the console or to a file, which operators from the initial graph are offloaded to Ethos-U and which are not. It produces a line-by-line dump of the initial model IR, indicating which operations are ported to Ethos-U.

The compiler option "--target-ethos-u-dump_npu_functions_coverage" has been replaced by the more generic "--dump-offloads" with the same meaning.

Usage

# output to console:
tvmc compile --target=ethos-u,cmsis-nn,c \
    --dump-offloads=- \
    ........

# output to file:
tvmc compile --target=ethos-u,cmsis-nn,c \
    --dump-offloads=<file path> \
    ........

Example output:

...
Total number of operators and distribution by targets
Total: 211
target1: 198
target2: 10
generic: 3

'target1 <- target1.qnn_conv2d'
'target1 <- %0 = qnn.conv2d(%tfl.quantize, %v_param_1, ...'
'target1 <- %1 = nn.bias_add(%0, %v_param_2, axis=3);'
'target1 <- %2 = qnn.requantize(%1, meta[relay.Constant]...'
'target2 <- target2.reshape'
'target2 <- %3 = reshape(%2, newshape=[1, 1001]);'
'generic <- %4 = nn.pad(%3, -128f, pad_width=[[0, 0], [1, 1]...'
...
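For illustration, the summary header in the dump above can be reproduced mechanically from the per-operator annotation lines. The following is only a sketch, not TVMC's actual implementation; the function name `summarize_offloads` and the exact input format are assumptions:

```python
from collections import Counter

def summarize_offloads(annotated_lines):
    """Build the 'distribution by targets' summary from annotation lines.

    Each line is assumed to look like "target1 <- %0 = qnn.conv2d(...)".
    Composite header lines such as "target1 <- target1.qnn_conv2d" are
    skipped, since they name a group rather than a single Relay operator.
    """
    counts = Counter()
    for line in annotated_lines:
        target, _, op = line.partition(" <- ")
        target, op = target.strip(), op.strip()
        # Skip composite-function headers like "target1.qnn_conv2d".
        if op.startswith(target + "."):
            continue
        counts[target] += 1
    total = sum(counts.values())
    out = ["Total number of operators and distribution by targets",
           f"Total: {total}"]
    out += [f"{t}: {n}" for t, n in sorted(counts.items())]
    return "\n".join(out)
```

Feeding it the annotated lines of a dump would yield the same "Total: ..." block shown above.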

Previous usage (with the now-replaced option):

# output to console:
tvmc compile --target=ethos-u,cmsis-nn,c \
    --target-ethos-u-dump_npu_functions_coverage=- \
    ........

# output to file:
tvmc compile --target=ethos-u,cmsis-nn,c \
    --target-ethos-u-dump_npu_functions_coverage=<file path> \
    ........

Example output:

...
ethos-u <- %1 = nn.bias_add(%0, %v_param_2, axis=3);
ethos-u <- %2 = qnn.requantize(%1, meta[relay.Constant][1], 0, 0.0235294f, -128, axis=3, out_dtype="int8");
ethos-u <- %3 = clip(%2, a_min=-128f, a_max=127f);
....

@tvm-bot (Collaborator) commented Oct 27, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@sergio-grovety (Contributor, Author) commented

@leandron could you please check it?

@lhutton1 (Contributor) left a comment

Thanks @sergey-grovety, this looks like a very helpful addition for users to see how their model is partitioned! I took a quick look and had a couple of high-level questions.

The option for printing the operators currently seems very specific to the NPU; I'm wondering if we would see more benefit in adding this as a generic option within TVMC without too many changes? Not only would it benefit other targets, it would make the option more robust and easier to find from a user's POV. It's currently possible to save the partitioned graph in TVMC using --dump-code="relay", so perhaps print_operators_offloading could be called at a similar point (given a command line argument such as --dump-offloads) rather than from within the NPU-specific code, WDYT?

I'm also wondering how much information a user would be able to understand from the Relay output if they're unfamiliar with it. For example, if there was a TFLite graph consisting of a single CONV2D operation, it seems like the current output would display 4 operations being offloaded to the NPU (qnn.conv2d -> bias_add -> requantize -> clip), which might be a bit confusing for a non-experienced user. Linking back to the original TFLite operation might be tricky, but we have the NPU composite operations that have a similar relationship. Perhaps we could display this like below with indentation indicating Relay operations that make up the composite operation? Happy to hear other suggestions though :)

ethos-u    <-   ethos-u.qnn_conv2d
ethos-u    <-       %1 = nn.bias_add(%0, %v_param_2, axis=3);
ethos-u    <-       %2 = qnn.requantize(%1, meta[relay.Constant][1], 0, 0.0235294f, -128, axis=3, out_dtype="int8");
ethos-u    <-       %3 = clip(%2, a_min=-128f, a_max=127f);

Also cc @ekalda, @ashutosh-arm who may be interested

src/relay/backend/contrib/ethosu/compiler_attrs.cc (outdated review comment, resolved)
@ashutosh-arm (Contributor) commented Nov 2, 2022

I agree with @lhutton1 here. The knob --dump-code="relay" provides a way to visualize the post-partition Relay model. The main function in this Relay model lists the sequence of calls to partitioned functions with appropriate target annotations. Does the new knob print_operators_offloading serve any additional purpose that I might have missed @sergey-grovety? To be fair, I have only read the PR description 😅

@arina-grovety (Contributor) commented

> I agree with @lhutton1 here. The knob --dump-code="relay" provides a way to visualize the post-partition relay model. main function in this relay model lists sequence of calls to partitioned functions with appropriate target annotations. Does the new knob print_operators_offloading serve any additional purpose that I might have missed @sergey-grovety ? To be fair, I have only read the PR description 😅

Hi @ashutosh-arm, sorry for my late reply. As I see it, the main purpose of the new option is to show the correspondence between the operators from the original graph and the final operations offloaded to the target. This is displayed as a sequential printout of the source Relay's operations, with the composites from which they are derived and the target to which they are offloaded.

Another point worth highlighting is that the partitioned Relay, which is the output of --dump-code="relay", has Relay operation numbers (%...) that differ from those in the initial Relay. Therefore, the new knob, which keeps the initial Relay's numbering, can be handy.

Here is an example output with the new option:

'ethos-u    <- ethos-u.qnn_conv2d'
'ethos-u    <-        %204 = qnn.conv2d(%203, %v_param_105, -128, 0, 0.0235294f, ...'
'ethos-u    <-        %205 = nn.bias_add(%204, %v_param_106, axis=3);'
'ethos-u    <-        %206 = qnn.requantize(%205, meta[relay.Constant][105], 0, ...'
'ethos-u    <- ethos-u.reshape'
'ethos-u    <-        %207 = reshape(%206, newshape=[1, 1001]);'

@arina-grovety (Contributor) commented Nov 9, 2022

> The option for printing the operators currently seems very specific to the NPU, I'm wondering if we would see more benefit adding this as a generic option within TVMC without too many changes? Not only would it benefit other targets, it would make the option more robust and easier to find from a user POV. Its currently possible to save the partitioned graph in TVMC using --dump-code="relay", perhaps print_operators_offloading could be called at a similar point (given a command line argument such as --dump-offloads) rather than from within the NPU specific code, WDYT?

Hi @lhutton1,

Do you propose to implement this function for all targets, or just add a general compiler option, leaving the implementation currently only in the ethos-u backend?
Right now, this function is specific to ethos-u and is handled in the ethos-u backend.
As far as I can tell it won't be a problem to implement the function for all targets, but of course I could be wrong.

Here is an example of how the output would look if the model is compiled for the target "llvm":

   'generic    <-   %0 = qnn.conv2d(%tfl.quantize, %v_param_1, ...'
   'generic    <-   %1 = nn.bias_add(%0, %v_param_2, axis=3);'
   'generic    <-   %2 = qnn.requantize(%1, meta[relay.Constant]...'

And for targets "ethos-u,cmsis-nn,c"

    'ethos-u    <- ethos-u.qnn_conv2d'
    'ethos-u    <-        %204 = qnn.conv2d(%203, %v_param_105, -128, 0, 0.0235294f, ...'
    'ethos-u    <-        %205 = nn.bias_add(%204, %v_param_106, axis=3);'
    'ethos-u    <-        %206 = qnn.requantize(%205, meta[relay.Constant][105], 0, ...'
    'ethos-u    <- ethos-u.reshape'
    'ethos-u    <-        %207 = reshape(%206, newshape=[1, 1001]);'
    'cmsis-nn   <- cmsis-nn.qnn_softmax'
    'cmsis-nn   <-        %208 = qnn.dequantize(%207, 0.0775722f, -61);'
    'cmsis-nn   <-        %209 = nn.softmax(%208);'
    'cmsis-nn   <-        qnn.quantize(%209, 0.00390625f, -128, out_dtype="int8")'

@ashutosh-arm (Contributor) commented

> Hi @ashutosh-arm sorry for my late reply. As I see it, the main purpose of the new option is to show the correspondence between the operators from the original graph and the final operations offloading on the target. This is displayed as a sequential printout of the source relay's operations, with the composites from which they are derived and the target to which they are unloaded.
>
> Another point worth highlighting is the partitioned Relay, which is an output of --dump-code="relay", have relay operation's numbers ( %...) different from those in the initial Relay. Therefore, the new knob, which keeps the initial Relay's numbers, can be handy
>
> Here is an example output with the new option:
>
> 'ethos-u    <- ethos-u.qnn_conv2d'

I see your point. Here are some points to consider:

  1. print_operators_offloading would print the mapping of microNPU operators to the original Relay operators. Each target that uses partitioning has its own way of defining these mappings. Some of them make use of MergeCompilerRegions, which clubs multiple operators into a single partitioned function. Would it be possible to support additional targets, given that the knob name implies generic support in TVM?
  2. Another thing in pipe for better debug from frontends down to TIR is Compiler Explorer: [Tracking Issue] TVM Explorer Infrastructure #13116. At some point in future, this should provide more transparency to the operator mappings.
  3. In the meantime, with a little education for the end user of the microNPU, learning to interpret partitioned functions could be made easy. Just a suggestion: this knowledge could be shared via TVM docs and/or the microNPU demo.
  4. Another reason for thinking twice is that this knob changes the output from TVM. We should seek more opinions from community maybe?

@arina-grovety (Contributor) commented

> Another reason for thinking twice is that this knob changes the output from TVM. We should seek more opinions from community maybe?

Can you please clarify what you mean by "the output from TVM"?

@ashutosh-arm (Contributor) commented Nov 11, 2022

> Can you please clarify what do you mean by "the output from TVM"?

Sorry for the wrong wording. With this knob, TVM will produce a new debug output, which would set an example for other backends. So my suggestion was to discuss this upfront on the discuss forum.

@lhutton1 (Contributor) commented Nov 11, 2022

> The option for printing the operators currently seems very specific to the NPU, I'm wondering if we would see more benefit adding this as a generic option within TVMC without too many changes? Not only would it benefit other targets, it would make the option more robust and easier to find from a user POV. Its currently possible to save the partitioned graph in TVMC using --dump-code="relay", perhaps print_operators_offloading could be called at a similar point (given a command line argument such as --dump-offloads) rather than from within the NPU specific code, WDYT?
>
> Hi @lhutton1,
>
> Do you propose to implement this function for all the targets? Or just add a general compiler option leaving the implementation currently only in the ethos-u backend? Right now, this function is specific to ethos-u and is handled in the ethos-u backend. As far as I'm concerned it won't be a problem to implement the function for all targets, but of course I could be wrong.
>
> Here is an example how the output would look like if model is compiled for the target "llvm":
>
>    'generic    <-   %0 = qnn.conv2d(%tfl.quantize, %v_param_1, ...'
>    'generic    <-   %1 = nn.bias_add(%0, %v_param_2, axis=3);'
>    'generic    <-   %2 = qnn.requantize(%1, meta[relay.Constant]...'
>
> And for targets "ethos-u,cmsis-nn,c"
>
>     'ethos-u    <- ethos-u.qnn_conv2d'
>     'ethos-u    <-        %204 = qnn.conv2d(%203, %v_param_105, -128, 0, 0.0235294f, ...'
>     'ethos-u    <-        %205 = nn.bias_add(%204, %v_param_106, axis=3);'
>     'ethos-u    <-        %206 = qnn.requantize(%205, meta[relay.Constant][105], 0, ...'
>     'ethos-u    <- ethos-u.reshape'
>     'ethos-u    <-        %207 = reshape(%206, newshape=[1, 1001]);'
>     'cmsis-nn   <- cmsis-nn.qnn_softmax'
>     'cmsis-nn   <-        %208 = qnn.dequantize(%207, 0.0775722f, -61);'
>     'cmsis-nn   <-        %209 = nn.softmax(%208);'
>     'cmsis-nn   <-        qnn.quantize(%209, 0.00390625f, -128, out_dtype="int8")'

Thanks for the explanation @arina-grovety, yes I was thinking other backends like CMSIS-NN could make use of the same approach since AnalyzeOperationsDistribution already seems quite generic. Where possible we can add the composite function names, and if they are not found we can fall back to just printing the Relay, exactly as you described. If another backend has a different method of offloading operations, this could simply be added to the pass in the future as and when needed.
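The composite-name-with-fallback display discussed here can be sketched in a few lines. This is an illustrative sketch only, not code from the PR; the function name `format_offloads` and the (target, composite, relay_line) input shape are assumptions:

```python
def format_offloads(ops):
    """Render a dump in the style shown in the examples: a composite
    header line followed by its indented member operators, falling back
    to a plain line when an operator has no composite name recorded.

    `ops` is an assumed list of (target, composite_or_None, relay_line).
    """
    out, last_composite = [], None
    for target, composite, relay_line in ops:
        if composite and composite != last_composite:
            # Emit the composite header once per run of member operators.
            out.append(f"{target:<10} <- {composite}")
        last_composite = composite
        indent = "       " if composite else ""
        out.append(f"{target:<10} <- {indent}{relay_line}")
    return "\n".join(out)
```

With no composite names available (e.g. a plain "llvm" compile), every line takes the un-indented fallback form, matching the "generic" example above.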

I see @ashutosh-arm's point that this feature intersects with the work in #13116, perhaps it would be useful to have a discussion with the authors to align on expectations. After a quick look I don’t believe this work takes into account offloaded operations, but I could be wrong. Just like #13116 it would be great to see the ethos-u.qnn_conv2d annotations relate back to the input graph format (e.g. in TFLite: CONV2D) to make it easy for the user to relate their compiled operations to their original graph, but this seems a bit involved for now.

> print_operators_offloading would print the mapping of MicroNPU operators to the original Relay operators. Each target that uses partitioning has its own ways of defining these mappings. Some of them make use of MergeCompilerRegions that clubs multiple operators into a single partitioned function. Would it be possible to support additional targets given the knob name implies generic support in TVM?

@ashutosh-arm, I think this is okay as operators will still be wrapped in their respective 'composite' function where the "ethos-u.qnn_conv2d" name is stored. Supporting other methods of operator offloading such as https://github.com/apache/tvm/blob/main/python/tvm/relay/op/contrib/ethosn.py#L451 I feel are out of scope for this work for the time being. I agree though that we should get the community opinion on this before making such a change, as you rightly mention the alternative is to educate the user how to read the partitioned Relay graph in the tutorial.

@sergio-grovety force-pushed the tvmc-ethosu-dump-npu-functions-coverage-option branch from 623aa1f to e9e8a68 on November 28, 2022
@arina-grovety (Contributor) commented

> Thanks for the explanation @arina-grovety, yes I was thinking other backends like CMSIS-NN could make use of the same approach since the AnalyzeOperationsDistribution already seems quite generic. Where possible we can add the composite function names and if they are not found we can fallback to just printing the Relay - exactly as you described. If another backend has a different method of offloading operations this could simply be added to the pass in the future as and when needed.

Hello @lhutton1, we have pushed an update to the PR; the option "--target-ethos-u-dump_npu_functions_coverage" has been replaced by the more generic "--dump-offloads" with the same meaning.

@lhutton1 (Contributor) commented

Thanks for the updates @sergey-grovety @arina-grovety, looks great! I started reviewing this evening but didn't fully get through it; I will pick up where I left off tomorrow.

@lhutton1 (Contributor) left a comment

Apologies for the delay. I left some comments below, see what you think. Thanks for the updates, it's looking much better!

Outdated review comments (resolved) on: python/tvm/driver/tvmc/compiler.py (3), python/tvm/relay/analysis/operations_distribution.py, python/tvm/relay/frontend/common.py
Excerpt under review (docstring fragment):

    ----------
    mod : tvm.ir.IRModule
        The IRModule that gets generated from a relay frontend.
    initial_relay_astext : list
@lhutton1 (Contributor) commented:

Perhaps I missed it, what's the reason for parsing the initial Relay as a string, rather than traversing a copy of the IRModule?


Hi @lhutton1, this was done to avoid copying entities that we consider unnecessary, since only the text representation of the Relay is used in this function.

@lhutton1 (Contributor) commented:

I was thinking that traversing the Relay IR itself here might simplify the logic below and decouple the implementation from the textual representation making it more robust to changes in the future. It seems like it would also remove the need to make changes such as https://github.com/apache/tvm/pull/13212/files#diff-237c52e4e68362990738b47cc97c81b5c84ec92dfbcb672e961f0e9887f436c0R378 which might require more motivation from the community. WDYT?

cc @ashutosh-arm @ekalda in case you have any other suggestions

@ashutosh-arm (Contributor) commented:

I would suggest the same thing as @lhutton1 did above. The text representation changes quite often. It is better to rely on the information available inside the module object and extract it using, let's say, ExprVisitor.


Hello @ashutosh-arm, sorry, there is an out-of-date comment here.

We now pass the initial Relay as Relay IR itself, then use the "annotate" parameter of the astext() function to add the desired annotations to the generated text, and then parse our annotations from the formed text.

I will fix the comment string in the update to the PR.

@lhutton1 (Contributor) commented:

Apologies if there has been some confusion here; this question was more about the need to search over the Relay IR as text for the compiler name, op name, func id, etc. The information could be extracted using a visitor pass (ExprVisitor) that traverses the IR, making it more resilient to changes in the text format of the IR. Since this method is working, and to move this forward, we can pull this out into a separate follow-up.
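The visitor-based alternative suggested here can be sketched without TVM installed. In this toy sketch the `Call` class is a hypothetical stand-in for a Relay call node (not TVM's class), and `OffloadCollector` mimics what a tvm.relay.ExprVisitor subclass would do: walk the IR and collect (composite, op) pairs directly, instead of parsing the printed text:

```python
class Call:
    """Minimal stand-in for a Relay call node (illustrative, not TVM's)."""
    def __init__(self, op, args=(), composite=None):
        self.op = op                # operator name, e.g. "qnn.conv2d"
        self.args = list(args)      # child call nodes
        self.composite = composite  # composite function name, if any

class OffloadCollector:
    """ExprVisitor-style traversal: record (composite, op) pairs in
    post-order, so operators appear in dataflow order."""
    def __init__(self):
        self.ops = []

    def visit(self, node):
        for arg in node.args:
            self.visit(arg)
        self.ops.append((node.composite, node.op))

# A tiny graph mirroring the conv2d -> bias_add -> reshape examples above.
conv = Call("qnn.conv2d", composite="ethos-u.qnn_conv2d")
bias = Call("nn.bias_add", [conv], composite="ethos-u.qnn_conv2d")
resh = Call("reshape", [bias], composite="ethos-u.reshape")

collector = OffloadCollector()
collector.visit(resh)
```

Because the information is read from node attributes rather than scraped from astext() output, changes to the IR's text format would not break the collection logic.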

tests/python/driver/tvmc/test_compiler.py (outdated review comment, resolved)
@sergio-grovety force-pushed the tvmc-ethosu-dump-npu-functions-coverage-option branch 4 times, most recently from 44de505 to 325b2ef on January 23, 2023
@lhutton1 (Contributor) left a comment

Apologies for the delay, and thanks for the reminder. I think it's getting close; thanks for the hard work on this. My biggest concern still lies in extracting the relevant information from the textual representation of the Relay, as it seems a bit fragile. Is there a reason for doing it this way? Otherwise LGTM!

Outdated review comments (resolved) on: python/tvm/driver/tvmc/compiler.py, python/tvm/relay/frontend/common.py
@sergio-grovety force-pushed the tvmc-ethosu-dump-npu-functions-coverage-option branch 2 times, most recently from 68ede35 to d603dc9 on March 17, 2023
@sergio-grovety force-pushed the tvmc-ethosu-dump-npu-functions-coverage-option branch from d603dc9 to afe244b on March 19, 2023
@sergio-grovety (Contributor, Author) commented

@tvm-bot rerun

@sergio-grovety sergio-grovety requested review from chunit-quic and lhutton1 and removed request for chunit-quic and lhutton1 March 20, 2023 13:27
@chunit-quic (Contributor) left a comment

The span suffix part looks good to me. Thanks for the help. :D

@lhutton1 (Contributor) left a comment

Thanks for the updates on this @arina-grovety, @sergio-grovety. I just had one nit which I think was missed previously, otherwise LGTM. Thanks for the support with the spans @chunit-quic!

Commit: …perations_distribution.py. Fix tflite import in tests/python/contrib/test_ethosu/infra.py
@arina-grovety (Contributor) commented

@tvm-bot rerun

@lhutton1 (Contributor) commented

@tvm-bot rerun

@arina-grovety (Contributor) commented

@tvm-bot rerun

Hello @lhutton1, thank you!

@sergio-grovety sergio-grovety requested review from lhutton1 and removed request for lhutton1 March 27, 2023 13:01
@lhutton1 (Contributor) left a comment

LGTM!

@lhutton1 lhutton1 merged commit da83353 into apache:main Mar 27, 2023
@lhutton1 (Contributor) commented

Thanks @sergio-grovety @arina-grovety @chunit-quic @ashutosh-arm! This will be very helpful for users wanting to see how their models were offloaded, thanks for persisting with all the changes!

6 participants