[ConvertLayout] Support QNN ops. #5066

Merged: 4 commits, Mar 19, 2020


anijain2305
Contributor

The recently introduced op strategy has disabled conversion from NHWC to NCHW in AlterOpLayout (which is the correct thing to do). We can solve this problem by calling ConvertLayout in the parser if needed. However, this only works for FP32 models.

For quantized models, the parsers produce a QNN graph, and this QNN graph goes to relay.build. Internally, relay.build calls the QNN Legalize passes to convert the graph to Relay-only ops. The problem is that ConvertLayout does not work on QNN ops. Therefore, even if we call ConvertLayout after the parser, the layouts will not change.

This PR implements ConvertLayout for QNN ops. In addition, I have changed the interface of FInferCorrectLayout to take an array of Relay types instead of shapes. This helps in operators like Concatenate, where we need to know the number of input data tensors.
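To illustrate why types are more convenient than shapes here, the following is a minimal, TVM-free sketch. `TensorType`, `TupleType`, `concat_input_layouts`, and `remap_axis` are hypothetical names made up for this illustration and are not the actual C++ interface. The idea is only that a tuple type carries its field count, so the number of data tensors feeding a concatenate-like op can be read off the type.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TensorType:
    """Hypothetical stand-in for a Relay tensor type."""
    shape: Tuple[int, ...]
    dtype: str


@dataclass(frozen=True)
class TupleType:
    """Hypothetical stand-in for a Relay tuple type."""
    fields: Tuple[TensorType, ...]


def concat_input_layouts(in_types: list, layout: str) -> List[str]:
    """Return one layout per data tensor of a concatenate-like op.

    in_types has one entry per op argument; the first argument is
    assumed to be a tuple holding all the data tensors.
    """
    data = in_types[0]
    if not isinstance(data, TupleType):
        raise TypeError("expected a tuple of tensors as the first argument")
    # The tuple type tells us how many data tensors there are.
    return [layout] * len(data.fields)


def remap_axis(axis: int, old_layout: str, new_layout: str) -> int:
    """Find where the dimension at `axis` in old_layout sits in new_layout."""
    axis = axis % len(old_layout)  # normalize negative axes
    return new_layout.index(old_layout[axis])


tensors = tuple(TensorType((1, 8, 8, 4), "uint8") for _ in range(3))
print(concat_input_layouts([TupleType(tensors)], "NCHW"))
print(remap_axis(3, "NHWC", "NCHW"))
```

With only a flat array of shapes, the callee would have to be told the tensor count some other way. The axis remapping is included because a concatenate along the channel axis has to follow the channel dimension when the layout changes.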

@icemelon9 @zhiics @yzhliu

@anijain2305 anijain2305 force-pushed the qnn_layout branch 4 times, most recently from 235c079 to a4c5092 Compare March 15, 2020 07:54
@anijain2305 anijain2305 marked this pull request as ready for review March 16, 2020 02:41
@anijain2305
Contributor Author

@icemelon9 @zhiics @yzhliu @tqchen

Let me know what you think about this.

Member

@yzhliu yzhliu left a comment

Overall, looks good to me.


// Fill the layouts of remaining input tensors - scales and zero points. The layouts of these
// tensors can be ignored as they don't go through any transformation.
Layout ignore_layout = Layout("I");
Member

Are they always input channel?

Contributor Author

They can be scalar, or per output channel. I initially thought of marking them as "C", but chose "I" to be more specific. I am open to discussion.

Member

Maybe "C" is better. I don't have a strong opinion, though.
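For readers following this thread, here is a rough, TVM-free sketch of the pattern under discussion. The function name `qnn_conv2d_input_layouts` is hypothetical, and the six-input ordering (data, kernel, then the zero points and scales) is an assumption about the quantized conv2d op. The data and kernel layouts are the ones that actually change, while the remaining inputs get a placeholder layout because they are never transformed.

```python
from typing import List


def qnn_conv2d_input_layouts(
    data_layout: str,
    kernel_layout: str,
    num_inputs: int = 6,
    ignore_layout: str = "C",
) -> List[str]:
    """Assign a layout to every input of a quantized conv2d-like op.

    Assumes the first two inputs are data and kernel, and that the
    rest are scales and zero points that need no layout transform.
    """
    if num_inputs < 2:
        raise ValueError("expected at least the data and kernel inputs")
    # Scales and zero points are scalars or per-channel vectors, so a
    # single-axis placeholder layout is enough for them.
    return [data_layout, kernel_layout] + [ignore_layout] * (num_inputs - 2)


print(qnn_conv2d_input_layouts("NCHW", "OIHW"))
```

Whether the placeholder is spelled "I" or "C" does not affect the transformation in this sketch; it only labels the axis of tensors that pass through untouched.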

Member

@zhiics zhiics left a comment

LGTM

@anijain2305 anijain2305 merged commit 38118be into apache:master Mar 19, 2020
@anijain2305
Contributor Author

Thanks @zhiics @yzhliu. This is merged.

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020
* [ConvertLayout] Support QNN ops.

* Changing layouts to C.

* Fixing dilation.

* Empty commit.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-53-55.us-west-2.compute.internal>
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020
shoubhik added a commit to shoubhik/incubator-tvm that referenced this pull request May 12, 2020
3 participants