[TVMC][TRANSFORMS] ToMixedPrecision transform support with custom options enabled #14010
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot
Thanks @srkreddy1238, this looks very interesting. Can you improve the PR description with an example command showing how it is used?
Can you also add a few tests to make sure this doesn't regress in the future? Thanks.
@srkreddy1238 is there a reason you're not doing this via something like
@Mousius thanks for the review. Initially I thought of this approach where I can create an external codegen and extend the
This way tvmc calls the pass_pipeline and we can do whatever we want over the module. These hooks are essentially small personalizations over existing transforms; in this example we customize the mixed-precision pass for Adreno. Essentially we are helping developers with simple, reusable utilities. These hooks are optional features on the existing target and may grow over time. Creating a new target for each such feature results in multiple near-identical target definitions, so I thought hooks might be a better way to keep them away from the existing real external codegens. What do you think?
I think there are two different thoughts: rather than wrapping this in
Else, do you agree it's a bit odd to essentially have to build your ML pipeline in CLI arguments? The other thing is that you mentioned this being a reusable utility; I like that approach more, and I'm curious whether we can add a
Adreno currently shares
I like the idea of tvmc directly supporting
Not sure about
Coming to post-build hooks, the intention here is to accommodate altering the compiled library module. One of the use cases is importing additional modules before export/save. Any thoughts here?
Another thought here: how about using packed functions instead of an inventory of hooks? The hook is defined in the user application or the respective contrib module as
Invoked as
This can be a good starting point where
The advice I'd give here is that we don't always need a generic solution to a specific problem; the
What is the use case for this? I can't think of a use case for generically bundling things into the eventual output unless I'm missing something; specific
Sounds a bit dangerous to me. Generically calling a packed function against a module also means you have access to a plethora of other registered functions - do we want to expose them all to the audience of
To implement the complete functionality of mixed precision we need to pass the list of ops to be converted to mixed precision, the precision type, and other configurable options for the mixed-precision pass. All of these are use-case dependent. If we are good with enabling the mixed-precision pass with its configurable options, I am good to go this way.
My use case is about embedding an additional tuning cache and target-precompiled OpenCL binary source into the module itself. The tuning cache and clBinary blobs will be generated by running the compiled modules on a real device via RPC. I don't think we currently encourage this from the core compilation flow itself. IMHO, such first-time requirements should be enabled outside the core compiler and absorbed later if more similar use cases appear.
From a newbie point of view, option
We are talking about only two options
As I see it, we have two perspectives here to consider. One being
I think this makes sense, a naive first attempt is something like this?
That definitely looks specific to OpenCL and loading precompiled OpenCL? Are all of these modules generated ahead of TVM compilation? I don't think it necessarily has to be deep in the core compiler, but it would make sense for such options to live near
One of the reasons
I don't think we're disagreeing on the way forward for mixed precision though? Shall we try to get that resolved and look at the OpenCL blobs at the same time?
Thanks for your time on this @Mousius. Yes, we are good with that. The OpenCL blobs feature will involve major changes in the TVM compiler, and the TVMC changes are the cherry on top. We can revisit it with an RFC and more details.
…ions enabled
Adds new command line options --mixed-precision, --mixed-precision-ops, --mixed-precision-input, --mixed-precision-output, and --desired-layout-ops. This PR also enhances the Python interface by replacing alter_layout with transform_args. transform_args is a dict with all transform-related options, including the existing desired_layout or alter_layout option.
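As a rough illustration of the described interface change, transform_args might be assembled along these lines (a sketch only; the key names and defaults mirror the option names in this PR and may not match the merged implementation exactly):

```python
# Illustrative shape of the transform_args dict described above: all
# transform-related options collected into one dict instead of the
# single alter_layout argument. Key names and defaults are assumptions.

def build_transform_args(args):
    """Collect all transform-related options into a single dict."""
    return {
        "desired_layout": args.get("desired_layout"),  # was alter_layout
        "desired_layout_ops": args.get("desired_layout_ops"),
        "mixed_precision": args.get("mixed_precision", False),
        "mixed_precision_ops": args.get("mixed_precision_ops", []),
        "mixed_precision_input": args.get("mixed_precision_input", "float16"),
        "mixed_precision_output": args.get("mixed_precision_output", "float16"),
    }

transform_args = build_transform_args({
    "mixed_precision": True,
    "mixed_precision_ops": ["nn.conv2d", "nn.dense"],
    "mixed_precision_output": "float32",
})
print(transform_args["mixed_precision_output"])  # float32
```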
Probably I missed the initial proposal, and I don't feel that I fully understand the proposal.
If we add such a feature dedicated only to Adreno, we will have to extend it to each of the above targets. That would be very undesirable.
@elvin-n The discussion on this PR had two contexts a while back: pre-compile hooks (primarily to support FP16) and a post-build hook (to allow additional processing on the compiled module - specifically, importing additional modules like OpenCL cache/binary programs, etc.). Finally, we decided to confine this PR to discussing FP16 support, and now the proposal is to support it generically for any target (which is in line with your recommendation :)).
This looks really cool @srkreddy1238 😸 hopefully only a few things to resolve and we can get this feature in!
I think this is fine, pending linting - @leandron could you take another look?
@leandron can you take a look at this? I have some dependencies on it for TVMCon next week.
Comments addressed and sufficient time given for re-review; should there be an issue, let me know :-)
This aims to make the `--desired-layout` argument more powerful, based on the previously merged changes from #14010, by introducing two new features:

1. Allow passing multiple arguments to `--desired-layout` instead of only one, to specify one layout per transformed operator given in `--desired-layout-ops`. (The number of arguments has to be either 1 or match the number of transformed operators.)
2. Optionally, you can now specify a non-default kernel layout as follows: `NHWC:HWIO`

Example usage: `tvmc compile … --desired-layout NCHW NHWC:HWIO --desired-layout-ops nn.max_pool2d qnn.conv2d`

I also added unit tests for the new use cases.

### Known Limitations:
* It would make sense to specify individual kernel layouts for regular convolutions and depthwise ones. However, since both are usually implemented as a generalized `nn.conv2d`, we cannot transform them individually. Are there any good workarounds for this?
* The arguments of `--desired-layout` have previously been checked for validity during cmdline parsing (e.g. only NCHW and NHWC are allowed), which is not possible anymore. Should I add a regular expression for that?
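The `DATA[:KERNEL]` layout syntax described above could be parsed along these lines (a hypothetical helper, including the kind of regular-expression validity check the author asks about; the pattern is a guess at a reasonable rule, not TVMC's actual one):

```python
import re

# Hypothetical parser for DATA[:KERNEL] layout specs such as "NCHW" or
# "NHWC:HWIO". The validity pattern below is an assumed check, not the
# one TVMC uses during cmdline parsing.
_LAYOUT_RE = re.compile(r"^[A-Za-z]+(:[A-Za-z]+)?$")

def parse_desired_layout(spec):
    """Split a layout spec into (data_layout, kernel_layout or None)."""
    if not _LAYOUT_RE.match(spec):
        raise ValueError(f"invalid layout spec: {spec!r}")
    data, _, kernel = spec.partition(":")
    return data, kernel or None

print(parse_desired_layout("NCHW"))       # ('NCHW', None)
print(parse_desired_layout("NHWC:HWIO"))  # ('NHWC', 'HWIO')
```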
Adds new command line options:
* `--mixed-precision` - Enable mixed precision conversion
* `--mixed-precision-ops` - List of operators to be converted to mixed precision
* `--mixed-precision-calculation-type` - Calculation precision type
* `--mixed-precision-acc-type` - Accumulator precision type

And:
* `--desired-layout-ops` - The list of operators to be transformed with the desired layout.
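Taken together, a minimal sketch of how these options might drive a per-op dtype decision (pure Python, no TVM; the decision rule is illustrative of the flags' intent, not TVM's actual ToMixedPrecision logic):

```python
# Illustrative mapping from the options above to per-op dtypes: ops
# listed in --mixed-precision-ops compute in the calculation type and
# accumulate in the accumulator type; everything else stays float32.
# This mirrors the intent of the flags, not the real pass internals.

def pick_dtypes(op_name, mixed_precision_ops,
                calculation_type="float16", acc_type="float32"):
    if op_name in mixed_precision_ops:
        return {"calc": calculation_type, "acc": acc_type}
    return {"calc": "float32", "acc": "float32"}

ops = ["nn.conv2d", "nn.dense"]
print(pick_dtypes("nn.conv2d", ops))  # {'calc': 'float16', 'acc': 'float32'}
print(pick_dtypes("nn.relu", ops))    # {'calc': 'float32', 'acc': 'float32'}
```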