[Torch, QNN] Add support for quantized models via QNN #4977

masahi · 2020-03-02T05:18:31Z

This adds support for converting quantized models from torch to Relay via QNN. I believe this makes a good case for introducing Torch frontend to TVM, since the quantized Torch models -> ONNX path is not supported at the moment (or any time soon). Also torch cannot run quantized models on GPU yet, but we can run quantized models on CUDA very fast.

So far I've tested on quantized models from torchvision available at https://github.com/pytorch/vision/tree/master/torchvision/models/quantization (shufflenet is removed since it has issues with tracing), and a custom mobilenet v3 model.

Per channel quantization is fully supported, and it is a big win on mobilenet v3 (more than 10 point accuracy on imagenet). I think even TFLite or MXNet + QNN doesn't support per channel quantization end to end.

The PR contains

The converter to QNN
Tests for individual quantized modules
Tests on pretrained imagenet models with dummy calibration

Verifying the result is tricky due to difference in numerics between torch and tvm.

Quantized modules tests are run but there is no assertion on how close the raw floating point output is to torch. I dumped some statistics that we can eye ball and verify that the result is reasonable.

For imagenet tests, I use the real image and verify that two top3 label sets are identical. Here, I'm ignoring the order among top3, but for the 4 models that I added in the test, the order is actually correct. The order is ignored to make CI less fragile.

Below is the result on 1k subset of imagenet val data with full calibration + per channel quantization. I've prepared a repo to do the same evaluation here. You can also use the full imagenet dataset, if you have an access to it.

model name	Torch-Top1	Torch-Top5	TVM-Top1	TVM-Top5
resnet18	69.7	90.2	70.0	90.1
resnet50	78.1	94.1	77.7	93.7
inception_v3	69.8	89.5	68.9	89.2
googlenet	71.8	90.5	71.6	90.6
mobilenet_v2	69.9	89.4	71.0	89.4
mobilenet_v3	63.5	83.6	62.7	83.2

cc @anijain2305 @vinx13 @FrozenGene @jwfromm @alexwong @ajtulloch @hlu1 @yinghai

masahi · 2020-03-02T08:50:38Z

Test passed!!

jwfromm · 2020-03-02T17:53:59Z

tests/python/frontend/pytorch/qnn_test.py

+        mean_abs_diff = np.mean(np.abs(tvm_result - pt_result))
+        num_identical = np.sum(tvm_result == pt_result)
+
+        print("\nModel name: %s" % model_name)


Having some sort of assert here instead of prints would be nice. Is there some reasonable upper bound on the diffs that we could check?

Yeah, this is my pain point :) I added a sample output in a6239e1

The problem is how close the raw floating point output is to torch differs widely among different network and which of per tensor/per channel quantization is used. I could add something like max_abs_diff < 2.5, but I don't think that is too helpful. I think argmax is quite reasonable, though.

I believe there is a room for improvement in terms of making the result closer to torch. Also quantization in general is WIP on torch as well. There are some quantized ops that piggy backs to fp32 by dequantize -> fp32 op -> quantize, which defeats the purpose of doing quantization (going faster than fp32). We may need to talk to torch folks on this.

I want to say this is an initial support for torch quantization by someone who is new to the quantization space, and that by making it public I am hoping that people can improve on it :)

cc @anijain2305

anijain2305

Overall LGTM with minor comments. I think it might be helpful to add little more comments. Just a soft suggestion, feel free to ignore.

python/tvm/relay/frontend/pytorch.py

python/tvm/relay/frontend/qnn_torch.py

masahi · 2020-03-02T21:14:08Z

@anijain2305 Where do you want to see more comments? Happy to add them.

I think the most non-obvious aspect of the code is the need to do some surgery on an input graph, to make all input and output quant params visible on the op inputs. Is that part clear?

masahi · 2020-03-02T22:57:34Z

@anijain2305 Added more comments, I think it is enough for now.

anijain2305

LGTM

anijain2305 · 2020-03-02T23:45:40Z

Will wait for a day or two for others to review.

masahi · 2020-03-04T18:32:05Z

@anijain2305 I think it is a good time to land it.

anijain2305 · 2020-03-04T19:25:16Z

Thanks @masahi @jwfromm This is merged!

masahi · 2020-03-04T20:37:14Z

Hmm for some reason, this commit became @anijain2305' commit.
https://github.com/apache/incubator-tvm/commits?author=anijain2305

Also see the commit log. It seems the way github thinks about who is the commit author has changed since today. Have we changed some settings or is this github bug? @tqchen

greysteil · 2020-03-05T15:59:24Z

FYI, this issue (the change in commit author) got escalated to me at GitHub. We have a bug in our squash and merge logic right now (introduced yesterday) which causes the original PR author to be removed from the list of commit co-authors in some cases. We're working on a fix now.

…#4977)" This reverts commit fc7f078.

tqchen · 2020-03-09T20:13:51Z

This PR is among one of the PRs affected by the github squash commit bug. We take every contribution serious in the TVM community. The community has decided to use revert/redo approach to amend the contributions as per #5015

#5013) This reverts commit fc7f078.

* qnn support initial import * fix upsampling num input * imagenet tests added * add qunatized module tests * quantized module tests working * imagenet test working * fix lint * remove top level torch import to fix ci error * disable lint warning on outside toplevel import * revert parse -> convert change * add comments to qnn translation * address comments, add sample outputs * add more comments * refactor bias add and requantize step

…#4977)" (apache#5013) This reverts commit fc7f078.

* qnn support initial import * fix upsampling num input * imagenet tests added * add qunatized module tests * quantized module tests working * imagenet test working * fix lint * remove top level torch import to fix ci error * disable lint warning on outside toplevel import * revert parse -> convert change * add comments to qnn translation * address comments, add sample outputs * add more comments * refactor bias add and requantize step

…#4977)" (apache#5013) This reverts commit fc7f078.

masahi added 7 commits March 1, 2020 21:00

qnn support initial import

29e8644

fix upsampling num input

e7df8ec

imagenet tests added

f5a319f

add qunatized module tests

93f374a

quantized module tests working

0615ca6

imagenet test working

da89492

fix lint

0daf4ca

masahi force-pushed the torch-qnn branch from 5be737e to 0daf4ca Compare March 2, 2020 05:28

masahi added 3 commits March 2, 2020 14:36

remove top level torch import to fix ci error

6a99ec9

disable lint warning on outside toplevel import

96ccb37

revert parse -> convert change

75c1200

add comments to qnn translation

9c9556d

masahi assigned anijain2305 Mar 2, 2020

jwfromm approved these changes Mar 2, 2020

View reviewed changes

anijain2305 reviewed Mar 2, 2020

View reviewed changes

python/tvm/relay/frontend/pytorch.py Outdated Show resolved Hide resolved

python/tvm/relay/frontend/qnn_torch.py Outdated Show resolved Hide resolved

address comments, add sample outputs

a6239e1

masahi force-pushed the torch-qnn branch from 7b9664a to a6239e1 Compare March 2, 2020 21:55

add more comments

d33ac99

anijain2305 approved these changes Mar 2, 2020

View reviewed changes

refactor bias add and requantize step

80b1a61

anijain2305 merged commit fc7f078 into apache:master Mar 4, 2020

anijain2305 added a commit to anijain2305/tvm that referenced this pull request Mar 9, 2020

Revert "[Torch, QNN] Add support for quantized models via QNN (apache…

f1feda8

…#4977)" This reverts commit fc7f078.

tqchen mentioned this pull request Mar 9, 2020

[DEV] Amend Github Attribution Bug #5015

Closed

5 tasks

anijain2305 added a commit that referenced this pull request Mar 9, 2020

Revert "[Torch, QNN] Add support for quantized models via QNN (#4977)" (

f346c60

#5013) This reverts commit fc7f078.

masahi mentioned this pull request Mar 16, 2020

All quantized ops should take scalar_type as an argument pytorch/pytorch#34351

Closed

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020

Revert "[Torch, QNN] Add support for quantized models via QNN (apache…

2113b9e

…#4977)" (apache#5013) This reverts commit fc7f078.

zhiics pushed a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020

Revert "[Torch, QNN] Add support for quantized models via QNN (apache…

04d9310

…#4977)" (apache#5013) This reverts commit fc7f078.

ZihengJiang mentioned this pull request Sep 17, 2020

TVM v0.7 Release Note Candidate #6486

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Torch, QNN] Add support for quantized models via QNN #4977

[Torch, QNN] Add support for quantized models via QNN #4977

masahi commented Mar 2, 2020 •

edited

Loading

masahi commented Mar 2, 2020

jwfromm Mar 2, 2020

masahi Mar 2, 2020 •

edited

Loading

anijain2305 left a comment

masahi commented Mar 2, 2020

masahi commented Mar 2, 2020

anijain2305 left a comment

anijain2305 commented Mar 2, 2020

masahi commented Mar 4, 2020

anijain2305 commented Mar 4, 2020

masahi commented Mar 4, 2020 •

edited

Loading

greysteil commented Mar 5, 2020

tqchen commented Mar 9, 2020

[Torch, QNN] Add support for quantized models via QNN #4977

[Torch, QNN] Add support for quantized models via QNN #4977

Conversation

masahi commented Mar 2, 2020 • edited Loading

masahi commented Mar 2, 2020

jwfromm Mar 2, 2020

Choose a reason for hiding this comment

masahi Mar 2, 2020 • edited Loading

Choose a reason for hiding this comment

anijain2305 left a comment

Choose a reason for hiding this comment

masahi commented Mar 2, 2020

masahi commented Mar 2, 2020

anijain2305 left a comment

Choose a reason for hiding this comment

anijain2305 commented Mar 2, 2020

masahi commented Mar 4, 2020

anijain2305 commented Mar 4, 2020

masahi commented Mar 4, 2020 • edited Loading

greysteil commented Mar 5, 2020

tqchen commented Mar 9, 2020

masahi commented Mar 2, 2020 •

edited

Loading

masahi Mar 2, 2020 •

edited

Loading

masahi commented Mar 4, 2020 •

edited

Loading