[Topi & Relay] Add quantization support for the vision transform model in GPU #7814

huochaitiantang · 2021-04-09T11:34:25Z

We submit this PR to add quantization support for the Vision Transformer (ViT) model on GPU. The main changes are as follows:

1. In the ViT model, the most time-consuming operators are batch_matmul, so we first add the compute and schedule of batch_matmul_int8.cuda in tvm.topi.cuda.

2. To support quantization of batch_matmul, we then add batch_matmul_rewrite and BatchMatmulRealize in tvm.relay.quantize.

3. KL-divergence calibration does not preserve the accuracy of the ViT model well, so we add the _percentile_scale function.
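To illustrate items 1 and 3, here is a minimal NumPy sketch of percentile-based scale selection followed by an int8 batch matmul with int32 accumulation. It is not the code added by this PR: the helper names, the percentile value, and the symmetric int8 scheme are assumptions chosen for illustration, and the `[B, M, K] x [B, N, K]` layout is assumed to match the usual batch_matmul convention.

```python
import numpy as np

def percentile_scale(samples, percentile=99.99, num_bits=8):
    """Hypothetical helper: map a percentile of |samples| to the int8 maximum."""
    threshold = np.percentile(np.abs(samples), percentile)
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    return float(threshold) / qmax

def quantize(x, scale):
    """Symmetric quantization: round to nearest, then clip into [-127, 127]."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def batch_matmul_int8(x_q, y_q):
    """x_q: [B, M, K], y_q: [B, N, K] -> [B, M, N], accumulated in int32."""
    return np.einsum("bmk,bnk->bmn", x_q.astype(np.int32), y_q.astype(np.int32))

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 16)).astype(np.float32)
y = rng.standard_normal((2, 5, 16)).astype(np.float32)
x[0, 0, 0] = 50.0  # a single outlier that would dominate a max-based scale

# Assumed percentile of 99.0 so the outlier is clipped instead of setting the scale.
sx = percentile_scale(x, percentile=99.0)
sy = percentile_scale(y, percentile=99.0)

acc = batch_matmul_int8(quantize(x, sx), quantize(y, sy))  # int32 accumulator
out = acc.astype(np.float32) * (sx * sy)                   # dequantized result
```

With a max-based scale, the outlier alone would stretch the quantization range and leave few int8 levels for the remaining values; the percentile threshold trades clipping of rare values for finer resolution elsewhere.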

For the vit-B32-224 model, the performance is as follows:

Top-1 accuracy on the ImageNet validation set (%):
- paper: 73.38
- nonofficial-model-fp32: 73.27
- nonofficial-model-int8: 72.78

Latency on a GTX 1660 GPU:
- nonofficial-model-fp32: 10.32 ms
- nonofficial-model-int8: 4.93 ms
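For a quick reading of these numbers, the short calculation below derives the speedup and the accuracy drop from the figures reported above. The variable names are illustrative only.

```python
# Figures copied from the results reported in this PR description.
fp32_latency_ms = 10.32
int8_latency_ms = 4.93
fp32_top1 = 73.27
int8_top1 = 72.78

speedup = fp32_latency_ms / int8_latency_ms  # how many times faster int8 runs
top1_drop = fp32_top1 - int8_top1            # accuracy lost, in percentage points

print(f"speedup: {speedup:.2f}x, top-1 drop: {top1_drop:.2f} points")
# prints "speedup: 2.09x, top-1 drop: 0.49 points"
```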

Thanks for your review! @jcf94 @tqchen

XHPlus · 2021-04-09T11:51:53Z

Thanks to the reviewers. We will keep adding results for other ViT models and contribute more quantization calibration algorithms.

jcf94

Thanks! @huochaitiantang
This PR looks great! I have only a few comments.

python/tvm/relay/op/strategy/cuda.py

jcf94 · 2021-04-11T07:59:59Z

tests/python/nightly/quantization/test_quantization_accuracy_for_vit.py

+    if not os.path.exists(logfile):
+        os.system("wget https://github.com/TheGreatCold/tvm-vit/raw/master/{}".format(logfile))
+    if not os.path.exists(onnx_path):
+        os.system("wget https://github.com/TheGreatCold/tvm-vit/raw/master/{}".format(onnx_path))


As a unit test, I'm thinking it may not be a good idea to depend on an outside resource. (Network problems or changes to the tvm-vit repo may break the UT. At least using a git commit instead of a branch, like https://github.com/TheGreatCold/tvm-vit/raw/d2aa1e60eef42e2fdedbd1e13aa85ac5faf0a7fc/vit_B32_224.onnx, would be better.)
I'm not sure if there's a better solution for this. @tqchen Do you have any suggestions?

huochaitiantang · 2021-04-12

Thanks for your review. We updated the download code based on your suggestion. In addition, wget is not portable across platforms, so we use the urllib library instead.
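The approach discussed here (a commit-pinned URL fetched with urllib rather than wget) could look roughly like the sketch below. It is not the code merged in this PR: `pinned_raw_url` and `download_if_missing` are hypothetical helper names, and the commit hash is the one suggested in the review comment above.

```python
import os
import urllib.request

def pinned_raw_url(repo, commit, filename):
    """Build a raw GitHub URL that points at a fixed commit, not a moving branch."""
    return "https://github.com/{}/raw/{}/{}".format(repo, commit, filename)

def download_if_missing(url, path):
    """Fetch url to path with urllib only when the file is not already present."""
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path

onnx_url = pinned_raw_url(
    "TheGreatCold/tvm-vit",
    "d2aa1e60eef42e2fdedbd1e13aa85ac5faf0a7fc",  # commit from the review comment
    "vit_B32_224.onnx",
)

# Left commented out because it needs network access:
# download_if_missing(onnx_url, "vit_B32_224.onnx")
```

Pinning to a commit keeps the test input stable even if the external repository changes later, although the test still depends on network availability.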

jcf94

Thanks! @huochaitiantang @XHPlus. Have fun with TVM. Looking forward to seeing more contributions from you!

jcf94 requested changes Apr 11, 2021

huochaitiantang added 5 commits April 12, 2021 09:46

Add cuda batch matmul int8 support for quantized vit model

202bf75

Fix for combine parallel pass with dense and batch_matmul

8191e59

Reformat based on lint

762b258

Add plevel & update the file download method

8e9f7b2

Merge branch 'main' into batch_matmul_quantize

79a8559

jcf94 approved these changes Apr 13, 2021

jcf94 merged commit 90dce48 into apache:main Apr 14, 2021

wyc-ruiker mentioned this pull request Jul 5, 2021

[CUDA] dense_tensorcore/batch_matmul_tensorcore support int8/int4 #8402

Merged

junrushao mentioned this pull request Nov 1, 2021

Apache TVM v0.8 Release Note Candidate #9416

Closed
