[QNN] Change default rounding to UPWARD. #4131
Conversation
Thank you for the thorough evaluation @anijain2305 ! Interesting results in terms of accuracy/performance tradeoffs. Let's just check with @ZihengJiang whether this would affect the quantization nightly runs.
Impressive! Thanks for the solid data! It's very interesting that the rounding policy impacts performance so much.
Thanks. This looks great!
Thanks everyone. This is now merged.
@tmoreau89 Sorry, I didn't see your comment above. Should I revert?
This is QNN stuff, so it would not affect the automatic quantization nightly runs. This should be fine.
@zhiics no problem, no need to revert
* master: (51 commits)
  * [QNN][TFLite] Parsing QNN Add op. Adding MobilenetV2. (apache#4142)
  * [CI] Pin NNPack pthreadtools version (apache#4152)
  * Fix typo (apache#4144)
  * [Relay][Frontend][TF] Add tensor array ops (apache#3798)
  * [relay][vm] Separate VM runtime with executable (apache#4100)
  * [PATCH] Fix undefined __floatdihf in libtvmruntime.so on aarch64. (apache#4119)
  * [DOCKER] Pin torchvision==0.4.1 (apache#4140)
  * [TOPI][x86] Cascade lake support. (apache#4123)
  * [Relay] Improve build error when no lowered funcs are produced (apache#4132)
  * [RUNTIME] Refactor object python FFI to new protocol. (apache#4128)
  * Update PULL_REQUEST_TEMPLATE.md
  * Adding support for dequantizing from int32 to float32. (apache#4130)
  * [Relay][Training] Add and fix gradients (apache#4126)
  * [QNN] Change default rouning to UPWARD. (apache#4131)
  * Fix infer type of kernel in dense. (apache#4125)
  * [Relay][AlterOpLayout] NHWC to NCHWc pad operator. (apache#4103)
  * [ARITH] Fix lowering of floormod(x, y) != 0 (apache#4127)
  * [RFC][RUNTIME] Introduce new object protocol. (apache#4115)
  * [Relay][Topi] Disable conv NHWC pack int8. (apache#4038)
  * Update task_cpp_unittest.sh
  * ...
This PR changes the default rounding from TONEAREST to UPWARD. QNN ops are lowered to a sequence of Relay operators during compilation, and TONEAREST rounding leads to many more Relay operators, as shown here:
https://github.com/dmlc/tvm/blob/11a3a777ec78ad81ce12a940d07521870f546ad1/src/relay/qnn/util.cc#L113-L124
This PR makes a case for the default rounding option to be UPWARD.
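To make the operator-count argument concrete, below is a minimal Python sketch, not the actual TVM C++ lowering linked above, of the rounded right shift that requantize performs under the two modes. The only assumption is that UPWARD adds a fixed positive nudge before the shift, while TONEAREST (round half away from zero) needs a sign-dependent nudge; that sign check is what expands into extra Relay operators.

```python
# Minimal sketch (plain Python, not the TVM lowering) of a rounded
# arithmetic right shift, the step where the rounding mode matters.
def rounded_right_shift(x, shift, rounding="UPWARD"):
    if rounding == "UPWARD":
        # One constant, one add, one shift: ties round toward +infinity.
        return (x + (1 << (shift - 1))) >> shift
    if rounding == "TONEAREST":
        # Round half away from zero: the nudge depends on the sign of x,
        # so the lowered graph needs extra compare/select/constant ops.
        nudge = (1 << (shift - 1)) if x >= 0 else (1 << (shift - 1)) - 1
        return (x + nudge) >> shift
    raise ValueError(rounding)

# The two modes agree everywhere except at negative mid values:
for x in (3, 5, -3, -5):  # x / 2 -> 1.5, 2.5, -1.5, -2.5
    print(x, rounded_right_shift(x, 1, "UPWARD"), rounded_right_shift(x, 1, "TONEAREST"))
# -3 -> -1 (UPWARD) vs -2 (TONEAREST); -5 -> -2 (UPWARD) vs -3 (TONEAREST)
```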
I ran all of the TFLite-hosted quantized Inception models on a Cascade Lake machine with both rounding options. The comparison numbers below show that UPWARD rounding is the better alternative overall.
Accuracy Comparison
Accuracy is the same regardless of the rounding choice. Tested over 50k images with pre-processing.
Latency Comparison
Because TONEAREST leads to many more operators, the observed latency is higher.
Compilation Time Comparison
TONEAREST has a very high compilation time. Though the numbers are not included here, it blows up drastically for MobileNet V1.
Memory Footprint
I don't have numbers for this yet, but @shoubhik also pointed out that the memory footprint of the params increased with TONEAREST. This is mostly because TONEAREST introduces additional constants in the network, and they are in int64.
Background
Background on the QNN rounding choices can be found here:
https://github.com/dmlc/tvm/blob/11a3a777ec78ad81ce12a940d07521870f546ad1/include/tvm/relay/qnn/attrs.h#L52-L61
The point is that TONEAREST and UPWARD differ only at negative mid values (like -1.5, -2.5, and so on); for example, -2.5 rounds to -2 under UPWARD but to -3 under TONEAREST, while positive mid values such as 2.5 round to 3 under both. This difference is invisible in end-to-end accuracy.
What do other frameworks do?
Every toolchain has its own preferred rounding choice. AFAIK, there is no widely adopted in-production standard.
Note that this PR only changes the default value; one can still choose any rounding option explicitly, as sketched below. The default is for the TVM user who does not want to worry about the different roundings.
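For illustration, here is a hedged sketch of how a user could request a non-default rounding explicitly when calling requantize from Python. The requantize signature has changed across TVM versions (e.g., whether scales and zero points are plain scalars or relay constants), so the argument forms below are assumptions for illustration only.

```python
from tvm import relay

# Hedged sketch: request TONEAREST explicitly instead of the UPWARD default.
# Argument types are illustrative; in some TVM versions the quantization
# parameters are plain Python scalars rather than relay constants.
data = relay.var("data", shape=(1, 64), dtype="int32")
out = relay.qnn.op.requantize(
    data,
    input_scale=relay.const(0.5, "float32"),
    input_zero_point=relay.const(0, "int32"),
    output_scale=relay.const(0.25, "float32"),
    output_zero_point=relay.const(0, "int32"),
    rounding="TONEAREST",  # override the default rounding choice
    out_dtype="int8",
)
func = relay.Function([data], out)
```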
@tqchen @jackwish @vinx13 @FrozenGene @u99127 @zhiics @yzhliu