Fixed point multiplication improvements for AArch64 #5980

giuseros · 2020-07-02T09:09:28Z

RFC

This PR is based on the following RFC: https://discuss.tvm.ai/t/rfc-using-arm-intrinsics-to-implement-fixed-point-multiplication-in-tvm

High level description of the submission

The idea is to introduce a TIR intrinsic, fixed_point_multiply, whose lowering can be overridden per target (by means of tvm.target.intrin.register_intrin_rule). This PR provides an override that uses AArch64 intrinsics; other vendors can provide their own implementations.

The TIR default intrinsic is registered in: tvm/src/target/intrin_rule.cc
The overload for AArch64 lives in tvm/topi/python/topi/arm_cpu/tensor_intrin.py
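
The "default rule plus per-target override" arrangement described above can be illustrated with a small stand-alone sketch. This is not TVM code: register_rule, lower and the string-based "lowering" are hypothetical names invented for the illustration, and the real registration goes through the mechanism named in the description.

```python
# Hypothetical, self-contained sketch of "default rule + per-target override".
# None of these names are TVM APIs.
from typing import Callable, Dict, Tuple

LoweringRule = Callable[[str], str]

# Maps (target, intrinsic name) to a lowering rule.
_RULES: Dict[Tuple[str, str], LoweringRule] = {}


def register_rule(target: str, name: str, rule: LoweringRule) -> None:
    """Register (or override) the lowering rule for an intrinsic on a target."""
    _RULES[(target, name)] = rule


def lower(target: str, name: str, call: str) -> str:
    """Use the target-specific rule if one exists, else fall back to 'default'."""
    rule = _RULES.get((target, name)) or _RULES[("default", name)]
    return rule(call)


# Generic rule, available to every target.
register_rule("default", "fixed_point_multiply", lambda call: f"generic({call})")
# Vendor override, only picked up when lowering for that target.
register_rule("aarch64", "fixed_point_multiply", lambda call: f"aarch64_intrinsics({call})")
```

With this layout, a target without its own rule silently gets the generic lowering, which is the property that lets other vendors add an implementation later without touching the default.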

giuseros · 2020-07-02T09:10:55Z

cc @tqchen @anijain2305 @kparzysz-quic

giuseros · 2020-07-08T16:58:49Z

Following the discussion in the RFC I redesigned the submission by introducing a general qmuls intrinsic:

* Relay uses qmuls to implement fixed_point_multiply
* The new intrinsic is more general, as it accepts the Q-ness (the number of fractional bits) of the input values to be multiplied

The structure of the submission remains the same.
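
As a rough illustration of what a "q-multiply and shift" computes, here is a minimal pure-Python sketch. The parameter names (x, y, q, s), the shift direction and the rounding rule are assumptions made for the example; the thread does not spell out the exact semantics of the merged intrinsic.

```python
def q_multiply_shift_sketch(x: int, y: int, q: int, s: int) -> int:
    """Multiply x by a Q-format multiplier y (q fractional bits), then shift right by s.

    Assumptions (not confirmed by this thread): s >= 0 means a right shift,
    and ties are rounded towards +infinity.
    """
    total_shift = q + s
    if total_shift < 1:
        raise ValueError("q + s must be at least 1")
    # Python ints are unbounded, which stands in for a widened intermediate.
    product = x * y
    rounding = 1 << (total_shift - 1)  # adds 0.5 in units of the final result
    return (product + rounding) >> total_shift  # arithmetic shift (floors)


# 0.5 expressed with 31 fractional bits.
HALF_Q31 = 1 << 30
```

For example, q_multiply_shift_sketch(100, HALF_Q31, 31, 0) scales 100 by 0.5, and passing s=1 additionally halves the result.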

giuseros · 2020-07-13T09:41:01Z

@anijain2305 @kparzysz-quic @tqchen Any update on this?

kparzysz-quic

Inline review comments on include/tvm/tir/op.h, python/tvm/tir/op.py and src/target/intrin_rule.cc.

tmoreau89

Thank you for your contribution @giuseros, overall the changes look good. It would be great to add one or two unit tests on the tensor intrinsic matching to ensure this fixed-point multiplication support doesn't break/bitrot.

anijain2305

Thanks for this work. Overall it looks OK. There are a few follow-ups.

We also need a test for the new Relay op, as @tmoreau89 suggested.

Inline review comments on include/tvm/relay/attrs/transform.h, include/tvm/tir/op.h, include/tvm/tir/builtin.h, src/relay/op/tensor/unary.cc, src/relay/qnn/op/requantize.cc, python/tvm/tir/op.py, topi/python/topi/arm_cpu/injective.py, topi/python/topi/arm_cpu/tensor_intrin.py and topi/python/topi/math.py.

giuseros · 2020-07-14T18:04:13Z

@anijain2305, @tmoreau89, @kparzysz-quic I tried to address or reply to your comments. I added test_fixed_point_multiply() in test_op_level3.py to test the intrinsic (which also receives a lot of additional testing, since it will be used in every qnn::requantize operation).

giuseros · 2020-07-15T11:22:26Z

Quick update on this.

I had to remove the checks that required the input data type to be int32 from the intrinsic (and from TOPI), because the intrinsic can be called from relay/quantize/realize.cc with int64 and int8 inputs. Since we convert to int64 upfront, the input data type shouldn't matter.
I also changed the intrinsic so that the casts are now more explicit.
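
A small sketch of why widening before the multiplication makes the input width irrelevant. The helper names are hypothetical and the 32-bit wraparound is emulated by hand; it only illustrates the arithmetic, not the code in this PR.

```python
def wrap_int32(value: int) -> int:
    """Reinterpret an integer as a two's-complement 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def multiply_narrow(x: int, y: int) -> int:
    """Product kept in 32 bits: the high bits are lost."""
    return wrap_int32(x * y)


def multiply_widened(x: int, y: int) -> int:
    """Product computed at full width (Python ints stand in for int64 here)."""
    return x * y
```

Even for an input as small as 100 (which fits in int8), multiplying by a 31-fractional-bit multiplier such as 1 << 30 needs more than 32 bits, so the narrow product loses the result, whereas the widened product can still be shifted back into range.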

anijain2305

Minor suggestions. Close to being merged.

Inline review comments on include/tvm/tir/builtin.h and topi/python/topi/math.py.

anijain2305 · 2020-07-16T23:35:06Z

@tmoreau89 @kparzysz-quic Please review again and approve explicitly.

tmoreau89 · 2020-07-17T00:02:04Z

@giuseros thanks for adding the test case, and thank you @anijain2305 for the thorough review. I suggest following through with the naming tweak, and then this PR should be good to go.

giuseros · 2020-07-17T11:56:02Z

Hi @anijain2305, @tmoreau89, @kparzysz-quic
I addressed the name change, reformatted the help text in math.py, and addressed an integration change in the Arm intrinsic (q.val -> q.value).

Thank you all for your comments and your time!

kparzysz-quic

Looks good to me.

tmoreau89

Thank you @giuseros, the changes LGTM.

tmoreau89 · 2020-07-17T15:52:01Z

@anijain2305 please let us know if your requested changes have been addressed! thanks!

anijain2305

LGTM

anijain2305 · 2020-07-17T16:15:17Z

Thanks @giuseros @tmoreau89 @kparzysz-quic. This is merged!

* Fixed point multiplication improvements for AArch64
  Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4
* Fix python linting errors
  Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909
* Fix doxygen errors
  Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407
* Fix arm_cpu injective tests
  Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd
* Fix python linting errors - 2
  Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e
* Fix arm_cpu injective tests - 2
  Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979
* Redesign: introduce a qmuls (q-multiply and shift) general intrinsic
  Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02
* Fix python linting errors - 3
  Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366
* Addressing review comments
  Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c
* Fixing test failures
  Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a
* Renaming qmuls to q_multiply_shift
  Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad

giuseros requested review from Huyuwei and Laurawly as code owners July 2, 2020 09:09

giuseros force-pushed the fixed_point_multiply branch 3 times, most recently from bb787cd to 3b238a6, July 9, 2020 16:27

kparzysz-quic reviewed Jul 13, 2020

include/tvm/tir/op.h (outdated)

python/tvm/tir/op.py (outdated)

src/target/intrin_rule.cc (outdated)

tqchen added the status: need review label Jul 13, 2020

tqchen assigned anijain2305 Jul 13, 2020

tmoreau89 reviewed Jul 13, 2020

anijain2305 requested changes Jul 13, 2020

Giuseppe Rossini added 10 commits July 15, 2020 16:35

Fixed point multiplication improvements for AArch64

2d2a5f5

Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4

Fix python linting errors

c4b25c3

Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909

Fix doxygen errors

210841e

Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407

Fix arm_cpu injective tests

1e95833

Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd

Fix python linting errors - 2

a063a0a

Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e

Fix arm_cpu injective tests - 2

a26b432

Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979

Redesign: introduce a qmuls (q-multiply and shift) general intrinsic

070234d

Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02

Fix python linting errors - 3

0c7a010

Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366

Addressing review comments

e8730eb

Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c

Fixing test failures

d18e2fb

Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a

anijain2305 reviewed Jul 16, 2020

include/tvm/tir/builtin.h (outdated)

topi/python/topi/math.py (outdated)

Renaming qmuls to q_multiply_shift

a257327

Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad

giuseros force-pushed the fixed_point_multiply branch from 0424ea8 to a257327, July 17, 2020 11:52

kparzysz-quic approved these changes Jul 17, 2020

tmoreau89 approved these changes Jul 17, 2020

anijain2305 reviewed Jul 17, 2020

anijain2305 approved these changes Jul 17, 2020

anijain2305 merged commit ccacb1e into apache:master Jul 17, 2020

ZihengJiang mentioned this pull request Sep 25, 2020

TVM v0.7 Release Note Candidate #6486

Closed

ibsidorenko mentioned this pull request Aug 31, 2022

[Hexagon] Implement fixed_point_multiply op through intrinsics. #12659

Merged
