-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed point multiplication improvements for AArch64 #5980
Conversation
Following the discussion in the RFC I redesigned the submission by introducing a general
The structure of the submission remains the same. |
bb787cd
to
3b238a6
Compare
@anijain2305 @kparzysz-quic @tqchen Any update on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution @giuseros, overall the changes look good. It would be great to add one or two unit tests on the tensor intrinsic matching to ensure this fixed-point multiplication support doesn't break/bitrot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work. Overall looks ok. There are few follow-ups.
We also need a test for the new Relay op as @tmoreau89 suggested
@anijain2305 , @tmoreau89 , @kparzysz-quic I tried to address/reply to your comments. I added a |
Quick update on this.
|
Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4
Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909
Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407
Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd
Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e
Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979
Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02
Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366
Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c
Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestions. Close to be merged
@tmoreau89 @kparzysz-quic Please review again and approve explicitly. |
@giuseros thanks for adding the test case, and thank you @anijain2305 for the thorough review. I suggest to follow through with the naming tweak and this PR should be good to go. |
Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
0424ea8
to
a257327
Compare
Hi @anijain2305 , @tmoreau89 , @kparzysz-quic Thank you all for your comments and your time! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @giuseros the changes LGTM
@anijain2305 please let us know if your requested changes have been addressed! thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @giuseros @tmoreau89 @kparzysz-quic This is merged! |
* Fixed point multiplication improvements for AArch64 Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4 * Fix python linting errors Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909 * Fix doxygen errors Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407 * Fix arm_cpu injective tests Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd * Fix python linting errors - 2 Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e * Fix arm_cpu injective tests - 2 Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979 * Redesign: introduce a qmuls (q-multiply and shift) general intrinsic Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02 * Fix python linting errors - 3 Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366 * Addressing review comments Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c * Fixing test failures Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a * Renaming qmuls to q_multiply_shift Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
* Fixed point multiplication improvements for AArch64 Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4 * Fix python linting errors Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909 * Fix doxygen errors Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407 * Fix arm_cpu injective tests Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd * Fix python linting errors - 2 Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e * Fix arm_cpu injective tests - 2 Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979 * Redesign: introduce a qmuls (q-multiply and shift) general intrinsic Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02 * Fix python linting errors - 3 Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366 * Addressing review comments Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c * Fixing test failures Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a * Renaming qmuls to q_multiply_shift Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
* Fixed point multiplication improvements for AArch64 Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4 * Fix python linting errors Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909 * Fix doxygen errors Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407 * Fix arm_cpu injective tests Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd * Fix python linting errors - 2 Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e * Fix arm_cpu injective tests - 2 Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979 * Redesign: introduce a qmuls (q-multiply and shift) general intrinsic Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02 * Fix python linting errors - 3 Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366 * Addressing review comments Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c * Fixing test failures Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a * Renaming qmuls to q_multiply_shift Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
* Fixed point multiplication improvements for AArch64 Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4 * Fix python linting errors Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909 * Fix doxygen errors Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407 * Fix arm_cpu injective tests Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd * Fix python linting errors - 2 Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e * Fix arm_cpu injective tests - 2 Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979 * Redesign: introduce a qmuls (q-multiply and shift) general intrinsic Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02 * Fix python linting errors - 3 Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366 * Addressing review comments Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c * Fixing test failures Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a * Renaming qmuls to q_multiply_shift Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
RFC
This PR is based on the following RFC: https://discuss.tvm.ai/t/rfc-using-arm-intrinsics-to-implement-fixed-point-multiplication-in-tvm
High level description of the submission
The idea is to create a TIR intrinsic
fixed_point_multiply
that can be overloaded (by the means oftvm.target.intrin.register_intrin_rule
) and use AArch64 intrinsics (other vendors can provide their own implementation)The TIR default intrinsic is registered in:
tvm/src/target/intrin_rule.cc
The overload for AArch64 lives in
tvm/topi/python/topi/arm_cpu/tensor_intrin.py