[Feature] Add oneDNN support for interleaved_matmul_selfatt_* operators (fp32/int8) #20163

bgawrych · 2021-04-14T08:58:49Z

Description

This change adds oneDNN support for two operators:

_contrib_interleaved_matmul_selfatt_qk
_contrib_interleaved_matmul_selfatt_valatt

Both operators are used when the MKLDNN or MKLDNN_QUANTIZE backend is selected. Performance is unchanged between MKL fp32 and oneDNN fp32; the main advantage is that the int8 data type can now be used.
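For context, the computation behind _contrib_interleaved_matmul_selfatt_qk can be sketched in plain Python. This is only a reference illustration, not the oneDNN kernel added in this PR. It assumes the interleaved layout described in the MXNet operator documentation: input of shape (seq_length, batch_size, heads * 3 * head_dim), with the query, key and value projections of each head stored back to back. The function name `selfatt_qk_reference` is hypothetical.

```python
import math


def selfatt_qk_reference(qkv, heads):
    """Reference (slow) version of the scaled Q*K^T step.

    qkv is a nested list of shape (seq_len, batch, heads * 3 * head_dim),
    where each head stores [q | k | v] contiguously (assumed layout).
    Returns a nested list of shape (batch * heads, seq_len, seq_len).
    """
    seq_len = len(qkv)
    batch = len(qkv[0])
    head_dim = len(qkv[0][0]) // (heads * 3)
    scale = 1.0 / math.sqrt(head_dim)  # queries are divided by sqrt(head_dim)

    out = []
    for b in range(batch):
        for h in range(heads):
            base = h * 3 * head_dim  # start of this head's [q | k | v] block
            scores = []
            for i in range(seq_len):
                q = qkv[i][b][base:base + head_dim]
                row = []
                for j in range(seq_len):
                    k = qkv[j][b][base + head_dim:base + 2 * head_dim]
                    row.append(scale * sum(x * y for x, y in zip(q, k)))
                scores.append(row)
            out.append(scores)
    return out


# Tiny example: seq_len=2, batch=1, heads=1, head_dim=1, so each vector is [q, k, v].
example = [[[2.0, 3.0, 5.0]], [[4.0, 1.0, 7.0]]]
attention_scores = selfatt_qk_reference(example, heads=1)
```

The valatt operator then multiplies the (softmaxed) scores by the value slice of the same interleaved tensor, which is why both operators can share one input buffer.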

10 iterations of BERT-Large (gluon-nlp v0.10.x) [Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz]:

MKL implementation (fp32, since int8 is not supported):

oneDNN implementation (int8):

_contrib_interleaved_matmul_selfatt_qk => _sg_mkldnn_selfatt_qk
_contrib_interleaved_matmul_selfatt_valatt => _sg_mkldnn_selfatt_valatt

We can observe that this change also benefits other operators, since there is less dequantization/quantization overhead and there are fewer memory reorders.
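To illustrate the overhead mentioned above, here is a minimal sketch of a symmetric int8 quantize/dequantize round trip. It is an assumption-laden illustration, not the code path used by MXNet or oneDNN: the helper names and the 127 / max_abs scaling rule are chosen here for clarity. Every boundary between an int8 operator and an fp32 operator requires a pair of conversions like this, so keeping neighbouring operators in int8 removes those passes over the data.

```python
def quantize_int8(values):
    """Map floats to int8 range using a symmetric scale (hypothetical helper)."""
    max_abs = max(abs(v) for v in values)
    scale = 127.0 / max_abs if max_abs > 0 else 1.0
    # Clamp to [-127, 127] after rounding to the nearest integer.
    quantized = [max(-127, min(127, round(v * scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map int8 values back to floats (hypothetical helper)."""
    return [q / scale for q in quantized]


data = [0.3, -1.2, 0.7]
q_data, q_scale = quantize_int8(data)
restored = dequantize_int8(q_data, q_scale)
```

The restored values differ from the originals by at most half a quantization step, which is the accuracy cost traded for the smaller data type.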

@grygielski made a great contribution to this change.

Checklist

Essentials

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage

mxnet-bot · 2021-04-14T08:58:53Z

Hey @bgawrych, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

To trigger all jobs: @mxnet-bot run ci [all]
To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [unix-cpu, clang, unix-gpu, windows-cpu, centos-gpu, website, centos-cpu, sanity, miscellaneous, windows-gpu, edge]

Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

bgawrych · 2021-04-15T13:13:30Z

@mxnet-bot run ci [windows-cpu]

mxnet-bot · 2021-04-15T13:13:37Z

Jenkins CI successfully triggered : [windows-cpu]

sfraczek

lgtm

bgawrych · 2021-04-20T06:19:33Z

@szha Can we merge it?

szha · 2021-04-20T13:23:07Z

@bgawrych thanks for the contribution

grygielski and others added 11 commits April 14, 2021 10:37

Add oneDNN code to interleved kernels

ffbf43a

check

b46a7a5

Fix selfattQK subgraph

f53e084

fix qk

b585494

Fixes QK

2f5a7e7

add test for oneDNN self_att qk

e8c9882

basic valatt

a0764eb

add valatt test

94ad11f

refactor valatt

1454616

fix review

d1ba73a

Change param struct name

1468c86

lanking520 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-awaiting-testing label Apr 14, 2021

Fix sanity

ba04ec8

lanking520 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels Apr 15, 2021

Fix sanity

aecb845

lanking520 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels Apr 15, 2021

lanking520 added the pr-awaiting-testing label and removed the pr-work-in-progress and pr-awaiting-testing labels Apr 15, 2021

lanking520 added the pr-awaiting-review PR is waiting for code review label Apr 15, 2021

sfraczek approved these changes Apr 19, 2021

szha merged commit 16d1da9 into apache:v1.x Apr 20, 2021
