This repository was archived by the owner on Nov 17, 2023. It is now read-only.

[v1.9.x] Enable oneDNN BRGEMM kernel for FullyConnected #20591

Closed
wants to merge 3 commits into from


bgawrych
Contributor

Description

This PR backports changes from the v1.x branch (#20533, #20450) and enables MXNet to use the BRGEMM implementation of the oneDNN inner_product primitive (FullyConnected).
For large FullyConnected layers, such as those in BERT-Large, this implementation can speed up whole-model execution by about 40%.

Relative results from an EC2 M6i instance:
[image]

This change only makes sense if oneDNN is updated (#20590).
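
For readers unfamiliar with the term: BRGEMM stands for batch-reduce GEMM, a kernel that accumulates a sequence of small matrix products into a single output block, C = sum_i A_i B_i. The sketch below is a plain-Python illustration of that reduction and of how a FullyConnected forward pass (y = x W^T + b) can be expressed with it by splitting the shared dimension into chunks. The function names, the chunk size, and the blocking scheme are illustrative assumptions and do not reflect the oneDNN implementation or its API.

```python
def matmul(a, b):
    """Naive product of two row-major matrices given as lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


def brgemm(a_blocks, b_blocks):
    """Batch-reduce GEMM: accumulate sum_i A_i @ B_i into one output block."""
    rows, cols = len(a_blocks[0]), len(b_blocks[0][0])
    c = [[0.0] * cols for _ in range(rows)]
    for a, b in zip(a_blocks, b_blocks):
        partial = matmul(a, b)
        for i in range(rows):
            for j in range(cols):
                c[i][j] += partial[i][j]
    return c


def fully_connected(x, weight, bias, chunk=2):
    """FullyConnected forward, y = x @ weight^T + bias, expressed via brgemm.

    x is (batch, in_features); weight is (out_features, in_features).
    The in_features dimension is split into chunks (hypothetical chunk size);
    each chunk contributes one (A_i, B_i) pair to the reduction.
    """
    in_features = len(x[0])
    a_blocks, b_blocks = [], []
    for start in range(0, in_features, chunk):
        stop = min(start + chunk, in_features)
        a_blocks.append([row[start:stop] for row in x])
        # Transposed slice of the weight: shape (stop - start, out_features).
        b_blocks.append([[w_row[k] for w_row in weight]
                         for k in range(start, stop)])
    y = brgemm(a_blocks, b_blocks)
    return [[v + bias[j] for j, v in enumerate(row)] for row in y]


x = [[1.0, 2.0, 3.0]]
weight = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
bias = [0.5, -0.5]
y = fully_connected(x, weight, bias)
```

The point of the reduction form is that the output block stays in place while the partial products are accumulated into it; an optimized kernel would keep that block in registers rather than in Python lists.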

bgawrych added 3 commits September 17, 2021 07:49
…f FC (apache#20450)

* Add flag for disabling oneDNN BRGEMM implementation of FC

* Review fixes

* Update env_var.md
* Enable brgemm based on input info

* fix sanity

* Review fixes

* Change function name

* Fix typo

* Align variable assignments

* Fix review

* use const reference
@mxnet-bot

Hey @bgawrych , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [edge, centos-gpu, windows-gpu, clang, unix-cpu, miscellaneous, website, centos-cpu, sanity, windows-cpu, unix-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 added the pr-awaiting-testing (PR is reviewed and waiting CI build and test) and pr-awaiting-review (PR is waiting for code review) labels and removed the pr-awaiting-testing label on Sep 17, 2021
@bgawrych closed this on Nov 10, 2021
Labels
pr-awaiting-review PR is waiting for code review
4 participants