Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[BUGFIX] Fix accuracy issue due to reuse of primitives for MKLDNN-AArch64. Fixes #20265. #20482

Merged
merged 1 commit into from
Sep 8, 2021

Conversation

cfRod
Copy link

@cfRod cfRod commented Aug 3, 2021

Description

This fix is a workaround for the accuracy issue in the grouped convolution unit test observed when MXNet is built with Compute Library for Arm Architecture (ACL).
Fixes #20265
This change includes:

  • Updating MXNet's existing AddSign function to generate unique hashes for MKLDNN-ACL backend.
  • Adds a new AddSign overloaded function to generate hashes based on uint64_t type.
  • Adding DNNL_AARCH64_USE_ACL definition to CMakeLists.txt
  • Adding Crefeda Rodrigues to the contributors list

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Updating MXNet's existing AddSign function to generate unique hashes for MKLDNN-ACL backend.
  • Adds a new AddSign overloaded function to generate hashes based on uint64_t type.
  • Adding DNNL_AARCH64_USE_ACL definition to CMakeLists.txt
  • Adding Crefeda Rodrigues to the contributors list

Comments

  • This change is a workaround to resolve the accuracy issue seen test_operator.test_convolution_grouping until future releases of Compute library are compatible with reuse of MKLDNN/oneDNN primitives at the framework-level.

@cfRod cfRod requested a review from szha as a code owner August 3, 2021 07:40
@mxnet-bot
Copy link

Hey @cfRod , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [miscellaneous, website, sanity, windows-cpu, unix-cpu, centos-gpu, unix-gpu, clang, centos-cpu, windows-gpu, edge]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 mseth10 added the pr-awaiting-testing PR is reviewed and waiting CI build and test label Aug 3, 2021
@cfRod cfRod changed the title [BUGFIX] Fix reuse of primitives for MKLDNN-AArch64. Fixes #20265. [BUGFIX] Fix accuracy issue due to reuse of primitives for MKLDNN-AArch64. Fixes #20265. Aug 3, 2021
@mseth10 mseth10 added pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress labels Aug 3, 2021
@cfRod
Copy link
Author

cfRod commented Aug 5, 2021

Hi,

I will be away until 20th August. Any issues with this PR I will address when I am back. Thanks.

@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-review PR is waiting for code review and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Aug 5, 2021
@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-awaiting-review PR is waiting for code review pr-awaiting-testing PR is reviewed and waiting CI build and test labels Sep 6, 2021
@cfRod
Copy link
Author

cfRod commented Sep 7, 2021

Hi,
I see that the linting test fails. Which linting tool should I use to test this locally?

This fix is a workaround for the accuracy issue observed when MXNet is built with Compute Library (ACL).
This change includes:
* Updating MXNet's AddSign function to generate unique hashes for MKLDNN-ACL backend.
* Adding DNNL_AARCH64_USE_ACL to CMakeLists.txt
* Adding Crefeda Rodrigues to the contributors list

Signed-off-by: cfRod <crefeda.rodrigues@arm.com>
@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-review PR is waiting for code review and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Sep 7, 2021
@szha szha merged commit 197671b into apache:v1.x Sep 8, 2021
@szha
Copy link
Member

szha commented Sep 8, 2021

@cfRod thanks for the workaround!

josephevans pushed a commit to josephevans/mxnet that referenced this pull request Feb 28, 2022
…. (apache#20482)

This fix is a workaround for the accuracy issue observed when MXNet is built with Compute Library (ACL).
This change includes:
* Updating MXNet's AddSign function to generate unique hashes for MKLDNN-ACL backend.
* Adding DNNL_AARCH64_USE_ACL to CMakeLists.txt
* Adding Crefeda Rodrigues to the contributors list

Signed-off-by: cfRod <crefeda.rodrigues@arm.com>
josephevans added a commit that referenced this pull request Mar 1, 2022
…20482) (#20921)

This fix is a workaround for the accuracy issue observed when MXNet is built with Compute Library (ACL).
This change includes:
* Updating MXNet's AddSign function to generate unique hashes for MKLDNN-ACL backend.
* Adding DNNL_AARCH64_USE_ACL to CMakeLists.txt
* Adding Crefeda Rodrigues to the contributors list

Signed-off-by: cfRod <crefeda.rodrigues@arm.com>

Co-authored-by: Crefeda Rodrigues <65665931+cfRod@users.noreply.github.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants