Improve stack operator performance by oneDNN #20621
Hey @bgawrych , Thanks for submitting the PR
CI supported jobs: [centos-cpu, windows-gpu, sanity, windows-cpu, centos-gpu, unix-cpu, website, edge, clang, miscellaneous, unix-gpu]
Please change all the MKLDNN nomenclature to DNNL.
@mxnet-bot run ci [centos-cpu, unix-cpu]
Jenkins CI successfully triggered : [unix-cpu, centos-cpu]
Jenkins CI successfully triggered : [unix-cpu]
7caa1f2 to 215b79b
@mxnet-bot run ci [centos-gpu, miscellaneous, unix-cpu, unix-gpu, website, windows-gpu]
Jenkins CI successfully triggered : [website, windows-gpu, centos-gpu, unix-gpu, unix-cpu, miscellaneous]
@mxnet-bot run ci [unix-cpu, windows-gpu]
Jenkins CI successfully triggered : [windows-gpu, unix-cpu]
@szha Could you help with the merge, thanks!
Description
Improves the performance of the stack operator. Performance results show a significant speedup for axis=0 (up to 7x faster).
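For readers unfamiliar with the operator, here is a minimal sketch of what stacking along different axes does and how one might time it. It uses NumPy's `np.stack` as a stand-in rather than MXNet or the oneDNN path from this PR, and the helper names `stack_shapes` and `time_stack`, the input shapes, and the repeat counts are all hypothetical choices for illustration. It is not the benchmark used to collect the results in this PR.

```python
import timeit

import numpy as np


def stack_shapes(num_inputs=4, shape=(3, 5)):
    """Return the output shape of stacking `num_inputs` arrays for each valid axis."""
    inputs = [np.full(shape, i, dtype=np.float32) for i in range(num_inputs)]
    # Stacking inserts a new dimension of size `num_inputs` at position `axis`.
    return {axis: np.stack(inputs, axis=axis).shape for axis in range(len(shape) + 1)}


def time_stack(axis, num_inputs=8, shape=(64, 64), repeats=20):
    """Best-of-`repeats` wall-clock time (seconds) for one np.stack call on `axis`."""
    inputs = [np.ones(shape, dtype=np.float32) for _ in range(num_inputs)]
    timings = timeit.repeat(lambda: np.stack(inputs, axis=axis), number=10, repeat=repeats)
    return min(timings) / 10


shapes = stack_shapes()
print(shapes)  # axis 0 puts the new dimension first: (4, 3, 5)

for axis in range(3):
    print(f"axis={axis}: {time_stack(axis):.6f} s per call")
```

With axis=0 each input lands in one contiguous block of the output, which is one plausible reason that case benefits most from an optimized copy; the timings printed here depend on the machine and say nothing about the MXNet numbers.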
Performance results were collected on CLX8280 with KMP_AFFINITY=granularity=fine,noduplicates,compact,1,0 OMP_NUM_THREADS=28 numactl --physcpubind=0-27 --membind=0: