Refactor operators and add MKLDNN #9677

zheng-da · 2018-02-02T00:59:14Z

Description

This is to add refactored operators and MKLDNN.

Checklist

Essentials

Passed code style checking (make lint)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

marcoabreu · 2018-02-02T01:09:17Z

prepare_mkldnn.sh

+
+# set -ex
+#
+# All modification made by Intel Corporation: © 2016 Intel Corporation


Check license (non-blocking, but please create an issue as reminder)

marcoabreu · 2018-02-02T01:16:10Z

tests/ci_build/ci_build.sh

@@ -178,6 +178,7 @@ ${DOCKER_BINARY} run --rm --pid=host \
    -e "CI_BUILD_GID=$(id -g)" \
    -e "CUDA_ARCH=-gencode arch=compute_52,code=[sm_52,compute_52] --fatbin-options -compress-all" \
    -e "MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0" \
+    -e "ARCH_OPT=-mavx2" \


Please make this an environment var of the build_cuda dockerfile as otherwise, this will break the build on ARM-based devices

marcoabreu · 2018-02-02T01:16:56Z

tests/cpp/include/test_core_op.h

+        } else if (req.type == ResourceRequest::kParallelRandom) {
+          Resource rm = ResourceManager::Get()->Request(ctx->run_ctx.ctx, req);
+          if (ctx->run_ctx.ctx.dev_mask() == Context::kCPU) {
+            common::random::RandGenerator<cpu, DType>::AllocState(


Is there a possibility to define a pre-set seed?

I don't know. I got the code from Chris.

@cjolivier01 could you help comment for pre-set seed?

you define the seed just like you’d define it in the normal code — there’s a C API for it

marcoabreu · 2018-02-02T01:17:40Z

tests/cpp/operator/batchnorm_test.cc

 #include "executor/exec_pass.h"

+#if 0


Re-enable tests

Use NNVM interface for upsampling. Use NNVM interface for convolution. Use NNVM interface for deconvolution. Use NNVM interface for FullyConnected. Move NNVM interface to batch norm. Use NNVM interface for depthwise convolution. Use NNVM interface for softmax activation. Use NNVM interface for pooling. use NNVM interface for dropout. Use NNVM interface for activation. Use NNVM interface for CuDNN batch norm. Use NNVM interface for CuDNN pooling. Use NNVM interface for CuDNN softmax activation. Use NNVM interface for CuDNN activation. Use NNVM interface for CuDNN convolution. Use NNVM interface for CuDNN deconvolution. Move concat to nn/ Use NNVM interface for concat. Fix headers in concat. Move lrn to nn/. Use NNVM interface for LRN. Fix a compilation error in convolution. Fix a compilation error in activation. Fix coding style. Fix coding style for make lint. use enums in batch norm. Use CoreOpRunner for refactored Ops. Make FullyConnected stateless. Make upsampling stateless. Make pooling stateless. Make batchnorm stateless. Make SoftmaxActivation stateless. Fix a code style problem. pass amalgamation test for batch norm. pass amalgamation test for dropout. Get convolution ops from a function. Fix compilation errors for GPU. Fix thread local in diff platforms. Avoid using thread_local for non-CuDNN conv/deconv. Remove TODO in deconv. Fix a bug in batch norm. Fix a bug in fully connected. Don't set #inputs for backward convolution. Revert "Make pooling stateless."

Update MXNet for MKLDNN. Enable MKLDNN Relu. Fix a compilation error. Change Makefile for MKLDNN. Remove infer storage in convolution. Update MXNet for MKLDNN. Support MKLDNN storage type in python. Update activation. Add MKLDNN base classes. Implement MKLDNN fully connected. Add MKLDNN convolution. Update MKLDNN interface in NDArray. MKLDNN convolution handle CreateMKLDNNData failure. Add another GetMKLDNNData in NDArray. Have mkldnn to define the data format. Create output MKLDNN memory explicitly for FC. Fix a bug in NDArray. Fix a bug in GetWeightDesc. Convert data layout if necessary in FC. remove unnecessary print in MKLDNN convolution. Add MKLDNN deconvolution. Add MKLDNNStream to manage primitives and memories. Use MKLDNNStream to register memory in NDArray. Use MKLDNNStream to manage resources in operators. Handle kAddTo in MKLDNN operators. Fix a bug in deconvolution. Fix bugs in NDArray. Revert "Fix bugs in NDArray." This reverts commit f5624a4. Fix a bug in NDArray. Fix a bug in NDArray. Reorder MKLDNN memory to default format in SetTBlob. Disable MKLDNN correctly. Fix a bug in activation. Reshape of NDArray supports MKLDNN. Fix a memory ref bug in NDArray. Reshape NDArray in MKLDNN FullyConnected. Fix data format conversion. Create MKLDNN NDArray in python. Support Slice for MKLDNN NDArray. Reduce the overhead of summing the result to the output array. Avoid unnecessary memory copy in NDArray. Fix a bug in data reordering. Fix a bug in NDArray. Don't hard code MKLDNN type. Support dilation in MKLDNN convolution. Fix a bug in sum results. Rewrite GetMKLDNNData. Add prepare_mkldnn.sh Enable MKLDNN activation. Fix a bug on FullyConnected. Handle 3 dims for MKLDNN NDArray. Fix a bug in MKLDNN FC. Support MKLDNN storage in KV store. Fix a bug in executor for non-default NDArray. Fix a link error in cast_storage.cc. Remove unnecessary function def Fall back to def storage if the type isn't supported by MKLDNN. Use NDArray for MKLDNN in python. Reshape output of MKLDNN convolution. Fix a bug in NDArray. Support more operations in MKLDNN NDArray. Fix a bug in deconvolution. Fix bugs in MKLDNN deconvolution. We still need to compute bias correctly. Have elemwise binary ops to fall to default for MKLDNN. Limit the cases that MKLDNN operations are called. Force the layout of mkldnn::memory from NDArray. Add MKLDNN softmax. Fix output storage type of MKLDNN softmax. Add MKLDNN sum. Fix a bug in elemwise sum. Fix a bug in MKLDNN softmax. Fix a bug in imperative. Clean up dispatch modes. Remove redundant code. MKLDNN Pooling Op integration MKLDNN Pooling Op integration add missing file fix mkldnn pooling op workspace issue handle workspace in MKLDNN pooling correctly. Use a non-MKLDNN op for testing. Allow to share arguments and their gradients between executors. Avoid using MKLDNN pooling when it's not supported. Support MKLDNN properly. Choose MKLDNN softmax more carefully. Fix a bug in MKLDNN pooling. Fall back if MKLDNN pooling isn't supported. Fix a bug in Slice of NDArray. Use int32 for workspace memory. Exclude MKLDNN act with tanh. Have two Reshape functions in NDArray. Copy data for NDArray with diff shapes. Add MKLDNN copy. Add MKLDNN version of elemwise_add. Add MKLDNN version of Flatten. add mkldnn surport for concat simplify MKLDNN Flatten. Enalbe MKLDNN deconvolution with bias. Fix a bug in CuDNN deconvolution. avoid using MKLDNNStorage when it's not defined. Remove ./cudnn_lrn-inl.h Fix for make lint. add mkldnn surport for concat fix the coding style for pr of mkldnn concat Only add input data for MKLDNN concat backward Remove unnecessary TODO. remove unnecessary __repr__ in MKLNDArray. better condition check for readability. Use macro when including mkldnn.hpp. Revert "Use CoreOpRunner for refactored Ops." This reverts commit a28586f. Fix a bug in test core. Limit MKLDNN ops being used. Fix complains from "make pylint" Move ContainStorage to common/utils.h Limit MKLDNN concat being used. Add license. Fix amalgamation Fix compilation error in mkldnn_ops-inl.h Fix a bug in deconvolution. Fix a bug in pooling. MKLDNN ops allocates temp mem. Fix a bug in pooling. Allocate align memory from temp space. Have parameter gradients stored in the default storage. Handle all cases in CopyFrom. Ensure NDArray returns memory with right memory descriptors. use auto to define memory in the operator. Use raw pointer for mkldnn memory. Move more code to mkldnn_base.cc Fix a compilation error. Address review comments. fix a bug in activation backward. Miss a macro in mkldnn_base.cc Fix a bug in data iterator in examples. Avoid memory allocation in ReshapeMKLDNN. Avoid memory allocation in storage cast. Fix a bug in cast storage. Handle sliced MKLDNN NDArray. Use memcpy if NDArray uses default format. Revert "Limit MKLDNN ops being used." This reverts commit 75e2ae5. Enable mkldnn act backward has the same input layout. Fix a bug in mkldnn activation. Use MKLDNN sum in more cases. Improve perf of reorder. Avoid memory reorder in conv and deconv. Avoid unnecessary storage cast in fallback path. Revert "Use MKLDNN sum in more cases." This reverts commit 7a21ebc. Handle sliced ndarray in more cases. Fix a complain from make lint. Update Jenkins to test MKLDNN. debug compiling mkldnn. Use MKLDNN sum in more cases. Add mkldnn as a submodule. Compile with mkldnn in 3rdparty. Fix some coding styles. write the path to mkldnn lib in libmxnet.so. use rpath with $ORIGIN. Pack all lib files in Jenkins. pack and unpack mxnet with MKLDNN. Update Jenkinsfile Update Jenkinsfile Add mkldnn batch normalization Fix bugs in BN. Avoid memory allocation in MKLDNNCopy. only use MKLDNN BatchNorm for special cases. MKLDNN BatchNorm doesn't work well on the default layout. Add MKL-DNN based LRN Code Style Changes Fix a bug in BN. Fix a bug in LRN. Handle non-default storage in memory plan. Fix coding style. Fix a compilation error without mkldnn. Fix some coding styles for batch norm Improve forward of convolution. Add openmp and simd support to BN operator Retrieve MKLDNN Conv primitive based on signature. Retrieve Act primitive based on its signature. Fix a bug in pooling. Diable some MKLDNN activation and pooling. Cast MKLDNN storage with diff data type. Check if it's a view of NDArray. Reshaped and sliced arrays share the same chunks. Implement caching MKLDNN Act correctly. Fix a bug in check_consistency. Fix a potential bug when destroying NDArray. Fix bugs when allocating mem in NDArray. Fix coding style. Add micro when using mkldnn in ndarray. Fix a compilation error. Fix a bug in concat. Remove MKLDNNStorage. handle diff layouts in CopyFromToDnsImpl. Fallback correctly. Force weight grad to use default layout. Reorder weight arrays in (de)conv for faster inference. Avoid caching TBlob from NDArray. This commit may add some overhead of managing NDArray for each fallback. Fix a bug in Flatten. handle ndarray with def layout in mkldnn BN correctly. Align to page when mkldnn is enabled. Use default mem alloc for mkldnn. Reuse NDArrays. Support WriteInplace for sum. fix complains from "make lint". Avoid reallocation in NDArray. Handle weight arrays with special MKLDNN layouts. Remove unnecessary GetWeights. Fix compilation error without MKLDNN. Fix a bug in (de)conv for weight arrays. Fix a minor bug in MKLDNN conv. Fix a bug in MKLDNNOpSignature. Reimplement fallback for MKLDNN ops. Fix a bug in FallbackExecutor. Add params in hashcode. Invalidate data in outputs to accelerate. Fix a minor bug. Update mkldnn_base-inl.h Add primitive caching for Pooling forward computation Add hashcode in pooling parameters. Support NDArray copy with types unsupported by MKLDNN. Avoid using MKLDNN concat for negative dimension. Fix make lint complain. Disable mkldnn avg pooling for now. Fix a compile warning. Fix compile error when MKLDNN is disabled. OP primitive cache: use memory as signature for MKLDNN storage type Remove MKLDNN array in python. Disable Clang tests in Jenkins. Use mklml dockers to test mkldnn. Update MKLDNN repo to zhengda's mkldnn repo. Update MKLDNN repo to ashok's. Fix a bug in fallback. Change avg pooling algorithm to pooling_avg_include_padding Fix a code style in mkldnn pooling. Temp fix a bug in FC. Revert "Disable Clang tests in Jenkins." This reverts commit b4efa8f. Rebase and Refactor deconv (#20) * rebase to Da,Zheng refactor branch Jan.14, add signature for mkldnn Deconv and modify classMKLDNNDeconvForward * fix make lint complains A simple way of caching BN inference. cache BN forward for both training and inference. Fix some minor problems in BN. Fix a bug in caching BN. force to build with avx2 in Jenkins. Remove the remaining MKLDNNStorageType Some minor updates in NDArray. a lot of updates to address comments. minor changes.

* LRN coding style change * Add const for local variables * Add req for LRN forward * rebase code * align API interface * revert modification in test_executor.

larroy · 2018-02-15T19:21:16Z

src/common/exec_utils.h

-        (*idx_map)[i] = temp_dst->size();
-      }
-      NDArray temp(nd.shape(), nd.ctx(), false, nd.dtype());
+    bool is_default = nd.storage_type() == kDefaultStorage;


CHECK / assert that idx_map is not null is missing.

marcoabreu · 2018-02-15T19:22:10Z

I've talked offline to @piiswrong and he stated that the disabled tests are covered by Python tests. I didn't verify that statement, but I agree that we should merge this PR to have it in master and thus allow broad testing.

-0.9 from my side. If everybody else approves (especially @cjolivier01), I won't stand in the way. But I'd like to see my comments addressed as soon as this PR has been merged.

piiswrong · 2018-02-15T19:22:25Z

Arguments for merging this PR despite the disabled cpp tests:

Python (and other language) tests passes. The frontend tests are the primary indicator for code quality. This suggests the code is in decent shape.
cpp tests hasn't been run in CI for a year and we have been just fine.
The cpp tests are failing in arcane ways that know one knows how to fix. Even their creator @cjolivier01
We need this to get in soon so that people can try it before 1.2 release. This is far better way of catching bugs than the cpp tests.

cjolivier01 · 2018-02-15T19:25:36Z

There's only two out of 10-or-15-or-so tests that are disabled.
I am fine with the merge from the point of the cpp tests. Although it's possible there is something wrong, it's not clear if it's the test or the main code, so I agree with @piiswrong that if it's really broken, we will find out soon enough after it is merged.
Both tests are disabled for the same behavior.

marcoabreu · 2018-02-15T19:55:48Z

To me, it really sounds like there are quite a decent number of known issues but for whatever reason, there's a push to get this merged. Honestely, I'm afraid of this being PR being merged and disabled tests and features being forgotten and thus increasing technical debt, which in the end is going to harm our users. Especially if we are aware that for example operator tuning is broken on Windows, it's IMO not the right thing to just disable the test or the entire feature in the default CMake config. If a user wants to test the feature, it will hang for them and we will probably have forgotten about it.

cjolivier01 · 2018-02-15T19:56:25Z

" features being forgotten"? what features?

cjolivier01 · 2018-02-15T19:56:58Z

Operator Tuning was never supported on Windows and has nothing to do with this PR.

marcoabreu · 2018-02-15T20:35:19Z

Where is it stated that operator tuning is not supported on Windows? Do users get a proper error message? Also, I don't see why we should not support a certain feature on a platform if there is no platform dependency.

According to http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/incubator-mxnet/branches/master/runs/384/nodes/433/steps/768/log/?start=0 we are in fact successfully running operator tuning on Windows.

Example output:

test_operator_gpu.test_conv ... [04:50:06] c:\jenkins_slave\workspace\build-gpu@3\src\operator\nn\cudnn\./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
[04:50:06] C:/jenkins_slave/workspace/build-gpu@3/src/operator/nn/convolution.cu:70: This convolution is not supported by cudnn, MXNET convolution is applied.
[04:50:06] C:/jenkins_slave/workspace/build-gpu@3/src/operator/nn/convolution.cu:70: This convolution is not supported by cudnn, MXNET convolution is applied.
ok

It is related to this PR to some degree since it disables a test.

cjolivier01 · 2018-02-15T21:42:47Z

What you're showing there is CUDNN tuning, which is something different.

cjolivier01 · 2018-02-15T21:45:07Z

The cpp unit tests aren't run ever on Windows. In fact, they can't be built on Windows at this time due to no static linking/dllexport issues, etc.
While I was refactoring the cpp unit tests, I noticed that it wasn't turned off like it should be, so I fixed that. The disabling of the test was not done because of any changes in this PR.

marcoabreu · 2018-02-15T21:49:14Z

Okay, thanks for elaborating. I think we should address this at a later point in time Independent of this PR.

iblislin · 2018-02-24T16:51:23Z

I did bisect. This patch caused MXNet.jl testing failure.

(Ref https://travis-ci.org/dmlc/MXNet.jl/jobs/345129890#L1486)

The failed case here:
https://github.com/dmlc/MXNet.jl/blob/4038704f1f8bd733bc2aab5205068dddbbf122a2/test/unittest/symbolic-node.jl#L140

iblislin · 2018-02-24T17:00:43Z

I built the python one.

In [6]: mx.sym.Convolution(attr={'a': 42})

and got similar error message.
This PR prevent user from attaching arbirary attr dictionary.
Is this change intended?

zheng-da · 2018-02-24T19:52:57Z

I guess the reason is that NNVM handles operator arguments differently?
@piiswrong @szha I see mx.sym.Convolution has an argument called attr, but attr doesn't exist in ConvolutionParam and isn't listed in the document. How was this argument handled originally?

marcoabreu · 2018-02-24T23:36:47Z

Could somebody please elaborate why that package is hosted in another repository and thus not part of our PR validation chain?

iblislin · 2018-02-25T03:38:31Z

@marcoabreu
About the reason of hosting Julia code in another repository,
Julia's package manager is built on top of git, and it requires some specific structure in the cloned dir.
For example, we have MXNet.jl, we cloned it as MXNet and it should have following dir structure:

MXNet/     # the pkg name
  |- src/
  |    |- MXNet.jl   # this file is the pkg entry point, should be named as same as pkg name
  |    |- other.jl ... etc
  |- test/
  |    |- runtests.jl  # test cases entry point

Since git cannot checkout a subdir as svn does, the only choice is to put Julia binding as a single repo.

and thus not part of our PR validation chain?

I tried to ping developer several times on both GitHub and Slack, but did not get progress.
See:

I finished the patch already, but I do not have any permission to trigger new Jenkins script via PR (#8175 (comment)).

I beg for help.

marcoabreu · 2018-02-26T13:00:14Z

I have commented under #8175. Let's keep this PR clean from that conversation.

reminisce · 2018-03-22T21:24:53Z

src/operator/nn/convolution.cc

+  else
+    return std::vector<std::string>{"data", "weight", "bias"};
+})
+.set_attr<nnvm::FInferShape>("FInferShape", ConvolutionShape)


Should have added FListOutputNames here to keep the backward compatibility. If not, monitor function's behavior might be different from before if it depends on the output names. In quantization, I use the output name to determine whether a layer's output needs to be collected for calibration. I will add this attribute in my quantization PR.

* Remove MKL code. * Integrate MKLDNN. Update MXNet for MKLDNN. Enable MKLDNN Relu. Fix a compilation error. Change Makefile for MKLDNN. Remove infer storage in convolution. Update MXNet for MKLDNN. Support MKLDNN storage type in python. Update activation. Add MKLDNN base classes. Implement MKLDNN fully connected. Add MKLDNN convolution. Update MKLDNN interface in NDArray. MKLDNN convolution handle CreateMKLDNNData failure. Add another GetMKLDNNData in NDArray. Have mkldnn to define the data format. Create output MKLDNN memory explicitly for FC. Fix a bug in NDArray. Fix a bug in GetWeightDesc. Convert data layout if necessary in FC. remove unnecessary print in MKLDNN convolution. Add MKLDNN deconvolution. Add MKLDNNStream to manage primitives and memories. Use MKLDNNStream to register memory in NDArray. Use MKLDNNStream to manage resources in operators. Handle kAddTo in MKLDNN operators. Fix a bug in deconvolution. Fix bugs in NDArray. Revert "Fix bugs in NDArray." This reverts commit f5624a4. Fix a bug in NDArray. Fix a bug in NDArray. Reorder MKLDNN memory to default format in SetTBlob. Disable MKLDNN correctly. Fix a bug in activation. Reshape of NDArray supports MKLDNN. Fix a memory ref bug in NDArray. Reshape NDArray in MKLDNN FullyConnected. Fix data format conversion. Create MKLDNN NDArray in python. Support Slice for MKLDNN NDArray. Reduce the overhead of summing the result to the output array. Avoid unnecessary memory copy in NDArray. Fix a bug in data reordering. Fix a bug in NDArray. Don't hard code MKLDNN type. Support dilation in MKLDNN convolution. Fix a bug in sum results. Rewrite GetMKLDNNData. Add prepare_mkldnn.sh Enable MKLDNN activation. Fix a bug on FullyConnected. Handle 3 dims for MKLDNN NDArray. Fix a bug in MKLDNN FC. Support MKLDNN storage in KV store. Fix a bug in executor for non-default NDArray. Fix a link error in cast_storage.cc. Remove unnecessary function def Fall back to def storage if the type isn't supported by MKLDNN. Use NDArray for MKLDNN in python. Reshape output of MKLDNN convolution. Fix a bug in NDArray. Support more operations in MKLDNN NDArray. Fix a bug in deconvolution. Fix bugs in MKLDNN deconvolution. We still need to compute bias correctly. Have elemwise binary ops to fall to default for MKLDNN. Limit the cases that MKLDNN operations are called. Force the layout of mkldnn::memory from NDArray. Add MKLDNN softmax. Fix output storage type of MKLDNN softmax. Add MKLDNN sum. Fix a bug in elemwise sum. Fix a bug in MKLDNN softmax. Fix a bug in imperative. Clean up dispatch modes. Remove redundant code. MKLDNN Pooling Op integration MKLDNN Pooling Op integration add missing file fix mkldnn pooling op workspace issue handle workspace in MKLDNN pooling correctly. Use a non-MKLDNN op for testing. Allow to share arguments and their gradients between executors. Avoid using MKLDNN pooling when it's not supported. Support MKLDNN properly. Choose MKLDNN softmax more carefully. Fix a bug in MKLDNN pooling. Fall back if MKLDNN pooling isn't supported. Fix a bug in Slice of NDArray. Use int32 for workspace memory. Exclude MKLDNN act with tanh. Have two Reshape functions in NDArray. Copy data for NDArray with diff shapes. Add MKLDNN copy. Add MKLDNN version of elemwise_add. Add MKLDNN version of Flatten. add mkldnn surport for concat simplify MKLDNN Flatten. Enalbe MKLDNN deconvolution with bias. Fix a bug in CuDNN deconvolution. avoid using MKLDNNStorage when it's not defined. Remove ./cudnn_lrn-inl.h Fix for make lint. add mkldnn surport for concat fix the coding style for pr of mkldnn concat Only add input data for MKLDNN concat backward Remove unnecessary TODO. remove unnecessary __repr__ in MKLNDArray. better condition check for readability. Use macro when including mkldnn.hpp. Revert "Use CoreOpRunner for refactored Ops." This reverts commit a28586f. Fix a bug in test core. Limit MKLDNN ops being used. Fix complains from "make pylint" Move ContainStorage to common/utils.h Limit MKLDNN concat being used. Add license. Fix amalgamation Fix compilation error in mkldnn_ops-inl.h Fix a bug in deconvolution. Fix a bug in pooling. MKLDNN ops allocates temp mem. Fix a bug in pooling. Allocate align memory from temp space. Have parameter gradients stored in the default storage. Handle all cases in CopyFrom. Ensure NDArray returns memory with right memory descriptors. use auto to define memory in the operator. Use raw pointer for mkldnn memory. Move more code to mkldnn_base.cc Fix a compilation error. Address review comments. fix a bug in activation backward. Miss a macro in mkldnn_base.cc Fix a bug in data iterator in examples. Avoid memory allocation in ReshapeMKLDNN. Avoid memory allocation in storage cast. Fix a bug in cast storage. Handle sliced MKLDNN NDArray. Use memcpy if NDArray uses default format. Revert "Limit MKLDNN ops being used." This reverts commit 75e2ae5. Enable mkldnn act backward has the same input layout. Fix a bug in mkldnn activation. Use MKLDNN sum in more cases. Improve perf of reorder. Avoid memory reorder in conv and deconv. Avoid unnecessary storage cast in fallback path. Revert "Use MKLDNN sum in more cases." This reverts commit 7a21ebc. Handle sliced ndarray in more cases. Fix a complain from make lint. Update Jenkins to test MKLDNN. debug compiling mkldnn. Use MKLDNN sum in more cases. Add mkldnn as a submodule. Compile with mkldnn in 3rdparty. Fix some coding styles. write the path to mkldnn lib in libmxnet.so. use rpath with $ORIGIN. Pack all lib files in Jenkins. pack and unpack mxnet with MKLDNN. Update Jenkinsfile Update Jenkinsfile Add mkldnn batch normalization Fix bugs in BN. Avoid memory allocation in MKLDNNCopy. only use MKLDNN BatchNorm for special cases. MKLDNN BatchNorm doesn't work well on the default layout. Add MKL-DNN based LRN Code Style Changes Fix a bug in BN. Fix a bug in LRN. Handle non-default storage in memory plan. Fix coding style. Fix a compilation error without mkldnn. Fix some coding styles for batch norm Improve forward of convolution. Add openmp and simd support to BN operator Retrieve MKLDNN Conv primitive based on signature. Retrieve Act primitive based on its signature. Fix a bug in pooling. Diable some MKLDNN activation and pooling. Cast MKLDNN storage with diff data type. Check if it's a view of NDArray. Reshaped and sliced arrays share the same chunks. Implement caching MKLDNN Act correctly. Fix a bug in check_consistency. Fix a potential bug when destroying NDArray. Fix bugs when allocating mem in NDArray. Fix coding style. Add micro when using mkldnn in ndarray. Fix a compilation error. Fix a bug in concat. Remove MKLDNNStorage. handle diff layouts in CopyFromToDnsImpl. Fallback correctly. Force weight grad to use default layout. Reorder weight arrays in (de)conv for faster inference. Avoid caching TBlob from NDArray. This commit may add some overhead of managing NDArray for each fallback. Fix a bug in Flatten. handle ndarray with def layout in mkldnn BN correctly. Align to page when mkldnn is enabled. Use default mem alloc for mkldnn. Reuse NDArrays. Support WriteInplace for sum. fix complains from "make lint". Avoid reallocation in NDArray. Handle weight arrays with special MKLDNN layouts. Remove unnecessary GetWeights. Fix compilation error without MKLDNN. Fix a bug in (de)conv for weight arrays. Fix a minor bug in MKLDNN conv. Fix a bug in MKLDNNOpSignature. Reimplement fallback for MKLDNN ops. Fix a bug in FallbackExecutor. Add params in hashcode. Invalidate data in outputs to accelerate. Fix a minor bug. Update mkldnn_base-inl.h Add primitive caching for Pooling forward computation Add hashcode in pooling parameters. Support NDArray copy with types unsupported by MKLDNN. Avoid using MKLDNN concat for negative dimension. Fix make lint complain. Disable mkldnn avg pooling for now. Fix a compile warning. Fix compile error when MKLDNN is disabled. OP primitive cache: use memory as signature for MKLDNN storage type Remove MKLDNN array in python. Disable Clang tests in Jenkins. Use mklml dockers to test mkldnn. Update MKLDNN repo to zhengda's mkldnn repo. Update MKLDNN repo to ashok's. Fix a bug in fallback. Change avg pooling algorithm to pooling_avg_include_padding Fix a code style in mkldnn pooling. Temp fix a bug in FC. Revert "Disable Clang tests in Jenkins." This reverts commit b4efa8f. Rebase and Refactor deconv (apache#20) * rebase to Da,Zheng refactor branch Jan.14, add signature for mkldnn Deconv and modify classMKLDNNDeconvForward * fix make lint complains A simple way of caching BN inference. cache BN forward for both training and inference. Fix some minor problems in BN. Fix a bug in caching BN. force to build with avx2 in Jenkins. Remove the remaining MKLDNNStorageType Some minor updates in NDArray. a lot of updates to address comments. minor changes. * Use NNVM interface. Use NNVM interface for upsampling. Use NNVM interface for convolution. Use NNVM interface for deconvolution. Use NNVM interface for FullyConnected. Move NNVM interface to batch norm. Use NNVM interface for depthwise convolution. Use NNVM interface for softmax activation. Use NNVM interface for pooling. use NNVM interface for dropout. Use NNVM interface for activation. Use NNVM interface for CuDNN batch norm. Use NNVM interface for CuDNN pooling. Use NNVM interface for CuDNN softmax activation. Use NNVM interface for CuDNN activation. Use NNVM interface for CuDNN convolution. Use NNVM interface for CuDNN deconvolution. Move concat to nn/ Use NNVM interface for concat. Fix headers in concat. Move lrn to nn/. Use NNVM interface for LRN. Fix a compilation error in convolution. Fix a compilation error in activation. Fix coding style. Fix coding style for make lint. use enums in batch norm. Use CoreOpRunner for refactored Ops. Make FullyConnected stateless. Make upsampling stateless. Make pooling stateless. Make batchnorm stateless. Make SoftmaxActivation stateless. Fix a code style problem. pass amalgamation test for batch norm. pass amalgamation test for dropout. Get convolution ops from a function. Fix compilation errors for GPU. Fix thread local in diff platforms. Avoid using thread_local for non-CuDNN conv/deconv. Remove TODO in deconv. Fix a bug in batch norm. Fix a bug in fully connected. Don't set #inputs for backward convolution. Revert "Make pooling stateless." * revert modification in test_executor. * Fix a bug in FlattenStorageType. * Remove BN debug. * Remove remaining MXNET_USE_MKL2017 * Remove unused code in pooling. * Fixing bugs in gtests. * Fix lint errors. * a lot of minor updates to address comments. * Fix coding style in MKLDNN Pooling (apache#22) * revert the code change in the previous code refactor. * Fix a bug in pooling. * LRN coding style changes (apache#21) * LRN coding style change * Add const for local variables * Add req for LRN forward * rebase code * align API interface * revert modification in test_executor. * cast storage with MKLDNN properly. * Minor updates to address comments. * some minor updates. * Switch to the master branch of MKLDNN. * Minor updates to address comments. * Update activation.cc * Fix a bug in convert NDArray. * Add gluon model zoo tests. * Update GPU tests on model zoo. * Avoid using mobilenet for GPU tests with gluon models. mobilenet can't pass the test even without MKLDNN. * Update GPU tests on gluon. * change cmake to compile MKLDNN. * update cmake for MKLDNN. * Implement align myself. * Switch to intel/mkl-dnn. * Fix errors in align unittest. * Add unit test for LRN. * fix a compilation error. * use storage_type_assign to determine storage type. * avoid global pooling in mkldnn. There is a bug in global pooling in mkldnn. * compare all MKLDNN ops with native impls. add MXNET_MKLDNN_DEBUG to control the test. * Fix a bug in testing correctness. * print the name of buggy operator. * undo some modifications. * Fix a bug on reshaped array. * avoid testing outputs with NullOp. * turn on MKLDNN tests in Jenkins. * print each operator in MKLDNN tests. * rename test_gluon_model_zoo.py * Create hashcode for operator parameters properly. * Add USE_MKL2017 back. * Print warning messages. * move batchnorm tests to nnvm interface. * Delete batchnorm v1 tests. * Get inputs and outputs in batchnorm tests. * disable batchnorm tests for now. * Fix GPU tests on gluon model zoo. * Fix lint complains in tests. * Remove simd from openmp instructions in BatchNorm (apache#24) * Remove warnings. * Fix MKLDNN 1st compile failure issue (apache#23) * Fix compilation errors. * Remove ARCH_OPT in Jenkins. * Revert "avoid global pooling in mkldnn." This reverts commit f6efd34. * Move to the latest MKLDNN. This fixes the bug in global pooling. * WIP unit tests (apache#25) * WIP unit tests * some backward items initialized * Make more C++ unit tests work for batch norm (apache#28) * WIP unit tests * some backward items initialized * some backward items initialized * some backward items initialized * first unit test working * Working on types * backward types working for fp16 on first unit test * backward types working for fp16 on first unit test * backward types working for fp16 on first unit test * . * . * some tests working * fix input data * hangle gpu<->cpu for setting values * gpu working * gpu working * CAccessAsCPU class * Fix varying type in AccessAsCPU * starting to add channel axis tests * TestChannelAxisSimple * TestChannelAxisSimple * run bidirectional * run bidirectional * run bidirectional * CLEANUP * CLEANUP * .. * noaxis * .. * lint * revert * revert * Fix lint complains. * Fix a minor problem in Makefile. * fix GPU pooling. * Disable modelzoo inference tests. * update accuracy checks for MKLDNN. * Fix MKLDNN pooling for global pooling. * Fix Jenkins. * Fix a bug in Jenkins. * Fix Jenkins

* Remove MKL code. * Integrate MKLDNN. Update MXNet for MKLDNN. Enable MKLDNN Relu. Fix a compilation error. Change Makefile for MKLDNN. Remove infer storage in convolution. Update MXNet for MKLDNN. Support MKLDNN storage type in python. Update activation. Add MKLDNN base classes. Implement MKLDNN fully connected. Add MKLDNN convolution. Update MKLDNN interface in NDArray. MKLDNN convolution handle CreateMKLDNNData failure. Add another GetMKLDNNData in NDArray. Have mkldnn to define the data format. Create output MKLDNN memory explicitly for FC. Fix a bug in NDArray. Fix a bug in GetWeightDesc. Convert data layout if necessary in FC. remove unnecessary print in MKLDNN convolution. Add MKLDNN deconvolution. Add MKLDNNStream to manage primitives and memories. Use MKLDNNStream to register memory in NDArray. Use MKLDNNStream to manage resources in operators. Handle kAddTo in MKLDNN operators. Fix a bug in deconvolution. Fix bugs in NDArray. Revert "Fix bugs in NDArray." This reverts commit f5624a4. Fix a bug in NDArray. Fix a bug in NDArray. Reorder MKLDNN memory to default format in SetTBlob. Disable MKLDNN correctly. Fix a bug in activation. Reshape of NDArray supports MKLDNN. Fix a memory ref bug in NDArray. Reshape NDArray in MKLDNN FullyConnected. Fix data format conversion. Create MKLDNN NDArray in python. Support Slice for MKLDNN NDArray. Reduce the overhead of summing the result to the output array. Avoid unnecessary memory copy in NDArray. Fix a bug in data reordering. Fix a bug in NDArray. Don't hard code MKLDNN type. Support dilation in MKLDNN convolution. Fix a bug in sum results. Rewrite GetMKLDNNData. Add prepare_mkldnn.sh Enable MKLDNN activation. Fix a bug on FullyConnected. Handle 3 dims for MKLDNN NDArray. Fix a bug in MKLDNN FC. Support MKLDNN storage in KV store. Fix a bug in executor for non-default NDArray. Fix a link error in cast_storage.cc. Remove unnecessary function def Fall back to def storage if the type isn't supported by MKLDNN. Use NDArray for MKLDNN in python. Reshape output of MKLDNN convolution. Fix a bug in NDArray. Support more operations in MKLDNN NDArray. Fix a bug in deconvolution. Fix bugs in MKLDNN deconvolution. We still need to compute bias correctly. Have elemwise binary ops to fall to default for MKLDNN. Limit the cases that MKLDNN operations are called. Force the layout of mkldnn::memory from NDArray. Add MKLDNN softmax. Fix output storage type of MKLDNN softmax. Add MKLDNN sum. Fix a bug in elemwise sum. Fix a bug in MKLDNN softmax. Fix a bug in imperative. Clean up dispatch modes. Remove redundant code. MKLDNN Pooling Op integration MKLDNN Pooling Op integration add missing file fix mkldnn pooling op workspace issue handle workspace in MKLDNN pooling correctly. Use a non-MKLDNN op for testing. Allow to share arguments and their gradients between executors. Avoid using MKLDNN pooling when it's not supported. Support MKLDNN properly. Choose MKLDNN softmax more carefully. Fix a bug in MKLDNN pooling. Fall back if MKLDNN pooling isn't supported. Fix a bug in Slice of NDArray. Use int32 for workspace memory. Exclude MKLDNN act with tanh. Have two Reshape functions in NDArray. Copy data for NDArray with diff shapes. Add MKLDNN copy. Add MKLDNN version of elemwise_add. Add MKLDNN version of Flatten. add mkldnn surport for concat simplify MKLDNN Flatten. Enalbe MKLDNN deconvolution with bias. Fix a bug in CuDNN deconvolution. avoid using MKLDNNStorage when it's not defined. Remove ./cudnn_lrn-inl.h Fix for make lint. add mkldnn surport for concat fix the coding style for pr of mkldnn concat Only add input data for MKLDNN concat backward Remove unnecessary TODO. remove unnecessary __repr__ in MKLNDArray. better condition check for readability. Use macro when including mkldnn.hpp. Revert "Use CoreOpRunner for refactored Ops." This reverts commit a28586f. Fix a bug in test core. Limit MKLDNN ops being used. Fix complains from "make pylint" Move ContainStorage to common/utils.h Limit MKLDNN concat being used. Add license. Fix amalgamation Fix compilation error in mkldnn_ops-inl.h Fix a bug in deconvolution. Fix a bug in pooling. MKLDNN ops allocates temp mem. Fix a bug in pooling. Allocate align memory from temp space. Have parameter gradients stored in the default storage. Handle all cases in CopyFrom. Ensure NDArray returns memory with right memory descriptors. use auto to define memory in the operator. Use raw pointer for mkldnn memory. Move more code to mkldnn_base.cc Fix a compilation error. Address review comments. fix a bug in activation backward. Miss a macro in mkldnn_base.cc Fix a bug in data iterator in examples. Avoid memory allocation in ReshapeMKLDNN. Avoid memory allocation in storage cast. Fix a bug in cast storage. Handle sliced MKLDNN NDArray. Use memcpy if NDArray uses default format. Revert "Limit MKLDNN ops being used." This reverts commit 75e2ae5. Enable mkldnn act backward has the same input layout. Fix a bug in mkldnn activation. Use MKLDNN sum in more cases. Improve perf of reorder. Avoid memory reorder in conv and deconv. Avoid unnecessary storage cast in fallback path. Revert "Use MKLDNN sum in more cases." This reverts commit 7a21ebc. Handle sliced ndarray in more cases. Fix a complain from make lint. Update Jenkins to test MKLDNN. debug compiling mkldnn. Use MKLDNN sum in more cases. Add mkldnn as a submodule. Compile with mkldnn in 3rdparty. Fix some coding styles. write the path to mkldnn lib in libmxnet.so. use rpath with $ORIGIN. Pack all lib files in Jenkins. pack and unpack mxnet with MKLDNN. Update Jenkinsfile Update Jenkinsfile Add mkldnn batch normalization Fix bugs in BN. Avoid memory allocation in MKLDNNCopy. only use MKLDNN BatchNorm for special cases. MKLDNN BatchNorm doesn't work well on the default layout. Add MKL-DNN based LRN Code Style Changes Fix a bug in BN. Fix a bug in LRN. Handle non-default storage in memory plan. Fix coding style. Fix a compilation error without mkldnn. Fix some coding styles for batch norm Improve forward of convolution. Add openmp and simd support to BN operator Retrieve MKLDNN Conv primitive based on signature. Retrieve Act primitive based on its signature. Fix a bug in pooling. Diable some MKLDNN activation and pooling. Cast MKLDNN storage with diff data type. Check if it's a view of NDArray. Reshaped and sliced arrays share the same chunks. Implement caching MKLDNN Act correctly. Fix a bug in check_consistency. Fix a potential bug when destroying NDArray. Fix bugs when allocating mem in NDArray. Fix coding style. Add micro when using mkldnn in ndarray. Fix a compilation error. Fix a bug in concat. Remove MKLDNNStorage. handle diff layouts in CopyFromToDnsImpl. Fallback correctly. Force weight grad to use default layout. Reorder weight arrays in (de)conv for faster inference. Avoid caching TBlob from NDArray. This commit may add some overhead of managing NDArray for each fallback. Fix a bug in Flatten. handle ndarray with def layout in mkldnn BN correctly. Align to page when mkldnn is enabled. Use default mem alloc for mkldnn. Reuse NDArrays. Support WriteInplace for sum. fix complains from "make lint". Avoid reallocation in NDArray. Handle weight arrays with special MKLDNN layouts. Remove unnecessary GetWeights. Fix compilation error without MKLDNN. Fix a bug in (de)conv for weight arrays. Fix a minor bug in MKLDNN conv. Fix a bug in MKLDNNOpSignature. Reimplement fallback for MKLDNN ops. Fix a bug in FallbackExecutor. Add params in hashcode. Invalidate data in outputs to accelerate. Fix a minor bug. Update mkldnn_base-inl.h Add primitive caching for Pooling forward computation Add hashcode in pooling parameters. Support NDArray copy with types unsupported by MKLDNN. Avoid using MKLDNN concat for negative dimension. Fix make lint complain. Disable mkldnn avg pooling for now. Fix a compile warning. Fix compile error when MKLDNN is disabled. OP primitive cache: use memory as signature for MKLDNN storage type Remove MKLDNN array in python. Disable Clang tests in Jenkins. Use mklml dockers to test mkldnn. Update MKLDNN repo to zhengda's mkldnn repo. Update MKLDNN repo to ashok's. Fix a bug in fallback. Change avg pooling algorithm to pooling_avg_include_padding Fix a code style in mkldnn pooling. Temp fix a bug in FC. Revert "Disable Clang tests in Jenkins." This reverts commit b4efa8f. Rebase and Refactor deconv (#20) * rebase to Da,Zheng refactor branch Jan.14, add signature for mkldnn Deconv and modify classMKLDNNDeconvForward * fix make lint complains A simple way of caching BN inference. cache BN forward for both training and inference. Fix some minor problems in BN. Fix a bug in caching BN. force to build with avx2 in Jenkins. Remove the remaining MKLDNNStorageType Some minor updates in NDArray. a lot of updates to address comments. minor changes. * Use NNVM interface. Use NNVM interface for upsampling. Use NNVM interface for convolution. Use NNVM interface for deconvolution. Use NNVM interface for FullyConnected. Move NNVM interface to batch norm. Use NNVM interface for depthwise convolution. Use NNVM interface for softmax activation. Use NNVM interface for pooling. use NNVM interface for dropout. Use NNVM interface for activation. Use NNVM interface for CuDNN batch norm. Use NNVM interface for CuDNN pooling. Use NNVM interface for CuDNN softmax activation. Use NNVM interface for CuDNN activation. Use NNVM interface for CuDNN convolution. Use NNVM interface for CuDNN deconvolution. Move concat to nn/ Use NNVM interface for concat. Fix headers in concat. Move lrn to nn/. Use NNVM interface for LRN. Fix a compilation error in convolution. Fix a compilation error in activation. Fix coding style. Fix coding style for make lint. use enums in batch norm. Use CoreOpRunner for refactored Ops. Make FullyConnected stateless. Make upsampling stateless. Make pooling stateless. Make batchnorm stateless. Make SoftmaxActivation stateless. Fix a code style problem. pass amalgamation test for batch norm. pass amalgamation test for dropout. Get convolution ops from a function. Fix compilation errors for GPU. Fix thread local in diff platforms. Avoid using thread_local for non-CuDNN conv/deconv. Remove TODO in deconv. Fix a bug in batch norm. Fix a bug in fully connected. Don't set #inputs for backward convolution. Revert "Make pooling stateless." * revert modification in test_executor. * Fix a bug in FlattenStorageType. * Remove BN debug. * Remove remaining MXNET_USE_MKL2017 * Remove unused code in pooling. * Fixing bugs in gtests. * Fix lint errors. * a lot of minor updates to address comments. * Fix coding style in MKLDNN Pooling (#22) * revert the code change in the previous code refactor. * Fix a bug in pooling. * LRN coding style changes (#21) * LRN coding style change * Add const for local variables * Add req for LRN forward * rebase code * align API interface * revert modification in test_executor. * cast storage with MKLDNN properly. * Minor updates to address comments. * some minor updates. * Switch to the master branch of MKLDNN. * Minor updates to address comments. * Update activation.cc * Fix a bug in convert NDArray. * Add gluon model zoo tests. * Update GPU tests on model zoo. * Avoid using mobilenet for GPU tests with gluon models. mobilenet can't pass the test even without MKLDNN. * Update GPU tests on gluon. * change cmake to compile MKLDNN. * update cmake for MKLDNN. * Implement align myself. * Switch to intel/mkl-dnn. * Fix errors in align unittest. * Add unit test for LRN. * fix a compilation error. * use storage_type_assign to determine storage type. * avoid global pooling in mkldnn. There is a bug in global pooling in mkldnn. * compare all MKLDNN ops with native impls. add MXNET_MKLDNN_DEBUG to control the test. * Fix a bug in testing correctness. * print the name of buggy operator. * undo some modifications. * Fix a bug on reshaped array. * avoid testing outputs with NullOp. * turn on MKLDNN tests in Jenkins. * print each operator in MKLDNN tests. * rename test_gluon_model_zoo.py * Create hashcode for operator parameters properly. * Add USE_MKL2017 back. * Print warning messages. * move batchnorm tests to nnvm interface. * Delete batchnorm v1 tests. * Get inputs and outputs in batchnorm tests. * disable batchnorm tests for now. * Fix GPU tests on gluon model zoo. * Fix lint complains in tests. * Remove simd from openmp instructions in BatchNorm (#24) * Remove warnings. * Fix MKLDNN 1st compile failure issue (#23) * Fix compilation errors. * Remove ARCH_OPT in Jenkins. * Revert "avoid global pooling in mkldnn." This reverts commit f6efd34. * Move to the latest MKLDNN. This fixes the bug in global pooling. * WIP unit tests (#25) * WIP unit tests * some backward items initialized * Make more C++ unit tests work for batch norm (#28) * WIP unit tests * some backward items initialized * some backward items initialized * some backward items initialized * first unit test working * Working on types * backward types working for fp16 on first unit test * backward types working for fp16 on first unit test * backward types working for fp16 on first unit test * . * . * some tests working * fix input data * hangle gpu<->cpu for setting values * gpu working * gpu working * CAccessAsCPU class * Fix varying type in AccessAsCPU * starting to add channel axis tests * TestChannelAxisSimple * TestChannelAxisSimple * run bidirectional * run bidirectional * run bidirectional * CLEANUP * CLEANUP * .. * noaxis * .. * lint * revert * revert * Fix lint complains. * Fix a minor problem in Makefile. * fix GPU pooling. * Disable modelzoo inference tests. * update accuracy checks for MKLDNN. * Fix MKLDNN pooling for global pooling. * Fix Jenkins. * Fix a bug in Jenkins. * Fix Jenkins

zheng-da requested review from cjolivier01, marcoabreu and szha as code owners February 2, 2018 00:59

marcoabreu reviewed Feb 2, 2018

View reviewed changes

marcoabreu suggested changes Feb 2, 2018

View reviewed changes

pengzhao-intel mentioned this pull request Feb 3, 2018

[call for contribution] Improving CPU performance #2986

Open

zheng-da force-pushed the mkldnn branch from 034e9d5 to 03d0283 Compare February 6, 2018 01:09

zheng-da force-pushed the mkldnn branch from be82804 to 94ccd1c Compare February 13, 2018 22:49

zheng-da and others added 22 commits February 13, 2018 23:03

Remove MKL code.

9e071f8

revert modification in test_executor.

6ca9354

Fix a bug in FlattenStorageType.

56e8c2c

Remove BN debug.

c3d1683

Remove remaining MXNET_USE_MKL2017

27caf85

Remove unused code in pooling.

585d837

Fixing bugs in gtests.

6a54f28

Fix lint errors.

ffdd860

a lot of minor updates to address comments.

3ca549d

Fix coding style in MKLDNN Pooling (#22)

8abbcc1

revert the code change in the previous code refactor.

9203748

Fix a bug in pooling.

407a972

LRN coding style changes (#21)

d4e7c58

* LRN coding style change * Add const for local variables * Add req for LRN forward * rebase code * align API interface * revert modification in test_executor.

cast storage with MKLDNN properly.

856aec9

Minor updates to address comments.

b8deaab

some minor updates.

bc1d164

Switch to the master branch of MKLDNN.

b77ae3e

Minor updates to address comments.

1766dc2

Update activation.cc

d0ae271

Fix a bug in convert NDArray.

ff78b02

larroy reviewed Feb 15, 2018

View reviewed changes

piiswrong merged commit c3e3a83 into apache:master Feb 15, 2018

pengzhao-intel mentioned this pull request Feb 19, 2018

Add MKL-DNN based deep learning framework oneapi-src/oneDNN#192

Merged

iblislin mentioned this pull request Mar 7, 2018

Compiler bus error during build dmlc/MXNet.jl#418

Closed

reminisce reviewed Mar 22, 2018

View reviewed changes

ThomasDelteil mentioned this pull request Mar 27, 2018

plot_network issue with LRN #10282

Closed

chinakook mentioned this pull request Mar 30, 2018

RCNN example fails for using latest mxnet #9823

Closed

reminisce mentioned this pull request Apr 4, 2018

[MXNET-266] Fix cudnn_conv and cudnn_deconv deadlock #10392

Merged

7 tasks

fullfanta mentioned this pull request Apr 30, 2018

NNPACK can't be compiled at 1.1.0 1.2.0 and master branch #10579

Open

wkcn mentioned this pull request May 12, 2018

cudnn_softmax_activation error in MXNet 1.2.0 #10914

Closed

eric-haibin-lin mentioned this pull request Jun 8, 2018

Enable CUDNN for conv1D #11194

Merged

7 tasks

iblislin mentioned this pull request Aug 8, 2018

import Julia binding #10149

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor operators and add MKLDNN #9677

Refactor operators and add MKLDNN #9677

zheng-da commented Feb 2, 2018

marcoabreu Feb 2, 2018

marcoabreu Feb 2, 2018

marcoabreu Feb 2, 2018

zheng-da Feb 5, 2018

pengzhao-intel Feb 6, 2018

cjolivier01 Feb 6, 2018

marcoabreu Feb 15, 2018

marcoabreu Feb 2, 2018

larroy Feb 15, 2018

marcoabreu commented Feb 15, 2018 •

edited

Loading

piiswrong commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018 •

edited

Loading

marcoabreu commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018

marcoabreu commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018 •

edited

Loading

marcoabreu commented Feb 15, 2018 •

edited

Loading

iblislin commented Feb 24, 2018

iblislin commented Feb 24, 2018

zheng-da commented Feb 24, 2018

marcoabreu commented Feb 24, 2018 via email

iblislin commented Feb 25, 2018 •

edited

Loading

marcoabreu commented Feb 26, 2018

reminisce Mar 22, 2018

zheng-da Mar 23, 2018

Refactor operators and add MKLDNN #9677

Refactor operators and add MKLDNN #9677

Conversation

zheng-da commented Feb 2, 2018

Description

Checklist

Essentials

Changes

Comments

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marcoabreu commented Feb 15, 2018 • edited Loading

piiswrong commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018 • edited Loading

marcoabreu commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018

marcoabreu commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018

cjolivier01 commented Feb 15, 2018 • edited Loading

marcoabreu commented Feb 15, 2018 • edited Loading

iblislin commented Feb 24, 2018

iblislin commented Feb 24, 2018

zheng-da commented Feb 24, 2018

marcoabreu commented Feb 24, 2018 via email

iblislin commented Feb 25, 2018 • edited Loading

marcoabreu commented Feb 26, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marcoabreu commented Feb 15, 2018 •

edited

Loading

cjolivier01 commented Feb 15, 2018 •

edited

Loading

cjolivier01 commented Feb 15, 2018 •

edited

Loading

marcoabreu commented Feb 15, 2018 •

edited

Loading

iblislin commented Feb 25, 2018 •

edited

Loading