modify class MKLDNNActForward to new structure #9

rongzha1 · 2018-01-10T05:54:03Z

Description

Change class MKLDNNActForward to the new structure.

Checklist

Essentials

[done] Passed code style checking (make lint)
[done] Changes are complete (i.e. I finished coding on this PR)
[done] All changes have test coverage
[done] For user-facing API changes, API doc string has been updated. For new C++ functions in header files, their functionalities and arguments are well-documented.
[done] To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

[done] Feature1, tests, (and when applicable, API doc)
modify class MKLDNNActForward to new structure

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

TaoLv · 2018-01-10T06:28:29Z

src/operator/nn/mkldnn/mkldnn_act.cc

+    auto out_mem = const_cast<NDArray &>(out_data).CreateMKLDNNData(
+        fwd_pd.dst_primitive_desc());
+
+    if (this->data == nullptr) {


I think it would be better to do the initialization and first creation in the constructor of the MKLDNNActForward class, so there is no need to check for nullptr on every iteration.
Besides, please follow the coding style and naming conventions we discussed in the previous email.

OK, I will move it to the constructor.
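
A minimal sketch of the suggestion in this thread, not the code from the PR. It assumes hypothetical stand-in types (`FakeMemory`, `ActForwardCtorInit`) instead of the real MKLDNN classes, and hard-wires ReLU as the activation. The point is only that members created once in the constructor need no nullptr test in the per-iteration path.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for mkldnn::memory: a non-owning view of a buffer.
struct FakeMemory {
  float *handle;
  std::size_t size;
};

// Hypothetical stand-in for MKLDNNActForward, hard-wired to ReLU.
class ActForwardCtorInit {
 public:
  // Both memory objects are created exactly once, here.
  // The vectors must outlive this object because only views are stored.
  ActForwardCtorInit(std::vector<float> &in, std::vector<float> &out)
      : data_(new FakeMemory{in.data(), in.size()}),
        out_(new FakeMemory{out.data(), out.size()}) {
    assert(in.size() == out.size());
  }

  // Hot path: no nullptr checks, because the constructor guarantees that
  // data_ and out_ are already set.
  void Execute() const {
    for (std::size_t i = 0; i < data_->size; ++i) {
      const float x = data_->handle[i];
      out_->handle[i] = x > 0.0f ? x : 0.0f;
    }
  }

 private:
  std::unique_ptr<FakeMemory> data_;
  std::unique_ptr<FakeMemory> out_;
};
```

With lazy creation inside `Execute()`, every call would pay for a branch that can only be taken once; moving the creation into the constructor removes that branch.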

TaoLv · 2018-01-10T06:30:31Z

src/operator/nn/mkldnn/mkldnn_act.cc

@@ -107,36 +107,46 @@ class MKLDNNActForward {
                   const NDArray &data, const mkldnn::memory &mem): fwd_pd(


There is no need to pass both data and mem. You can get mem from data easily.

OK, I will keep only data.
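
A rough sketch of the single-argument constructor discussed in this thread, under stated assumptions: `FakeNDArray`, `FakeMemory`, `GetMemory()` and `ActForwardDataOnly` are hypothetical names used for illustration and do not reproduce the MXNet or MKLDNN APIs. It shows the constructor deriving the memory view from the array it is given, so the caller cannot pass a mismatched pair.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for mkldnn::memory: a read-only view of a buffer.
struct FakeMemory {
  const float *handle;
  std::size_t size;
};

// Hypothetical stand-in for NDArray: owns the values and can hand out a view.
class FakeNDArray {
 public:
  explicit FakeNDArray(std::vector<float> values)
      : values_(std::move(values)) {}

  // Assumed accessor; stands in for whatever the real array type provides.
  FakeMemory GetMemory() const {
    return FakeMemory{values_.data(), values_.size()};
  }

 private:
  std::vector<float> values_;
};

// Takes only the array; the memory view is derived inside the constructor.
// The array must outlive this object because only a view is stored.
class ActForwardDataOnly {
 public:
  explicit ActForwardDataOnly(const FakeNDArray &data)
      : mem_(data.GetMemory()) {}

  const FakeMemory &input() const { return mem_; }

 private:
  FakeMemory mem_;
};
```

Dropping the second parameter shortens the signature and removes one way for the two arguments to disagree.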

TaoLv · 2018-01-10T06:36:15Z

src/operator/nn/mkldnn/mkldnn_act.cc


-    CHECK(fwd_pd.dst_primitive_desc() == output.get_primitive_desc());
-    if (this->out == nullptr)
+    CHECK(fwd_pd.dst_primitive_desc() == out_mem->get_primitive_desc());


Why check this? out_mem was just created at L116 with dst_primitive_desc. Do you think mkldnn will fail to do that?

OK, will remove it.

rongzha1

Will modify.

rongzha1 · 2018-01-11T01:53:20Z

src/operator/nn/mkldnn/mkldnn_act.cc

@@ -107,36 +107,46 @@ class MKLDNNActForward {
                   const NDArray &data, const mkldnn::memory &mem): fwd_pd(


OK, I will keep only data.

rongzha1 · 2018-01-11T01:56:00Z

src/operator/nn/mkldnn/mkldnn_act.cc

+    auto out_mem = const_cast<NDArray &>(out_data).CreateMKLDNNData(
+        fwd_pd.dst_primitive_desc());
+
+    if (this->data == nullptr) {


OK, I will move it to the constructor.

rongzha1 · 2018-01-11T01:56:24Z

src/operator/nn/mkldnn/mkldnn_act.cc


-    CHECK(fwd_pd.dst_primitive_desc() == output.get_primitive_desc());
-    if (this->out == nullptr)
+    CHECK(fwd_pd.dst_primitive_desc() == out_mem->get_primitive_desc());


OK, will remove it.

* [Quantization] 8bit Quantization and GPU Support [Quantization] CuDNN 8bit quantized relu v0.1 [Quantization] CuDNN 8bit quantized max_pool v0.1 [Quantization] CuDNN 8bit quantized lrn v0.1 [Quantization] CuDNN 8bit quantized convolution v0.1 [Quantization] CuDNN 8bit quantized fully connected v0.1 [Quantization] Small fix [Quantization] Implement backward method [Quantization] Convolution backward method [Quantization] Add range for matmul and conv [Quantization] New types in ndarray.py [Quantization] 8bit conv works [Quantization] conv support multiple type [Quantization] matmul works now [Quantization] matmul works well [Quantization] efactor quantization operators [Quantization] Op: quantize_down_and_shrink_range [Quantization] Complete quantize_graph_pass [Quantization] Add example [Quantization] Take zero-center quantize, accuracy fixed [Quantization] Multiple layers MLP pass [Quantization] Make quantized_conv same as Convolution [Quantization] quantized_conv works [Quantization] Fix bug [Quantization] lenet works now [Quantization] Add quantized_flatten [Quantization] Quantized max pool works well [Quantization] Make quantized_conv support NHWC [Quantization] add max_pool [Quantization] add ignore_symbols [Quantization] Save change [Quantization] Reorganize tests, 8 layers resnet works on cifar [Quantization] Support for 'NHWC' max pool [Quantization] Support for 'NHWC' quantized max pool [Quantization] Fix speed of quantize_down_and_shrink_range [Quantization] script for resnet on imagenet [Quantization] refactor for quantize offline [Quantization] Fix infershape [Quantization] Update test [Quantization] Update example [Quantization] Fix build error * [Quantization] Add calibration flow and refactor code Rebase with dmlc/master Add quantize_down_and_shrink by threshold Don't assign resource when threshold is available for quantize_down_and_shrink Fix quantize_down_and_shrink saturation Implement pass for setting calib table to node attrs Rebase with 
upstream master Change threshold to min/max quantized params Add c-api for setting calib table to graph Add calibration front end function Bug fixes and add unit test Add data iter type to calibration Fix bug in calibrate_quantized_model Bug fix and add example Add the second calibration approach and benchmark Fix Fix infer error and add benchmark for conv Add benchmark script Change output names and argument names Remove commented out code Change name Add layout to benchmark_convolution Remove redundant comment Remove common and add soft link More fix and benchmark Add scripts to plot images Minor fix More fix More fix and util tools Tools and support bias in quantized_conv2d Add script for getting the optimal thresholds using kl divergence Add kl divergence for optimizing thresholds Add benchmark scripts Fix compile after rebasing on master Allocate temp space only once for quantized_conv2d Change quantize_down_and_shrink_range to allocate temp space once No temp space for calib model Refactor quantize_down_and_shrink_range into requantize Refactor quantized convolution using nnvm interfaces Fix quantized_conv bug Use ConvolutionParam for QuantizedCuDNNConvOp Refactor quantized fc using nnvm interfaces Change TQuantizationNeedShrink to FNeedRequantize Refactor quantized_pooling Simplify FQuantizedOp interface Better naming Fix shape and type inference for quantized_flatten Clean up quantization frontend APIs and examples Delete quantized lrn and relu Add python script for generating quantized models Add script for running inference Add inference example Remove redundant files from example/quantization Simplify user-level python APIs Add logger Improve user-level python api Fix coding style Add unit test for quantized_conv Fix bugs in quantized_fully_connected and add unit test Add unit test for requantize Fix a bug and add python api unit tests Import test_quantization in test_operator_gpu.py Rebase with master Remove redundant files Fix test case for python3 and 
fix doc Fix unit tests Fix unit tests for python3 Release used ndarrays in calibration for saving memory usage Simplify releasing memory of used ndarrays for calibration Fix a bug Revert "Fix a bug" This reverts commit f7853f2. Revert "Simplify releasing memory of used ndarrays for calibration" This reverts commit 70b9e38. Clean up benchmark script and improve example Add API and example documentation and fix bugs Remove redundant test file and improve error message Merge quantize and dequantize with master impl Remove commented code Hide monitor interface from users Remove interface from Module Add license header Move quantization unittests to a separate folder so that it can be only run on P3 instances Remove quantization unittests from test_operator_gpu.py Move quantization to contrib Fix lint Add mxnetlinux-gpu-p3 to jenkins Fix jenkins Fix CI build Fix CI Update jenkins file Use cudnn7 for ci Add docker file for quantization unit test only Correctly skip build with cudnn < 6 Add doc for quantize symbol api Fix lint Fix python3 and add doc Try to fix cudnn build problem * Fix compile error * Fix CI * Remove tests that should not run on P3 * Remove unnecessary docker file * Fix registering quantized nn ops * Reformat Jenkinsfile and switch quantization to CUDA 9 (#9) * Address interface change cr * Address comments and fix bugs * Make unit test stable * Improve unit test * Address cr * Address cr * Fix flaky unit test layer_norm * Fix doc

zheng-da added 30 commits December 17, 2017 17:46

Use CoreOpRunner for refactored Ops.

bbb0dba

Make FullyConnected stateless.

1e898a3

Make upsampling stateless.

9854b4b

Make pooling stateless.

5bb99c8

Make dropout stateless.

046eb81

Make batchnorm stateless.

f4c6f1c

Make SoftmaxActivation stateless.

30b5fd9

Fix a code style problem.

95ef90e

pass amalgamation test for batch norm.

921859a

pass amalgamation test for dropout.

485f58f

Get convolution ops from a function.

660968f

Fix compilation errors for GPU.

26e9430

Fix thread local in diff platforms.

5504e2c

Avoid using thread_local for non-CuDNN conv/deconv.

6324176

Remove TODO in deconv.

36c466f

Fix a compilation error in dropout.

6410684

Fix a bug in batch norm.

1fa3898

Fix a bug in fully connected.

588383a

Don't set #inputs for backward convolution.

66a281a

Remove MKL code.

d3ce902

Update MXNet for MKLDNN.

caa3bf3

Enable MKLDNN Relu.

db10bb1

Fix a compilation error.

99c1e08

Change Makefile for MKLDNN.

a6c2c82

Remove infer storage in convolution.

3f75f52

Update MXNet for MKLDNN.

edf6842

Support MKLDNN storage type in python.

c96ca26

Update activation.

1a6e06e

Add MKLDNN base classes.

ca30cac

Implement MKLDNN fully connected.

79c563c

zheng-da and others added 14 commits January 2, 2018 03:28

fix complains from "make lint".

4eeffc9

Avoid reallocation in NDArray.

f4b73db

Merge branch 'refactor' of https://github.com/zheng-da/incubator-mxnet …

c8578c0

…into refactor

Handle weight arrays with special MKLDNN layouts.

ac8f9fd

Remove unnecessary GetWeights.

24200a0

Fix compilation error without MKLDNN.

19d8749

Fix a bug in (de)conv for weight arrays.

18236fc

Fix a minor bug in MKLDNN conv.

1cd8bad

Avoid caching TBlob from NDArray.

9b3c8b2

This commit may add some overhead of managing NDArray for each fallback.

Fix a bug in MKLDNNOpSignature.

c426bfa

Merge remote-tracking branch 'da/refactor' into bn-primitive

623b994

Conflicts: src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h

1. Fix coding style in BatchNorm;

5825191

2. Add memory into signature; 3. Try to split BatchNorm into .h file and .cc file. Will finish it after backward code is refactored.

Merge pull request #5 from TaoLv/bn-primitive

87fd9d5

Caching primitive for BatchNorm forward computation

modify class MKLDNNActForward to new structure

9a6b349

TaoLv reviewed Jan 10, 2018

modify code according to PR commnets by Tao

9e65802

rongzha1 commented Jan 11, 2018

zheng-da force-pushed the refactor branch 8 times, most recently from ba474be to 5cb7ca0 (January 17, 2018 18:50)

zheng-da force-pushed the refactor branch from c5b06e8 to 8ba9736 (January 19, 2018 02:02)

zheng-da force-pushed the refactor branch from 9c1745d to 9719b07 (February 2, 2018 00:01)

This pull request was closed.
