[NPU] apply npu_identity to conv bn and copy2cpu, test=develop #48039

qili93 · 2022-11-16T08:06:38Z

PR types

Others

PR changes

Others

Describe

When a Tensor is in NPU storage format, the Identity op must be used to transform it back to the original format before it is copied to the CPU.
The Conv and BatchNorm ops need their inputs prepared in NPU storage format to improve performance on Ascend910.

paddle-bot · 2022-11-16T08:06:42Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

qili93 · 2022-11-25T01:52:49Z

paddle/fluid/pybind/eager_method.cc

+        temp_tensor = npu_identity_ad_func(self->tensor, -1);
+        dense_tensor =
+            std::dynamic_pointer_cast<phi::DenseTensor>(temp_tensor.impl());
+      }


This handles Eager mode: when tensor_method_numpy copies a tensor to the CPU, npu_identity_ad_func is called first so that the output is in the normal NCHW format rather than the NPU-specific ACL_FORMAT_NC1HWC0 format.
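To illustrate why the conversion has to happen before the copy, here is a minimal NumPy sketch of the two layouts. It is not the Ascend implementation: the block size `C0 = 16` is an assumption (the real value depends on the dtype), and the helper names `nchw_to_nc1hwc0` and `nc1hwc0_to_nchw` are made up for this example.

```python
import numpy as np

C0 = 16  # assumed channel block size; the real value depends on the dtype


def nchw_to_nc1hwc0(x, c0=C0):
    """Repack an NCHW array into a 5-D NC1HWC0 array, zero-padding C."""
    n, c, h, w = x.shape
    c1 = -(-c // c0)  # ceil(c / c0)
    padded = np.zeros((n, c1 * c0, h, w), dtype=x.dtype)
    padded[:, :c] = x
    # (N, C1, C0, H, W) -> (N, C1, H, W, C0)
    return padded.reshape(n, c1, c0, h, w).transpose(0, 1, 3, 4, 2)


def nc1hwc0_to_nchw(y, c):
    """Inverse of nchw_to_nc1hwc0; `c` is the original channel count."""
    n, c1, h, w, c0 = y.shape
    # (N, C1, H, W, C0) -> (N, C1, C0, H, W) -> (N, C1 * C0, H, W)
    full = y.transpose(0, 1, 4, 2, 3).reshape(n, c1 * c0, h, w)
    return full[:, :c]


x = np.arange(2 * 20 * 3 * 3, dtype=np.float32).reshape(2, 20, 3, 3)
stored = nchw_to_nc1hwc0(x)               # what the device would hold
restored = nc1hwc0_to_nchw(stored, c=20)  # what the host expects to read

print(stored.shape)                 # (2, 2, 3, 3, 16)
print(np.array_equal(restored, x))  # True
```

The stored buffer has a different shape and element order from the NCHW data, so copying it to the host without converting it back would hand NumPy values in the wrong positions.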

qili93 · 2022-11-25T01:54:12Z

paddle/fluid/pybind/tensor_py.h

+          std::dynamic_pointer_cast<phi::DenseTensor>(tensor_out.impl());
+      tensor_buf_ptr = dense_tensor->data();
+    }
+


When a tensor's value is read through the tensor.numpy() interface, TensorToPyArray calls npu_identity_ad_func to convert it to the normal output format.

qili93 · 2022-11-25T01:55:00Z

paddle/phi/kernels/npu_identity_kernel.cc

-          << out->dims() << ", please avoid using this kernel!";
-  *out = phi::EmptyLike<T, Context>(dev_ctx, *out);
+  VLOG(4) << "npu_identity op is only for NPU, please avoid using this kernel!";
+  out->ShareDataWith(x);


The regular CPU/GPU kernel now shares data with the input, so the output values are guaranteed to match the input values.
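As a rough analogy only (NumPy, not the phi kernel), the difference between the removed and the added line can be sketched like this. The function names are hypothetical.

```python
import numpy as np


def identity_empty_like(x):
    # Analogue of the removed behaviour: allocate a fresh buffer with the
    # same shape and dtype; the contents are NOT copied from x.
    return np.empty_like(x)


def identity_share_data(x):
    # Analogue of the new behaviour: return a view that aliases x's buffer.
    return x.view()


x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
out = identity_share_data(x)

print(np.shares_memory(out, x))  # True
print(np.array_equal(out, x))    # True
```

With the aliasing version the output can never disagree with the input, which is what an identity op on CPU/GPU should guarantee.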

qili93 · 2022-11-25T01:56:58Z

python/paddle/fluid/dygraph/varbase_patch_methods.py

+            new_ivar = self._grad_ivar()
+            if 'npu' in get_all_custom_device_type():
+                new_ivar = paddle.incubate._npu_identity(x=new_ivar, format=-1)
+            new_ivar = new_ivar._copy_to(core.CPUPlace(), True)


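This change is for the op test unit-test files. They call a = inputs_grad_dict[inputs_to_check_name].gradient() to fetch the backward output, so _npu_identity has to be called here to get the data in the normal format before it is copied to the CPU.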

qili93 · 2022-11-25T01:57:20Z

python/paddle/fluid/tests/unittests/test_npu_identity_op.py


-        np.testing.assert_allclose(out.shape, self.shape, rtol=1e-08)
+        np.testing.assert_allclose(out.numpy(), self.x, rtol=1e-08)


Update the unit test to check that the output of npu_identity matches its input.
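A small NumPy sketch of why the value comparison is the stronger check. `broken_identity` is a hypothetical faulty op invented for this example; it is not part of the test file.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)


def broken_identity(a):
    # Hypothetical faulty op: right shape, wrong values.
    return np.zeros_like(a)


out = broken_identity(x)

# Old-style check: compares shapes only, so the faulty op passes.
np.testing.assert_allclose(out.shape, x.shape, rtol=1e-08)

# New-style check: compares values, so the faulty op is caught.
try:
    np.testing.assert_allclose(out, x, rtol=1e-08)
    caught = False
except AssertionError:
    caught = True

print(caught)  # True
```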

qili93 · 2022-11-25T01:57:45Z

python/paddle/incubate/tensor/manipulation.py

@@ -52,7 +52,7 @@ def _npu_identity(x, format=-1):
        return _C_ops.npu_identity(x, format)

    if _in_legacy_dygraph():
-        return _legacy_C_ops.npu_identity(x, format)
+        return _legacy_C_ops.npu_identity(x, 'format', format)


Fix the op call under _in_legacy_dygraph.
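The fix passes the attribute as a name/value pair instead of a bare value. The sketch below shows that calling convention with a hypothetical stand-in: `parse_legacy_attrs` and `legacy_npu_identity` are invented for illustration and are not Paddle's dispatcher.

```python
def parse_legacy_attrs(*attrs):
    """Turn ('name1', value1, 'name2', value2, ...) into a dict."""
    if len(attrs) % 2 != 0:
        raise ValueError("attributes must be passed as name/value pairs")
    names, values = attrs[0::2], attrs[1::2]
    if not all(isinstance(name, str) for name in names):
        raise TypeError("attribute names must be strings")
    return dict(zip(names, values))


def legacy_npu_identity(x, *attrs):
    # Hypothetical stand-in for a legacy-style op entry point.
    parsed = parse_legacy_attrs(*attrs)
    return x, parsed.get("format", -1)


print(legacy_npu_identity([1, 2], "format", 3))  # ([1, 2], 3)

try:
    legacy_npu_identity([1, 2], 3)  # old call shape: a bare value
except ValueError as err:
    print(err)
```

Under this convention a bare `format` value has no name attached to it, so the parser cannot tell which attribute it belongs to.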

qili93 · 2022-11-25T01:58:46Z

python/paddle/nn/functional/conv.py

+                    bias_storage = _C_ops.npu_identity(
+                        bias, 3
+                    )  # ACL_FORMAT_NC1HWC0 = 3
+                    bias_storage._share_underline_tensor_to(bias)


The NPU Conv op produces its output in ACL_FORMAT_NC1HWC0 format, so on NPU the bias also has to be converted to the same format before the Add is computed.

Hardware-related code must NOT exist in the API. If it exists temporarily, it needs to be marked with a TODO and cleaned up later.

qili93 · 2022-11-25T02:02:01Z

python/paddle/nn/functional/conv.py

+                            bias, 3
+                        )  # ACL_FORMAT_NC1HWC0 = 3
+                        bias_storage._share_underline_tensor_to(bias)
+                return _C_ops.add(pre_bias, bias)


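Same as above: the bias is converted to the same format for the computation. The original bias has bias.shape = [6]. It first has to be reshaped to the NCHW form [1,1,6,1], then npu_identity rearranges the data inside it into the ACL_FORMAT_NC1HWC0 form [1, 1, 1, 1, 16], and the result is fed to the NPU Add op.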

Hardware-related code must NOT exist in the API. If it exists temporarily, it needs to be marked with a TODO and cleaned up later.
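For reference, the shape arithmetic behind the [1, 1, 1, 1, 16] result in this thread can be sketched as follows. Two assumptions: the block size C0 is 16, and the six bias values sit on the channel axis of the NCHW shape, i.e. [1, 6, 1, 1]. `nc1hwc0_shape` is a hypothetical helper, not a Paddle or ACL function.

```python
import math


def nc1hwc0_shape(nchw_shape, c0=16):
    """Shape of the NC1HWC0 tensor that stores an NCHW tensor."""
    n, c, h, w = nchw_shape
    # C is split into C1 blocks of C0 channels; the last block is padded.
    return (n, math.ceil(c / c0), h, w, c0)


print(nc1hwc0_shape((1, 6, 1, 1)))  # (1, 1, 1, 1, 16)
```

Six channels fit into a single block of 16, so C1 is 1 and the remaining ten slots are padding.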

qili93 · 2022-11-25T02:02:37Z

python/paddle/nn/layer/norm.py

+                bias_trans._share_underline_tensor_to(self.bias)
+                mean_trans._share_underline_tensor_to(self._mean)
+                var_trans._share_underline_tensor_to(self._variance)
+


All input parameters of the NPU BN op need to be converted to ACL_FORMAT_NC1HWC0 format in advance.

Hardware-related code must NOT exist in the API. If it exists temporarily, it needs to be marked with a TODO and cleaned up later.

jeff41404

LGTM

qili93 marked this pull request as draft November 16, 2022 08:08

qili93 mentioned this pull request Nov 16, 2022

[NPU] add _npu_identity op and api, test=develop #47850

Merged

qili93 force-pushed the apply_npu_identity branch from 2ed79f7 to c9fd2b2 Compare November 23, 2022 01:53

qili93 marked this pull request as ready for review November 23, 2022 01:53

qili93 force-pushed the apply_npu_identity branch 2 times, most recently from 2329fb9 to ad49fa3 Compare November 24, 2022 10:04

qili93 added 2 commits November 24, 2022 10:05

[NPU] apply npu_identity to conv bn and copy2cpu, test=develop

ad49fa3

update npu identity to share data with x, test=develop

9f0704b

qili93 commented Nov 25, 2022

address review comments, test=develop

b9a9c2c

jeff41404 approved these changes Nov 25, 2022

qili93 merged commit 32143f4 into PaddlePaddle:develop Nov 28, 2022

qili93 deleted the apply_npu_identity branch November 28, 2022 05:26
