
Adapting device-specific Extra Attributes for the PHI kernel #46342

Merged: 43 commits merged into PaddlePaddle:develop on Nov 1, 2022

Conversation

@chenwhql (Contributor) commented Sep 21, 2022

PR types

New features

PR changes

Others

Describe

Adapting device-specific Extra Attributes for the PHI kernel

After cleaning up most of the Extra parameters in OpMaker, we also need to remove the parameters belonging to the Extra range from the PHI Kernel parameter list. This is a necessary step for standardizing the Kernel definition, and it also lowers the learning cost for other hardware vendors integrating with the Paddle Kernel. For example:

The function signature of the current conv2d kernel is as follows:

template <typename T, typename Context>
void ConvKernel(const Context& dev_ctx,
                const DenseTensor& input,
                const DenseTensor& filter,
                const std::vector<int>& strides,
                const std::vector<int>& paddings,
                const std::string& padding_algorithm,
                int groups,
                const std::vector<int>& dilations,
                const std::string& data_format,
                bool use_addto,
                int workspace_size_MB,
                bool exhaustive_search,
                DenseTensor* out);

And the Python API signature of conv2d is as follows:

def conv2d(x,
           weight,
           bias=None,
           stride=1,
           padding=0,
           dilation=1,
           groups=1,
           data_format="NCHW",
           name=None):

We can see that the following three parameters in the Kernel signature have nothing to do with the Python API:

                bool use_addto,
                int workspace_size_MB,
                bool exhaustive_search,

In fact, these three parameters are dedicated to the cuDNN kernel of conv2d. However, since a Kernel can only have one signature, the CPU, GPU, and XPU kernels in Paddle, as well as the NPU and MLU kernels in CustomDevice, all have to carry these three parameters in their conv2d kernels. For these devices the parameters are meaningless, which increases the cost of understanding heterogeneous Kernel development, so we need to remove such parameters. If MKLDNN is also counted, conv2d has even more such parameters, as follows:

- op : conv2d
  backward : conv2d_grad
  extra :
    attrs : [bool is_test = false, bool use_cudnn = true, bool fuse_relu_before_depthwise_conv = false, bool use_mkldnn = false,
             bool use_quantizer = false, str mkldnn_data_type = "float32", bool fuse_relu = false,
             str fuse_activation = "", float fuse_alpha = 0.0f, float fuse_beta = 0.0f, bool use_addto = false,
             bool fuse_residual_connection = false, float Scale_in = 1.0f, float Scale_out = 1.0f,
             float Scale_in_eltwise = 1.0f, 'float[] Scale_weights = {1.0f}', bool force_fp32_output = false,
             int workspace_size_MB = platform::GetDefaultConvWorkspaceSizeLimitMB(), bool exhaustive_search = false]

We cannot add such parameters to the public Kernel signature at the cost of the development experience of many hardware kernels. Since these parameters are dedicated to a specific hardware or acceleration library, they should be passed in a dedicated way.

However, this removal must keep the current MKLDNN and CUDNN Kernels working. The MKLDNN and CUDNN Kernels still need these removed parameters at present, so although they are removed from the parameter list, they must be passed into the kernel in some other way. This passing process has the following constraints:

  1. They cannot be passed in through global variables (this goes against the design principles of the PHI kernel functions)
  2. They cannot be passed in by any method that changes the declaration of the kernel function

Given the above restrictions, we can only pass these parameters in through the Kernel's existing parameters. At present, passing them in through the Context dedicated to the Kernel seems most appropriate:

  • MKLDNN-specific parameters are passed in through OneDNNContext
  • CUDNN-specific parameters are passed in through GPUContext (there is no GPUDNNContext for the time being)

This PR adopts this scheme to adapt the existing MKLDNN and CUDNN dedicated parameters.
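
As a rough sketch of what this looks like on the kernel side (illustrative only: the accessor names HasDnnAttr/GetDnnAttr and the PADDLE_GET_CONST usage are assumptions for this sketch and may differ from the actual code), the cuDNN conv kernel keeps the normalized signature and reads its dedicated attributes from the context, falling back to the old defaults when they are not set:

// Hypothetical sketch; accessor names on the context are assumed.
template <typename T, typename Context>
void ConvCudnnKernel(const Context& dev_ctx,
                     const DenseTensor& input,
                     const DenseTensor& filter,
                     const std::vector<int>& strides,
                     const std::vector<int>& paddings,
                     const std::string& padding_algorithm,
                     int groups,
                     const std::vector<int>& dilations,
                     const std::string& data_format,
                     DenseTensor* out) {
  // The extra attributes are no longer parameters; they are queried from
  // the device context instead, with the old defaults as fallback values.
  bool exhaustive_search =
      dev_ctx.HasDnnAttr("exhaustive_search")
          ? PADDLE_GET_CONST(bool, dev_ctx.GetDnnAttr("exhaustive_search"))
          : false;
  bool use_addto =
      dev_ctx.HasDnnAttr("use_addto")
          ? PADDLE_GET_CONST(bool, dev_ctx.GetDnnAttr("use_addto"))
          : false;
  // ... cuDNN convolution implementation using these values ...
}

OneDNN kernels would read their fuse_xxx attributes in the same way, through OneDNNContext.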

Specifically:

  1. Add a thread_local dnn_attrs_ member to OneDNNContext and GPUContext to store these special parameters (a rough standalone sketch follows this list)
  2. Normalize the parameter list of the current conv2d kernel, obtain the cudnn-specific parameters through dev_ctx in the CUDNN Kernel, and verify it through unittests
  3. Following the normalized conv2d kernel signature, migrate mkldnn's conv2d kernel to phi, also obtaining the mkldnn-specific parameters through dev_ctx, and verify it through unittests
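
Item 1 can be pictured with the following standalone sketch (the class and member names here are illustrative, not the actual PHI ones): the attributes live in a thread_local map inside the context, so a globally shared context can be used from multiple threads without the extra attributes of one op leaking into another:

// Standalone illustration of the thread_local dnn_attrs_ idea; the real
// PHI contexts differ in detail.
#include <map>
#include <string>
#include <variant>

using Attribute = std::variant<bool, int, float, std::string>;

class DnnContextSketch {
 public:
  void SetDnnAttr(const std::string& name, Attribute attr) {
    dnn_attrs_[name] = std::move(attr);
  }
  bool HasDnnAttr(const std::string& name) const {
    return dnn_attrs_.count(name) > 0;
  }
  const Attribute& GetDnnAttr(const std::string& name) const {
    return dnn_attrs_.at(name);
  }

 private:
  // static thread_local: each thread sees its own attribute map, so a
  // context shared as a global singleton stays thread safe.
  static thread_local std::map<std::string, Attribute> dnn_attrs_;
};

thread_local std::map<std::string, Attribute> DnnContextSketch::dnn_attrs_;

On the framework side, the extra attributes parsed from the op would be pushed into the context with the setter before the kernel launch, and the kernel reads them back as in the earlier sketch.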

Other changes:

  1. Add TypeInfo to all DeviceContexts so that the concrete Context type can be judged by an enum value (a standalone sketch of the idea follows this list)
  2. Polish the macros in enforce.h and move them to a suitable location
  3. Fix a typo in a conv argument name
  4. The NPU and MLU conv2d kernel signatures also need to be updated: Normalize conv2d kernel signature PaddleCustomDevice#151
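
The TypeInfo change in item 1 can be illustrated with a small standalone sketch (the names below are made up and do not match the PHI classes): each concrete context carries a cheap type tag, so code holding a base context reference can identify the concrete type by an enum value instead of RTTI:

// Standalone sketch of enum-based context type tagging; illustrative only.
enum class ContextKind { kCPU, kGPU, kOneDNN };

class DeviceContextSketch {
 public:
  explicit DeviceContextSketch(ContextKind kind) : kind_(kind) {}
  ContextKind kind() const { return kind_; }

 private:
  ContextKind kind_;
};

class OneDNNContextSketch : public DeviceContextSketch {
 public:
  OneDNNContextSketch() : DeviceContextSketch(ContextKind::kOneDNN) {}
};

// Branch on the concrete context type without dynamic_cast.
inline bool IsOneDNNContext(const DeviceContextSketch& ctx) {
  return ctx.kind() == ContextKind::kOneDNN;
}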

TODO:

  1. Do not remove the original ConvMKLDNN(Grad)Kernel in this PR for the time being; remove it after both depthwise_conv2d and conv3d are migrated

@paddle-bot (bot) commented Sep 21, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

chenwhql and others added 27 commits September 21, 2022 11:40
@zyfncg (Contributor) previously approved these changes Oct 26, 2022 and left a comment:

LGTM

{"fuse_alpha", ExtraAttrProperty::ONEDNN},
{"fuse_beta", ExtraAttrProperty::ONEDNN},
{"fuse_brelu", ExtraAttrProperty::ONEDNN},
{"fuse_brelu_threshold", ExtraAttrProperty::ONEDNN},
A Member commented:

fuse_brelu or fuse_brelu_threshold are no longer used by any oneDNN kernel. Is there any compatibility issue preventing us from deleting these attrs?

@chenwhql (Contributor, Author) replied:

Thanks, I deleted these two attributes. If there are other deprecated attributes, they can also be removed.

The Member replied:

We will update this list along with the second phase of kernel migration.

{"use_cudnn", ExtraAttrProperty::SCHEDULE},
{"use_mkldnn", ExtraAttrProperty::SCHEDULE},
// ONEDNN dedicated attributes
{"Bias", ExtraAttrProperty::ONEDNN},
A Member commented:

Can new extra oneDNN-specific attributes be added in the future?

@chenwhql (Contributor, Author) replied:

During kernel migration, if there are missing attributes, you can add them temporarily. However, in the long run, it is recommended to add a fusion op and kernel, and to replace the standard op with the fusion op in the pass processing stage to support the additional functionality, so that the implementation of the basic Op kernel does not grow increasingly complex.

@Silv3S (Member) commented Oct 26, 2022

We've run the first performance and functional tests, and this PR passed everything with expected results. We will run one more test on another platform to make sure that everything is OK. Tests will be finished by the end of this week.
Meanwhile, we tried to migrate the softmax kernel with changes from this branch and everything is working as expected 👍 #47339

Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
@chenwhql (Contributor, Author) commented:
@Silv3S Hello, are the tests finished?

@Silv3S (Member) commented Oct 31, 2022

LGTM
There are small differences in accuracy (some positive, some negative). We will run more tests after this PR is merged, as they are more stable.

@qili93 (Contributor) left a comment:

LGTM

@XiaoguangHu01 (Contributor) left a comment:

LGTM

@chenwhql chenwhql merged commit c923e6c into PaddlePaddle:develop Nov 1, 2022
Comment on lines +330 to +338
  const std::vector<std::string>& GetInputsName(
      const std::string& input) const {
    auto it = inputs_name_.find(input);
    PADDLE_ENFORCE_NE(it,
                      inputs_name_.end(),
                      phi::errors::NotFound(
                          "OneDnnContext does not have the input %s.", input));
    return it->second;
  }
@sfraczek (Contributor) commented Nov 2, 2022:

I think the name of this function is similar to SetInputsName, but the getter doesn't return what the setter is setting, which might be confusing at first sight. It could have a more distinct or precise name.
The same goes for SetOutputsName and GetOutputsName.

Comment on lines +359 to +367
// Holds some attributes only used by the onednn kernel calculation
// Since original mkldnn op kernel directly adds the operations that require
// fusion to the native kernel operations, and uses the attribute `fuse_xxx`
// to control, for onednn, there will be some attributes that seem to be
// independent of the device are also saved here.
// Here, the operation of fusion needs to be implemented separately as
// a fusion op and kernel, instead of patching it to a basic operation.
// Because DeviceContext is a global singleton, you need to ensure thread
// safety, use the thread_local variable
A Contributor commented:

Suggested change (add a period after "calculation" and replace "mkldnn" with "onednn" in the second line):

// Holds some attributes only used by the onednn kernel calculation.
// Since original onednn op kernel directly adds the operations that require
// fusion to the native kernel operations, and uses the attribute `fuse_xxx`
// to control, for onednn, there will be some attributes that seem to be
// independent of the device are also saved here.
// Here, the operation of fusion needs to be implemented separately as
// a fusion op and kernel, instead of patching it to a basic operation.
// Because DeviceContext is a global singleton, you need to ensure thread
// safety, use the thread_local variable
