[XPU] Add fast_layernorm + leaky_relu fusion #57113
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
Force-pushed from 9181ab8 to 33d3a16
@@ -291,6 +291,7 @@ if(WITH_XPU)
                 ${XPU_PASS_DEPS})
  pass_library(gather_squeeze_pass inference DIR xpu DEPS ${XPU_PASS_DEPS})
  pass_library(fast_where_xpu_fuse_pass inference DIR xpu DEPS ${XPU_PASS_DEPS})
  pass_library(fln_act_xpu_fuse_pass inference DIR xpu DEPS ${XPU_PASS_DEPS})
Review comment: rename to layer_norm_act_xpu_fuse_pass.
@@ -0,0 +1,212 @@
// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
Review comment: rename the file to layer_norm_act_fuse_pass.cc.
After the pass is applied:
        x
        |
  fln_act_xpu
Review comment: replace fln with layer_norm.
graph:
        x
        |
  layernorm_xpu
Review comment: use layer_norm.
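Putting the two diagrams together with the requested names, the transformation the pass performs is roughly the following (a sketch; layer_norm_act_xpu is the fused-op name proposed in this review, and the scale/bias inputs are assumptions drawn from the pattern code below):

Before fusion:                After fusion:
    x   scale   bias              x   scale   bias
     \    |    /                   \    |    /
      layer_norm               layer_norm_act_xpu
          |                            |
      leaky_relu                      out
          |
         out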
                   const std::string& name_scope,
                   const std::string& act_type);
  // declare operator node's name
  PATTERN_DECL_NODE(ln);
Review comment: change ln to layer_norm; the same applies below.
       output
*/

struct FLNActXPUPattern : public PatternBase {
Review comment: change FLN to LayerNorm.
paddle/phi/api/yaml/fused_ops.yaml (Outdated)
@@ -101,6 +101,15 @@
    data_type : x
    optional : bias, x_max

- op : fln_act_xpu
Review comment: rename to layer_norm_act_xpu.
                   ->assert_is_op_input("layer_norm", "Scale");
  auto ln_out = pattern->NewNode(ln_out_repr())
                    ->AsOutput()
                    ->assert_is_op_output("layer_norm", "Y");
Review comment: add ->assert_is_op_input(act_type_, "X"); here. In addition, ln_out needs to be checked to have exactly one output.
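A hedged sketch of the revised node definition along those lines (assert_has_n_outputs is assumed to be available on the pattern node, as in other fuse passes):

auto ln_out = pattern->NewNode(ln_out_repr())
                  ->AsOutput()
                  ->assert_is_op_output("layer_norm", "Y")
                  ->assert_is_op_input(act_type_, "X")  // Y must be the act op's input
                  ->assert_has_n_outputs(1);            // and feed nothing else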
Force-pushed from 33d3a16 to a73a054
paddle/phi/infermeta/fusion.cc (Outdated)
@@ -1014,6 +1014,20 @@ void AddCMulXPUInferMeta(const MetaTensor& x,
  out->set_layout(x.layout());
}

void FLNActXPUInferMeta(const MetaTensor& x,
Review comment: rename to LayerNormActInferMeta.
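A minimal sketch of what the renamed InferMeta could contain, mirroring AddCMulXPUInferMeta in the hunk above (the body is an assumption based on the visible out->set_layout line; the signature follows the args list in fused_ops.yaml, and the name uses the LayerNormActXPUInferMeta form requested later in this review):

void LayerNormActXPUInferMeta(const MetaTensor& x,
                              const MetaTensor& scale,
                              const MetaTensor& bias,
                              int begin_norm_axis,
                              float epsilon,
                              int act_type,
                              float act_param,
                              MetaTensor* out) {
  // The fused op preserves the input's shape, dtype, and layout.
  out->set_dims(x.dims());
  out->set_dtype(x.dtype());
  out->set_layout(x.layout());
}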
paddle/phi/infermeta/fusion.h (Outdated)
@@ -226,6 +226,15 @@ void AddCMulXPUInferMeta(const MetaTensor& x,
                         const MetaTensor& w,
                         MetaTensor* out);

void FLNActXPUInferMeta(const MetaTensor& x,
Review comment: same as above.
namespace fusion {

template <typename T, typename Context>
void FLNActXPUKernel(const Context& ctx,
Review comment: same as above.
    act.hard_sigmoid_slope = act_param;
  }
#ifdef PADDLE_WITH_XPU_PLUGIN
  int r = xpu::plugin::fln_act_fusion(ctx.x_context(),
Review comment: can fln be uniformly renamed to layer_norm_act_fusion?
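For the leaky_relu case this PR targets, the parameter mapping presumably parallels the hard_sigmoid branch in the quoted hunk; a hedged sketch (the LEAKY_RELU enum value and the leaky_alpha field are assumptions about the XDNN Activation_t API):

xpu::Activation_t act(static_cast<xpu::Activation_t::act_enum>(act_type));
if (act_type == static_cast<int>(xpu::Activation_t::LEAKY_RELU)) {
  act.leaky_alpha = act_param;          // negative-slope coefficient
} else if (act_type == static_cast<int>(xpu::Activation_t::HARD_SIGMOID)) {
  act.hard_sigmoid_slope = act_param;   // as in the quoted hunk
}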
Force-pushed from a73a054 to f9582fd
paddle/phi/api/yaml/fused_ops.yaml (Outdated)
  args : (Tensor x, Tensor scale, Tensor bias, int begin_norm_axis, float epsilon, int act_type, float act_param)
  output : Tensor(out)
  infer_meta :
    func : LayerNormActInferMeta
Review comment: change LayerNormActInferMeta to LayerNormActXPUInferMeta; keep function names consistent with the operator name. The same applies below.
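Combining the naming suggestions, the fused-op entry might read as follows (a sketch extending the quoted hunk; the kernel section is an assumption modeled on neighboring fused_ops.yaml entries):

- op : layer_norm_act_xpu
  args : (Tensor x, Tensor scale, Tensor bias, int begin_norm_axis, float epsilon, int act_type, float act_param)
  output : Tensor(out)
  infer_meta :
    func : LayerNormActXPUInferMeta
  kernel :
    func : layer_norm_act_xpu
    data_type : x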
namespace patterns {

/*
change layernorm op to fast_layernorm op
Review comment: revise this description, and capitalize the first letter.
@@ -70,6 +70,8 @@ XPUOpMap& get_kl2_ops() {
      XPUKernelSet({phi::DataType::FLOAT32, phi::DataType::FLOAT16})},
  {"batch_norm",
      XPUKernelSet({phi::DataType::FLOAT32, phi::DataType::FLOAT16})},
  {"layer_norm_act_xpu",
Review comment: keep these in alphabetical order? Or group the fused operators together?
Force-pushed from 2305a00 to cd13575
Force-pushed from cd13575 to d611343
LGTM
PR types
Performance optimization

PR changes
OPs

Description
Add fast_layernorm + activation (leaky_relu) fusion.
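For reference, a minimal self-contained sketch of the fused semantics, out = leaky_relu(layer_norm(x)), on a row-major [rows, cols] view where cols is the product of the dims from begin_norm_axis onward (a plain C++ illustration of the math, not the XPU kernel; parameter names mirror the YAML args):

#include <cmath>
#include <cstdio>
#include <vector>

// Reference computation: per-row layer norm followed by leaky_relu.
void layer_norm_leaky_relu_ref(const std::vector<float>& x,
                               const std::vector<float>& scale,
                               const std::vector<float>& bias,
                               int rows, int cols,
                               float epsilon, float act_param,
                               std::vector<float>* out) {
  out->resize(x.size());
  for (int r = 0; r < rows; ++r) {
    const float* row = x.data() + r * cols;
    float mean = 0.f, var = 0.f;
    for (int c = 0; c < cols; ++c) mean += row[c];
    mean /= cols;
    for (int c = 0; c < cols; ++c) var += (row[c] - mean) * (row[c] - mean);
    var /= cols;
    const float inv_std = 1.f / std::sqrt(var + epsilon);
    for (int c = 0; c < cols; ++c) {
      const float y = (row[c] - mean) * inv_std * scale[c] + bias[c];
      (*out)[r * cols + c] = y > 0.f ? y : act_param * y;  // leaky_relu
    }
  }
}

int main() {
  const std::vector<float> x = {1.f, 2.f, 3.f, -4.f, -5.f, -6.f};
  const std::vector<float> scale = {1.f, 1.f, 1.f};
  const std::vector<float> bias = {0.f, 0.f, 0.f};
  std::vector<float> out;
  layer_norm_leaky_relu_ref(x, scale, bias, /*rows=*/2, /*cols=*/3,
                            /*epsilon=*/1e-5f, /*act_param=*/0.02f, &out);
  for (const float v : out) std::printf("%f\n", v);
  return 0;
}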