
Feat: AddBiasResidualLayerNorm #9906

Merged · 16 commits · Mar 22, 2023
Conversation

zobinHuang (Contributor):

add_bias_residual_laynorm drawio (attached diagram)

@zobinHuang zobinHuang requested a review from daquexian as a code owner March 10, 2023 05:45
@liujuncheng (Collaborator) left a comment:

Macros are used a bit too much here; in some of these places a function or a lambda would be a better choice.

if (nb_skip >= 1) { new_op = new_op.Input("skip"); }
new_op = new_op.Output("y").Output("mean").Output("inv_variance");

std::shared_ptr<OpExpr> op_pointer = CHECK_JUST(new_op.Build());
Collaborator:

Suggested change
std::shared_ptr<OpExpr> op_pointer = CHECK_JUST(new_op.Build());
std::shared_ptr<OpExpr> op_expr = CHECK_JUST(new_op.Build());

Unless the type needs special emphasis or is required to avoid ambiguity, we generally don't make the type part of the name, because that information is redundant.

if (has_gamma) { new_op = new_op.Input("gamma"); }
if (has_beta) { new_op = new_op.Input("beta"); }
if (has_bias) { new_op = new_op.Input("bias"); }
if (nb_skip >= 1) { new_op = new_op.Input("skip"); }
Collaborator:

for (nb_skip = 0; nb_skip <= 1; nb_skip++)

Change this to iterate over {false, true}; otherwise this spot needs extra special-case handling to be correct.

for (bool has_beta : bool_list) {
/* has_bias */
for (bool has_bias : bool_list) {
one::OpBuilder new_op = one::OpBuilder("skip_layer_norm").Input("x");
Collaborator:

Suggested change
one::OpBuilder new_op = one::OpBuilder("skip_layer_norm").Input("x");
one::OpBuilder op_builder = one::OpBuilder("skip_layer_norm").Input("x");

new_op should be named op_builder or builder.

const double& epsilon, const double& alpha) const {
// check shape of x
const auto& x_shape = *(x->shape());
CHECK_GT_OR_RETURN(x_shape.NumAxes(), 1)
Collaborator:

Consider changing this check to GE(2).

CHECK_GT_OR_RETURN(x_shape.NumAxes(), 1)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();

#define GAMMA_BETA_BIAS_SHAPE_CHECK(tensor) \
Collaborator:

A lambda feels more appropriate here.

// set output shape of mean and varience
DimVector mean_dim_vec;
mean_dim_vec.push_back(x_shape.Count(0, x_shape.NumAxes() - 1));
Shape mean_shape(mean_dim_vec); // borrow from input shape
Collaborator:

Suggested change
Shape mean_shape(mean_dim_vec); // borrow from input shape
Shape mean_shape(mean_dim_vec);

If you include a comment, make sure it is accurate.

<< "data type of \'gamma\' is not consitant with \'x\'";
}

// check data type of pre_bias
Collaborator:

Suggested change
// check data type of pre_bias
// check data type of bias

<< "data type of \'beta\' is not consitant with \'x\'";
}

// check data types of pre_residual_1 and pre_residual_2
Collaborator:

Suggested change
// check data types of pre_residual_1 and pre_residual_2
// check data types of skip

template<typename SRC, typename DST>
struct SkipLoad {
using LoadType = DST;
SkipLoad(const SRC* src, const SRC* bias, const SRC* skip, double alpha, int64_t row_size)
Collaborator:

Use float rather than double for alpha here, since we don't want any double-precision computation to appear.

// obtain epsilon and check its value
const double epsilon = ctx->Attr<double>("epsilon");
const double alpha = ctx->Attr<double>("alpha");
CHECK_GE(epsilon, CUDNN_BN_MIN_EPSILON);
Collaborator:

This is no longer necessary.

// check shape of x
const auto& x_shape = *(x->shape());
CHECK_GE_OR_RETURN(x_shape.NumAxes(), 2)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();
Collaborator:

Update the error message here to match the change (the check is now GE 2, but the message still says "greater than 1").

<< "number of axes of \'gamma\' should have be equal to 1, yet get "
<< gamma_shape.NumAxes();
CHECK_EQ_OR_RETURN(gamma_shape.At(0), x_shape.At(x_shape.NumAxes() - 1))
<< "dimension 1 of \'gamma\'(" << gamma_shape.At(0)
Collaborator:

"dimension 1" here is ambiguous — it's unclear whether counting starts from 0 or 1. Consider rewording.

}

bool has_gamma = false, has_beta = false, has_bias = false;
if (gamma) { has_gamma = true; }
Collaborator:

Could has_skip and has_gamma be written in a consistent style?

CHECK_GT_OR_RETURN(x_shape.NumAxes(), 1)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();

#define GAMMA_BETA_BIAS_SHAPE_CHECK(tensor) \
Collaborator:

A macro is still being used here.

CHECK_GT(x_shape.NumAxes(), 1)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();

#define GET_GAMMA_BETA_BIAS_AND_SHAPE_CHECK(tensor) \
Collaborator:

Could these places avoid using macros?

@github-actions (Contributor):

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 140.9ms (= 14090.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 142.4ms (= 14235.9ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.01 (= 142.4ms / 140.9ms)

OneFlow resnet50 time: 80.5ms (= 8051.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 82.8ms (= 8280.6ms / 100, input_shape=[8, 3, 224, 224])
❌ Relative speed: 1.03 (= 82.8ms / 80.5ms)

OneFlow resnet50 time: 48.5ms (= 9705.5ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.6ms (= 11511.1ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.19 (= 57.6ms / 48.5ms)

OneFlow resnet50 time: 32.3ms (= 6468.6ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 47.0ms (= 9399.4ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.45 (= 47.0ms / 32.3ms)

OneFlow resnet50 time: 26.2ms (= 5236.2ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.1ms (= 8413.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.61 (= 42.1ms / 26.2ms)

OneFlow swin dataloader time: 0.238s (= 47.552s / 200, num_workers=1)
PyTorch swin dataloader time: 0.149s (= 29.723s / 200, num_workers=1)
Relative speed: 0.625 (= 0.149s / 0.238s)

OneFlow swin dataloader time: 0.072s (= 14.489s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.357s / 200, num_workers=4)
Relative speed: 0.577 (= 0.042s / 0.072s)

OneFlow swin dataloader time: 0.043s (= 8.553s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.407s / 200, num_workers=8)
Relative speed: 0.515 (= 0.022s / 0.043s)

❌ OneFlow resnet50 time: 152.3ms (= 15231.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.8ms (= 16279.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 162.8ms / 152.3ms)

OneFlow resnet50 time: 91.3ms (= 9128.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.9ms (= 10189.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 101.9ms / 91.3ms)

OneFlow resnet50 time: 59.1ms (= 11825.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15638.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 78.2ms / 59.1ms)

OneFlow resnet50 time: 41.4ms (= 8275.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.5ms (= 15095.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.82 (= 75.5ms / 41.4ms)

OneFlow resnet50 time: 38.6ms (= 7713.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 63.8ms (= 12756.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.65 (= 63.8ms / 38.6ms)

@github-actions (Contributor):

CI failed when running job: cuda-module. PR label automerge has been removed

@github-actions (Contributor):

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.2ms (= 14120.1ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.0ms (= 14496.4ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.03 (= 145.0ms / 141.2ms)

OneFlow resnet50 time: 81.3ms (= 8130.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 86.4ms (= 8644.7ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 86.4ms / 81.3ms)

OneFlow resnet50 time: 50.6ms (= 10114.7ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 64.3ms (= 12856.6ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.27 (= 64.3ms / 50.6ms)

OneFlow resnet50 time: 34.1ms (= 6823.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 43.5ms (= 8694.9ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.27 (= 43.5ms / 34.1ms)

OneFlow resnet50 time: 26.1ms (= 5217.7ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 37.6ms (= 7513.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.44 (= 37.6ms / 26.1ms)

OneFlow swin dataloader time: 0.234s (= 46.883s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.158s / 200, num_workers=1)
Relative speed: 0.643 (= 0.151s / 0.234s)

OneFlow swin dataloader time: 0.070s (= 13.970s / 200, num_workers=4)
PyTorch swin dataloader time: 0.043s (= 8.663s / 200, num_workers=4)
Relative speed: 0.620 (= 0.043s / 0.070s)

OneFlow swin dataloader time: 0.042s (= 8.488s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.355s / 200, num_workers=8)
Relative speed: 0.513 (= 0.022s / 0.042s)

❌ OneFlow resnet50 time: 152.7ms (= 15269.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.9ms (= 16385.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 163.9ms / 152.7ms)

OneFlow resnet50 time: 92.6ms (= 9264.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.5ms (= 10345.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 103.5ms / 92.6ms)

OneFlow resnet50 time: 60.1ms (= 12024.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.3ms (= 15858.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 79.3ms / 60.1ms)

OneFlow resnet50 time: 42.7ms (= 8530.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.0ms (= 14409.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.69 (= 72.0ms / 42.7ms)

OneFlow resnet50 time: 37.9ms (= 7581.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.9ms (= 13788.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.82 (= 68.9ms / 37.9ms)

@github-actions (Contributor):

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9906/

@liujuncheng liujuncheng merged commit 0bd58ba into master Mar 22, 2023
@liujuncheng liujuncheng deleted the add_bias_residual_layernorm branch March 22, 2023 03:10