
Feat: AddBiasResidualLayerNorm #9906

Merged · 16 commits · Mar 22, 2023
Conversation

zobinHuang (Contributor):

add_bias_residual_laynorm drawio (attached diagram)

@zobinHuang zobinHuang requested a review from daquexian as a code owner March 10, 2023 05:45
@liujuncheng (Collaborator) left a comment:

Macros are used a bit too much here; in some of these places a function or a lambda would be a better choice.

if (nb_skip >= 1) { new_op = new_op.Input("skip"); }
new_op = new_op.Output("y").Output("mean").Output("inv_variance");

std::shared_ptr<OpExpr> op_pointer = CHECK_JUST(new_op.Build());
Collaborator:

Suggested change
std::shared_ptr<OpExpr> op_pointer = CHECK_JUST(new_op.Build());
std::shared_ptr<OpExpr> op_expr = CHECK_JUST(new_op.Build());

Unless the type needs special emphasis or is required to avoid ambiguity, we generally don't make the type part of the name, because that information is redundant.

if (has_gamma) { new_op = new_op.Input("gamma"); }
if (has_beta) { new_op = new_op.Input("beta"); }
if (has_bias) { new_op = new_op.Input("bias"); }
if (nb_skip >= 1) { new_op = new_op.Input("skip"); }
Collaborator:

for (nb_skip = 0; nb_skip <= 1; nb_skip++)

Change this to iterate over {false, true}; otherwise this spot needs extra special-case handling to be correct.

for (bool has_beta : bool_list) {
/* has_bias */
for (bool has_bias : bool_list) {
one::OpBuilder new_op = one::OpBuilder("skip_layer_norm").Input("x");
Collaborator:

Suggested change
one::OpBuilder new_op = one::OpBuilder("skip_layer_norm").Input("x");
one::OpBuilder op_builder = one::OpBuilder("skip_layer_norm").Input("x");

new_op should be named op_builder or builder.

const double& epsilon, const double& alpha) const {
// check shape of x
const auto& x_shape = *(x->shape());
CHECK_GT_OR_RETURN(x_shape.NumAxes(), 1)
Collaborator:

Consider changing this check to GE(2).

CHECK_GT_OR_RETURN(x_shape.NumAxes(), 1)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();

#define GAMMA_BETA_BIAS_SHAPE_CHECK(tensor) \
Collaborator:

A lambda feels more appropriate here.

// set output shape of mean and varience
DimVector mean_dim_vec;
mean_dim_vec.push_back(x_shape.Count(0, x_shape.NumAxes() - 1));
Shape mean_shape(mean_dim_vec); // borrow from input shape
Collaborator:

Suggested change
Shape mean_shape(mean_dim_vec); // borrow from input shape
Shape mean_shape(mean_dim_vec);

If you include a comment, make sure it is accurate.

<< "data type of \'gamma\' is not consitant with \'x\'";
}

// check data type of pre_bias
Collaborator:

Suggested change
// check data type of pre_bias
// check data type of bias

<< "data type of \'beta\' is not consitant with \'x\'";
}

// check data types of pre_residual_1 and pre_residual_2
Collaborator:

Suggested change
// check data types of pre_residual_1 and pre_residual_2
// check data types of skip

template<typename SRC, typename DST>
struct SkipLoad {
using LoadType = DST;
SkipLoad(const SRC* src, const SRC* bias, const SRC* skip, double alpha, int64_t row_size)
Collaborator:

Use float rather than double for alpha here, since we don't want any double-precision computation to appear.

// obtain epsilon and check its value
const double epsilon = ctx->Attr<double>("epsilon");
const double alpha = ctx->Attr<double>("alpha");
CHECK_GE(epsilon, CUDNN_BN_MIN_EPSILON);
Collaborator:

This is no longer necessary.

// check shape of x
const auto& x_shape = *(x->shape());
CHECK_GE_OR_RETURN(x_shape.NumAxes(), 2)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();
Collaborator:

Update the error message here to match the change (the check is now GE 2, but the message still says "greater than 1").

<< "number of axes of \'gamma\' should have be equal to 1, yet get "
<< gamma_shape.NumAxes();
CHECK_EQ_OR_RETURN(gamma_shape.At(0), x_shape.At(x_shape.NumAxes() - 1))
<< "dimension 1 of \'gamma\'(" << gamma_shape.At(0)
Collaborator:

"dimension 1" here is ambiguous — it's unclear whether counting starts from 0 or 1. Consider rewording.

}

bool has_gamma = false, has_beta = false, has_bias = false;
if (gamma) { has_gamma = true; }
Collaborator:

Could has_skip and has_gamma be written in a consistent style?

CHECK_GT_OR_RETURN(x_shape.NumAxes(), 1)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();

#define GAMMA_BETA_BIAS_SHAPE_CHECK(tensor) \
Collaborator:

A macro is still being used here.

CHECK_GT(x_shape.NumAxes(), 1)
<< "number of axes of \'x\' should have be greater than 1, yet get " << x_shape.NumAxes();

#define GET_GAMMA_BETA_BIAS_AND_SHAPE_CHECK(tensor) \
Collaborator:

Could these places avoid using macros?

@github-actions (Contributor):

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 140.9ms (= 14090.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 142.4ms (= 14235.9ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.01 (= 142.4ms / 140.9ms)

OneFlow resnet50 time: 80.5ms (= 8051.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 82.8ms (= 8280.6ms / 100, input_shape=[8, 3, 224, 224])
❌ Relative speed: 1.03 (= 82.8ms / 80.5ms)

OneFlow resnet50 time: 48.5ms (= 9705.5ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.6ms (= 11511.1ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.19 (= 57.6ms / 48.5ms)

OneFlow resnet50 time: 32.3ms (= 6468.6ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 47.0ms (= 9399.4ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.45 (= 47.0ms / 32.3ms)

OneFlow resnet50 time: 26.2ms (= 5236.2ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.1ms (= 8413.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.61 (= 42.1ms / 26.2ms)

OneFlow swin dataloader time: 0.238s (= 47.552s / 200, num_workers=1)
PyTorch swin dataloader time: 0.149s (= 29.723s / 200, num_workers=1)
Relative speed: 0.625 (= 0.149s / 0.238s)

OneFlow swin dataloader time: 0.072s (= 14.489s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.357s / 200, num_workers=4)
Relative speed: 0.577 (= 0.042s / 0.072s)

OneFlow swin dataloader time: 0.043s (= 8.553s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.407s / 200, num_workers=8)
Relative speed: 0.515 (= 0.022s / 0.043s)

❌ OneFlow resnet50 time: 152.3ms (= 15231.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.8ms (= 16279.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 162.8ms / 152.3ms)

OneFlow resnet50 time: 91.3ms (= 9128.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.9ms (= 10189.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 101.9ms / 91.3ms)

OneFlow resnet50 time: 59.1ms (= 11825.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15638.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 78.2ms / 59.1ms)

OneFlow resnet50 time: 41.4ms (= 8275.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.5ms (= 15095.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.82 (= 75.5ms / 41.4ms)

OneFlow resnet50 time: 38.6ms (= 7713.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 63.8ms (= 12756.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.65 (= 63.8ms / 38.6ms)

@github-actions (Contributor):

CI failed when running job: cuda-module. PR label automerge has been removed

@github-actions (Contributor):

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.2ms (= 14120.1ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.0ms (= 14496.4ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.03 (= 145.0ms / 141.2ms)

OneFlow resnet50 time: 81.3ms (= 8130.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 86.4ms (= 8644.7ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 86.4ms / 81.3ms)

OneFlow resnet50 time: 50.6ms (= 10114.7ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 64.3ms (= 12856.6ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.27 (= 64.3ms / 50.6ms)

OneFlow resnet50 time: 34.1ms (= 6823.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 43.5ms (= 8694.9ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.27 (= 43.5ms / 34.1ms)

OneFlow resnet50 time: 26.1ms (= 5217.7ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 37.6ms (= 7513.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.44 (= 37.6ms / 26.1ms)

OneFlow swin dataloader time: 0.234s (= 46.883s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.158s / 200, num_workers=1)
Relative speed: 0.643 (= 0.151s / 0.234s)

OneFlow swin dataloader time: 0.070s (= 13.970s / 200, num_workers=4)
PyTorch swin dataloader time: 0.043s (= 8.663s / 200, num_workers=4)
Relative speed: 0.620 (= 0.043s / 0.070s)

OneFlow swin dataloader time: 0.042s (= 8.488s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.355s / 200, num_workers=8)
Relative speed: 0.513 (= 0.022s / 0.042s)

❌ OneFlow resnet50 time: 152.7ms (= 15269.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.9ms (= 16385.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 163.9ms / 152.7ms)

OneFlow resnet50 time: 92.6ms (= 9264.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.5ms (= 10345.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 103.5ms / 92.6ms)

OneFlow resnet50 time: 60.1ms (= 12024.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.3ms (= 15858.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 79.3ms / 60.1ms)

OneFlow resnet50 time: 42.7ms (= 8530.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.0ms (= 14409.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.69 (= 72.0ms / 42.7ms)

OneFlow resnet50 time: 37.9ms (= 7581.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.9ms (= 13788.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.82 (= 68.9ms / 37.9ms)

@github-actions (Contributor):

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9906/

@liujuncheng liujuncheng merged commit 0bd58ba into master Mar 22, 2023
@liujuncheng liujuncheng deleted the add_bias_residual_layernorm branch March 22, 2023 03:10