
Add clip op #3937

Merged
wanghaoshuang merged 16 commits into PaddlePaddle:develop from clip_op on Sep 21, 2017
Conversation

wanghaoshuang
Contributor

fix #3910

template <typename T>
__global__ void ClipGradientKernel(const int N, const T min, const T max,
                                   const T* Y, const T* dY, T* dX) {
  CUDA_1D_KERNEL_LOOP(i, N) { dX[i] = dY[i] * (Y[i] > min && Y[i] < max); }
Collaborator

This implementation depends on true being converted to 1 in C++ and CUDA, which is not good.
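For reference, a minimal sketch of the same kernel with an explicit conditional, so the gradient no longer relies on bool converting to 1; it matches the if/else form the PR adopts further down, and the kernel name and CUDA_1D_KERNEL_LOOP macro are taken from the snippet above:

template <typename T>
__global__ void ClipGradientKernel(const int N, const T min, const T max,
                                   const T* Y, const T* dY, T* dX) {
  CUDA_1D_KERNEL_LOOP(i, N) {
    // Pass the upstream gradient through only where the forward value stayed
    // strictly inside (min, max); otherwise the gradient is zero.
    if (Y[i] > min && Y[i] < max) {
      dX[i] = dY[i];
    } else {
      dX[i] = static_cast<T>(0);
    }
  }
}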

Collaborator

@reyoung reyoung left a comment

Well done. It seems that Relu is an alias of the clip operator with min=0.0, max=+inf. To alias an operator, please refer to identity_op.

@qingqing01 qingqing01 requested a review from Superjomn September 9, 2017 02:22
AddInput("X", "The input of clip op");
AddOutput("Out", "The output of clip op");
AddComment(R"DOC(
Clip Operator.
Contributor

Contributor Author

Fixed.

Clip Operator.
)DOC");
AddAttr<float>("min", "min value to be clipped.");
AddAttr<float>("max", "max value to be clipped.");
Contributor

Please put AddAttr before the DOC comment. min and max also need a type template; see: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/scale_op.cc#L36

Contributor Author

Do multiple attrs of one op have to have the same type?
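For illustration, a minimal sketch of what the reviewer is asking for, modeled on the linked scale_op.cc: the maker is templated on an AttrType so min and max share one configurable type, and AddAttr comes before the DOC comment. The constructor signature follows scale_op.cc of that era and the strings come from this PR; treat the exact code as illustrative:

template <typename AttrType>
class ClipOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  ClipOpMaker(framework::OpProto *proto, framework::OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddInput("X", "The input of clip op");
    AddOutput("Out", "The output of clip op");
    // AddAttr placed before AddComment, as requested; both attrs use the
    // templated AttrType instead of a hard-coded float.
    AddAttr<AttrType>("min", "min value to be clipped.");
    AddAttr<AttrType>("max", "max value to be clipped.");
    AddComment(R"DOC(
Clip Operator.
)DOC");
  }
};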

void InferShape(const framework::InferShapeContext &ctx) const override {
  PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X"), "Input(X) should not be null");
  PADDLE_ENFORCE_NOT_NULL(ctx.InputVar(framework::GradVarName("Out")),
                          "Input(Out@GRAD) should not be null");
Contributor

The check is also needed in ClipOp.

Contributor Author

Fixed.
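A minimal sketch of the requested check in ClipOp::InferShape, mirroring the enforcement shown above for the gradient op; the surrounding lines are assembled from this PR's snippets and the exact placement is assumed:

void InferShape(const framework::InferShapeContext &ctx) const override {
  // Also verify the forward input before using it, as requested.
  PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X"), "Input(X) should not be null");
  auto x_dims = ctx.Input<framework::LoDTensor>("X")->dims();
  auto max = Attr<float>("max");
  auto min = Attr<float>("min");
  PADDLE_ENFORCE_LT(min, max, "max should be greater than min.");
  ctx.Output<framework::LoDTensor>("Out")->Resize(x_dims);
}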

auto max = Attr<float>("max");
auto min = Attr<float>("min");
PADDLE_ENFORCE_LT(min, max, "max should be greater than min.");
ctx.Output<Tensor>("Out")->Resize(x_dims);
Contributor

Output<framework::LoDTensor>

Contributor Author

Fixed.

PADDLE_ENFORCE_NOT_NULL(ctx.InputVar(framework::GradVarName("Out")),
                        "Input(Out@GRAD) should not be null");
auto x_dims = ctx.Input<Tensor>("X")->dims();
auto *x_grad = ctx.Output<Tensor>(framework::GradVarName("X"));
Contributor

Output<framework::LoDTensor>

Contributor Author

Fixed.

public:
void Compute(const framework::ExecutionContext& context) const override {
auto max = context.op().Attr<float>("max");
auto min = context.op().Attr<float>("min");
Contributor

context.op().Attr<> -> context.Attr<>

Contributor Author

Fixed.

size_t count = 1;
for (int i = 0; i < dims.size(); ++i) {
  count *= dims[i];
}
Contributor

int64_t count = dims->numel()

Contributor Author

Fixed.

int block = 512;
int grid = (N * D + block - 1) / block;

ClipGradientKernel<T><<<grid, block>>>(count, min, max, x_data, d_out_data,
Contributor

Need to use a CUDA stream when launching the kernel. Please refer to im2col.cu in paddle/operators/math.

Contributor Author

Fixed.
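A minimal sketch of a stream-aware launch; the cast to platform::CUDADeviceContext and its stream() accessor follow the im2col.cu pattern referenced above, the variable names come from the snippets in this PR, and the exact API shape should be treated as an assumption:

int block = 512;
int grid = (count + block - 1) / block;
// Launch on the device context's stream instead of the default stream.
ClipGradientKernel<T><<<
    grid, block, 0,
    reinterpret_cast<const platform::CUDADeviceContext&>(
        context.device_context())
        .stream()>>>(count, min, max, x_data, d_out_data, d_x_data);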

size_t count = 1;
for (int i = 0; i < dims.size(); ++i) {
  count *= dims[i];
}
Contributor

same as above.

Contributor Author

Fixed.

import numpy as np
from paddle.v2.framework.op import Operator
from gradient_checker import GradientChecker
from op_test_util import OpTestMeta
Contributor

Use the new unit testing framework.

Contributor Author

Fixed.

auto x_dims = ctx.Input<LoDTensor>("X")->dims();
auto *x_grad = ctx.Output<LoDTensor>(framework::GradVarName("X"));

x_grad->Resize(x_dims);
Collaborator

Check whether x_grad is nullptr before it is resized.

Contributor Author

Fixed.
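A minimal sketch of the guard, mirroring the nullptr check that appears in the kernel code further down in this PR:

auto *x_grad = ctx.Output<framework::LoDTensor>(framework::GradVarName("X"));
// X@GRAD may not be requested by the backward pass, so resize only if present.
if (x_grad != nullptr) {
  x_grad->Resize(x_dims);
}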

if (Y[i] > min && Y[i] < max) {
  dX[i] = dY[i];
} else {
  dX[i] = 0;
Collaborator

Should this clip to 0, or to min/max?

Contributor Author

It's the gradient computation. Please refer to ClipGradientOp of caffe2.

Collaborator

Got it!

auto* x = context.Input<LoDTensor>("X");
auto dims = d_x->dims();
int64_t count = d_out->numel();
auto d_x_data = d_x->mutable_data<T>(context.GetPlace());
Collaborator

Why not just d_x->mutable_data<T>(context.GetPlace()) and keep using d_x?

Contributor Author

d_x_data is needed by ClipGradientKernel

Collaborator

I mean that d_x_data and d_x are exactly the same; they point to the same memory.

int D = d_x->dims()[1];
int block = 512;
int grid = (N * D + block - 1) / block;
ClipGradientKernel<T><<<
Collaborator

This implementation is too complex. @reyoung provides a global function transform to apply a functor to every element in an iterable object.
See https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/transform.h
It works on both CPU and GPU.

Here is a demo of how to use it: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/prelu_op.h#L57

Contributor Author

Fixed.

auto x_tensor = EigenTensor<T, D>::From(*x);
auto out_tensor = EigenTensor<T, D>::From(*out);
auto place = context.GetEigenDevice<Place>();
out_tensor.device(place) = x_tensor.cwiseMin(max).cwiseMax(min);
Collaborator

transform() can also be used here. With it, there is no need to use Eigen and we can get rid of the template parameter size_t D.

Contributor Author

Fixed.

if (y > min_ && y < max_)
  return x;
else
  return 0;
Collaborator

return y > min_ && y < max_ ? x : 0;
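Putting the pieces together, a minimal sketch of the functor-based gradient path: the ternary form suggested above, wrapped in a functor that transform() can apply element-wise. The HOSTDEVICE macro and the commented Transform call are assumed from paddle/platform/hostdevice.h and paddle/platform/transform.h of that era and are illustrative rather than exact:

template <typename T>
class ClipGradFunctor {
 public:
  explicit ClipGradFunctor(const T min, const T max) : min_(min), max_(max) {}
  // x is the upstream gradient element, y the corresponding forward input;
  // the gradient passes through only where y lies strictly inside (min, max).
  HOSTDEVICE T operator()(const T& x, const T& y) const {
    return y > min_ && y < max_ ? x : static_cast<T>(0);
  }

 private:
  T min_;
  T max_;
};

// Hypothetical use inside the gradient kernel's Compute():
//   Transform(context.device_context(), d_out_data, d_out_data + numel,
//             x_data, d_x_data, ClipGradFunctor<T>(min, max));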

JiayiFeng previously approved these changes Sep 21, 2017

Collaborator

@JiayiFeng JiayiFeng left a comment

LGTM

auto* out = context.Output<Tensor>("Out");
T* out_data = out->mutable_data<T>(context.GetPlace());
const T* x_data = x->data<T>();
int numel = x->numel();
Collaborator

@JiayiFeng JiayiFeng Sep 21, 2017

int64_t

if (d_x != nullptr) {
  auto* x = context.Input<Tensor>("X");
  int64_t numel = d_out->numel();
  auto d_x_data = d_x->mutable_data<T>(context.GetPlace());
Collaborator

d_x_data is a pointer, so please use auto*

@wanghaoshuang wanghaoshuang merged commit e53dc8a into PaddlePaddle:develop Sep 21, 2017
@wanghaoshuang wanghaoshuang deleted the clip_op branch May 20, 2022 03:56

Successfully merging this pull request may close these issues.

Clip Operator.
4 participants