-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Softmax grad op #3164
Softmax grad op #3164
Conversation
paddle/operators/softmax_op.cc
Outdated
PADDLE_ENFORCE(ctx.InputVar(1) != nullptr, "Input(1) should not be null"); | ||
PADDLE_ENFORCE(ctx.Input<Tensor>(0)->dims() == ctx.Input<Tensor>(1)->dims(), | ||
"the shape of Input(0) and Input(1) should be the same"); | ||
ctx.Output<Tensor>(0)->ResizeLike(*ctx.Input<Tensor>(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think ResizeLike
method is needed. Because it does not simply our implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
paddle/operators/softmax_op.h
Outdated
for (int i = 0; i < batch_size; ++i) { | ||
for (int j = 0; j < class_num; ++j) { | ||
auto index = i * batch_size + j; | ||
scale_->data<T>()[i] += Y->data<T>()[index] * dY->data<T>()[index]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That code obviously cannot be run on GPU.
@@ -19,5 +19,30 @@ def setUp(self): | |||
self.Y = np.apply_along_axis(stable_softmax, 1, self.X) | |||
|
|||
|
|||
class TestSoftmaxGradOp(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Gradient of Operator does not need unit test like this.
… into softmax_grad_op
paddle/operators/softmax_op.h
Outdated
Y->data<T>()[index] * (dY->data<T>()[index] - scale_->data<T>()[i]); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only for CPU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am now using Eigen to rewrite this compute function and will support GPU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emailweixu support GPU and add python unit test done
paddle/operators/softmax_op.h
Outdated
auto dX_eigen = EigenMatrix<T>::From(*dX); | ||
auto place = context.GetEigenDevice<Place>(); | ||
|
||
dX_eigen.device(place) = dY_eigen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto dot = (Y_eigen * dY_eigen)
.sum(along_class)
.eval()
.reshape(batch_by_one)
.broadcast(one_by_class);
dX_eigen.device(place) = (dY_eigen - dot) * Y_eigen;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.device
will evaluate the expression so it's better to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except some tiny problems
paddle/operators/softmax_op.cc
Outdated
LOG(INFO) << "SoftmaxOpGrad"; | ||
return ""; | ||
void InferShape(const InferShapeContext &ctx) const override { | ||
PADDLE_ENFORCE(ctx.InputSize() == 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change ctx.InputSize() == 3Ul, to prevent warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paddle/operators/softmax_op.cc
Outdated
"Input of SoftmaxOpGrad should be 3, X, Y, YG"); | ||
PADDLE_ENFORCE(ctx.OutputSize() == 1, | ||
"Output of SoftmaxOpGrad should be 1"); | ||
PADDLE_ENFORCE(ctx.InputVar("Y") != nullptr, "Input(0) should not be null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Input("Y") should not be null"
fix #3148