Add soft-label support for cross-entropy operator. #4081
Conversation
Need to update and fix conflicts.
paddle/operators/cross_entropy_op.cc (Outdated)

-  auto *X = ctx.Input<Tensor>("X");
-  auto *label = ctx.Input<Tensor>("label");
+  auto *x = ctx.Input<Tensor>("X");
+  auto *label = ctx.Input<Tensor>("Label");
Please add a not-null check for Input(X) and Input(Label). Thanks!
Done.
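A sketch of the requested checks, assuming the PADDLE_ENFORCE_NOT_NULL macro from paddle/platform/enforce.h and InferShapeContext::InputVar (the exact messages are illustrative):

// Reject null inputs before any shape logic runs.
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X"), "Input(X) must not be null.");
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("Label"), "Input(Label) must not be null.");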
paddle/operators/cross_entropy_op.cc (Outdated)

@@ -17,59 +17,71 @@ limitations under the License. */
 namespace paddle {
 namespace operators {

-class OnehotCrossEntropyOp : public framework::OperatorWithKernel {
+class CrossEntropyOp : public framework::OperatorWithKernel {
@luotao1 changed CrossEntropyOp to OnehotCrossEntropyOp. Please use the new name.
Since it supports not only one-hot cross-entropy but also soft-label cross-entropy, it would be better to use CrossEntropyOp instead of OnehotCrossEntropyOp.
paddle/operators/cross_entropy_op.cc (Outdated)

  // normal cross entropy
  PADDLE_ENFORCE_EQ(x->dims()[0], label->dims()[0]);
}
ctx.Output<Tensor>("Y")->Resize({x->dims()[0]});
You must now use Output<framework::LoDTensor> for outputs in both the forward and backward InferShape.
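For instance, the Resize line in the diff above would become (a minimal sketch based on those lines):

// Fetch the output as a LoDTensor rather than a plain Tensor.
ctx.Output<framework::LoDTensor>("Y")->Resize({x->dims()[0]});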
Done.
@qingqing01 Per the new naming convention, should the output Y be renamed to Out? (Previously, when 益群 used Y as the output in FC, he was also asked to change it to Out. Should we keep this consistent?)
After discussing with @Xreki, we both prefer "Loss" as the output name rather than "Out". I think "Loss" is more meaningful than "Out".
paddle/operators/cross_entropy_op.cc (Outdated)

 public:
  using framework::OperatorWithKernel::OperatorWithKernel;

 protected:
  void InferShape(const framework::InferShapeContext &ctx) const override {
-   auto dX = ctx.Output<Tensor>(framework::GradVarName("X"));
-   auto X = ctx.Input<Tensor>("X");
+   auto dx = ctx.Output<Tensor>(framework::GradVarName("X"));
Use Output<framework::LoDTensor> here as well.
Done.
paddle/operators/cross_entropy_op.cc (Outdated)

-  auto dX = ctx.Output<Tensor>(framework::GradVarName("X"));
-  auto X = ctx.Input<Tensor>("X");
+  auto dx = ctx.Output<Tensor>(framework::GradVarName("X"));
+  auto x = ctx.Input<Tensor>("X");
Please also add a not-null check for Input(X). Thanks!
Done.
paddle/operators/cross_entropy_op.cc (Outdated)

    Y[i] = -log(X[i][j])
The second input (Label tensor) supports two kinds of shapes:
1) Rank(Label) = 1, Label[i] indicates the class index for sample i:
    Y[i] = -log(X[i, Label[i]])
Please add a space before and after the formula.
Done.
paddle/operators/cross_entropy_op.cc (Outdated)

2) Rank(Label) = 2, Label[i, j] indicates the soft label of class j
   for sample i:
    Y[i] = \sum_j{-Label[i, j] * log(X[i, j])}
Please add a space before and after the formula.
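For reference, restating the doc text above in LaTeX, the two labeling modes compute:

$$Y[i] = -\log\big(X[i,\ \mathrm{Label}[i]]\big) \quad \text{(rank-1, hard label)}$$

$$Y[i] = -\sum_j \mathrm{Label}[i, j]\,\log\big(X[i, j]\big) \quad \text{(rank-2, soft label)}$$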
paddle/operators/cross_entropy_op.cu (Outdated)

@@ -21,17 +21,16 @@ namespace operators {
 using Tensor = framework::Tensor;

 template <typename T>
-__host__ __device__ T clipping_log(const T x) {
+__host__ __device__ T tolerable_value(const T x) {
Include paddle/platform/hostdevice.h, then use HOSTDEVICE:

HOSTDEVICE T tolerable_value(const T x) {
Done.
I have a question here: if this function uses __host__ __device__ in its declaration, why do we need to implement it again in the *.cc file?
If we switch to HOSTDEVICE, then given its definition:

#ifdef __CUDACC__
#define HOSTDEVICE __host__ __device__
#define HOST __host__
#else
#define HOSTDEVICE
#define HOST
#endif

the CPU and GPU can indeed share this tolerable_value.
Ah, I see now. HOSTDEVICE expands to nothing on the CPU side.
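Putting this together, a minimal sketch of the shared function (the clamping threshold is an assumption, not necessarily the PR's actual value):

#include <math.h>
#include "paddle/platform/hostdevice.h"

// Compiled as __host__ __device__ under nvcc and as a plain host
// function otherwise, so CPU and GPU kernels share one definition.
template <typename T>
HOSTDEVICE T tolerable_value(const T x) {
  const T kApproInf = 1e20;               // assumed clamp value
  if (x == INFINITY) return kApproInf;    // clamp +inf
  if (x == -INFINITY) return -kApproInf;  // clamp -inf
  return x;
}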
        self.check_output()

    def test_check_grad(self):
        self.check_grad(['X'], 'Y', max_relative_error=0.05)
Could you tune max_relative_error smaller?
Done.
Once the label becomes soft, there is no longer a discretization step. Can we just call Eigen directly?
All done. Thanks.
@lcy-seso I think that is doable too, if we are willing to tolerate one if with two branches: one going through CUDA code and the other through Eigen.
caffe2 splits it into two branches. What may matter more is which of the two is computationally more efficient; that is unknown for now.
// TODO(qingqing) define CUDA_1D_KERNEL_LOOP macro in a common file.
// CUDA_1D_KERNEL_LOOP(i, N) {
for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < N;
     i += blockDim.x * gridDim.x) {
A small question for @qingqing01:
- The for loop inside this kernel function does not produce wrong results, but it feels logically odd.
- The grid always has some surplus threads in its blocks that do not align exactly with the input data, and this for loop in effect skips the unaligned part. With i += blockDim.x * gridDim.x, once the loop variable i is incremented even a single time, it already exceeds the total thread count. So this kernel never actually needs to loop; each thread computes only one position of the output vector, which is logically equivalent to returning immediately when i >= batch_size. What is the consideration behind writing it as a loop?
Given the grid/thread setup below, the for loop is indeed unnecessary. But if the grid below were set to a fixed number, so that a fixed total number of threads is launched, the for loop becomes useful: one thread may then compute multiple outputs. Since this kernel already handles the boundary, it needs no change.

int block = 512;
int grid = (n + block - 1) / block;
Got it. Indeed, this cross entropy kernel is fairly simple and a bit special, and the grid count is already computed.
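For reference, a minimal sketch of the grid-stride pattern discussed above (the kernel and launch numbers are illustrative, not the PR's code):

// Each thread starts at its global index and strides by the total
// number of launched threads, so a fixed-size grid covers any N.
template <typename T>
__global__ void ScaleKernel(T* y, const T* x, const int N) {
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < N;
       i += blockDim.x * gridDim.x) {
    y[i] = 2 * x[i];
  }
}

// With a fixed grid, some threads handle multiple elements:
// int block = 512;
// int grid = 128;  // fixed, instead of (N + block - 1) / block
// ScaleKernel<<<grid, block>>>(y, x, N);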
paddle/operators/cross_entropy_op.cc (Outdated)

auto *label = ctx.Input<Tensor>("Label");

PADDLE_ENFORCE_EQ(x->dims().size(), 2, "X's rank must be 2.");
PADDLE_ASSERT(label->dims().size() == 1 || label->dims().size() == 2);
As discussed this morning, we should also use a rank-2 label for the int label. Please help to modify it. And if so, there is no way to determine by rank whether it is normal cross entropy or soft cross entropy. Can we switch to using an attr?
Done.
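A sketch of the attr-based switch being proposed (the attribute name soft_label and its wording are assumptions, not the PR's final choice):

// In the OpMaker: an explicit flag instead of inferring the mode from rank.
AddAttr<bool>("soft_label",
              "(bool, default false) If true, Label holds per-class "
              "probabilities; otherwise one class index per sample.")
    .SetDefault(false);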
paddle/operators/cross_entropy_op.cu (Outdated)

using Tensor = framework::Tensor;

template <typename T>
HOSTDEVICE T tolerable_value(const T x) {
As @lcy-seso said, both the CPU and GPU kernels can use this common function if we use HOSTDEVICE. Use it to replace the copy in the paddle/operators/cross_entropy_op.h file and delete the copy in the paddle/operators/cross_entropy_op.cc file.
Done.
paddle/operators/cross_entropy_op.cu (Outdated)

    sum += label[i * D + j] * log(X[i * D + j]);
  }
  Y[i] = -tolerable_value(sum);
}
Apply tolerable_value to each log term rather than to the accumulated sum:

for (int j = 0; j < D; j++) {
  sum += -label[i * D + j] * tolerable_value(log(X[i * D + j]));
}
Done.
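Folding that suggestion into the kernel, the soft-label forward might look like this (a sketch assembled from the diffs above, not necessarily the PR's final code; D is the class count):

template <typename T>
__global__ void SoftCrossEntropyKernel(T* Y, const T* X, const T* label,
                                       const int N, const int D) {
  // Grid-stride loop over the N samples.
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < N;
       i += blockDim.x * gridDim.x) {
    T sum = static_cast<T>(0);
    for (int j = 0; j < D; j++) {
      // Clamp each log term rather than the accumulated sum.
      sum += -label[i * D + j] * tolerable_value(log(X[i * D + j]));
    }
    Y[i] = sum;
  }
}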
paddle/operators/cross_entropy_op.cc (Outdated)

CrossEntropy Operator.

The second input (Label tensor) supports two kinds of shapes:
1) Rank(Label) = 1, Label[i] indicates the class index for sample i:
If the int label's rank is modified, these doc comments need to be updated as well.
Done.
for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < N;
     i += blockDim.x * gridDim.x) {
  T sum = static_cast<T>(0);
  for (int j = 0; j < D; j++) {
Please add a TODO to optimize this kernel.
Done.
paddle/operators/cross_entropy_op.h (Outdated)

T sum = static_cast<T>(0);
for (int j = 0; j < class_num; ++j) {
  sum += label_data[index] * std::log(x_data[index]);
  y_data[i] = -tolerable_value(sum);
Apply tolerable_value to the std::log term instead of the sum:

sum += -label_data[index] * tolerable_value(std::log(x_data[index]));
Done.
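With that suggestion applied, the CPU loop might read (a sketch; the flattened index computation is an assumption based on the surrounding code):

T sum = static_cast<T>(0);
for (int j = 0; j < class_num; ++j) {
  int index = i * class_num + j;  // assumed row-major flattening
  sum += -label_data[index] * tolerable_value(std::log(x_data[index]));
}
y_data[i] = sum;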
Merging this PR; if there is any remaining question, we will fix it later.
Resolve #4080
Resolve #3898