Add maxout operator. #5571
Conversation
paddle/operators/math/maxouting.cc
Outdated
maxout_process.compute(ele,
    input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f]);
}
maxout_process.finalize(ele, (static_cast<T>(groups)));
I think the MaxOutProcess maxout_process is not necessary. The implementation can be expanded here.
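For reference, a minimal sketch of what the expanded (functor-free) forward computation might look like, reusing the loop variables from the excerpt above (new_bindex, new_cindex, fea_size, f) and the plain max comparison that later revisions in this thread adopt:

  // Take the max over the group dimension directly, without a functor.
  T ele = static_cast<T>(-FLT_MAX);
  for (int ph = 0; ph < groups; ++ph) {
    T x = input_data[(new_bindex + new_cindex) * groups + ph * fea_size + f];
    ele = (x > ele) ? x : ele;
  }
  output_data[new_bindex + new_cindex + f] = ele;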
I still used the first two, but removed finalize, which was indeed unused.
paddle/operators/math/maxouting.cc
Outdated
class MaxOutFunctor<platform::CPUPlace, MaxOutProcess, T> {
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
Please use framework::Tensor* for the output parameter.
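Judging from the revised excerpt later in this thread, the signature would then become something like:

  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor* output,
                  int groups, MaxOutProcess maxout_process);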
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
                  int groups, int num_channels, MaxOutProcess maxout_process) {
It seems num_channels can be obtained from the input tensor, so there is no need to pass this argument.
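For example, given the NCHW layout used throughout this op, a one-liner along these lines would do:

  const int num_channels = input.dims()[1];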
done
paddle/operators/math/maxouting.cc
Outdated
const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = num_channels/groups;
output_channels can be obtained from the output tensor, and groups can then be calculated from the channel numbers of the input and output:

  const int output_channels = output.dims()[1];
  const int groups = input.dims()[1] / output_channels;
groups is passed in at the very beginning, and output is computed from it, so I just kept carrying groups along.
paddle/operators/math/maxouting.cc
Outdated
const int output_channels = num_channels/groups;

int fea_size = input_height * input_width;
int c_size = fea_size * output_channels;
c_size -> out_size ?
done
paddle/operators/maxout_op.h
Outdated
#pragma once

#include "paddle/framework/eigen.h"
remove this unused header file.
done
paddle/operators/maxout_op.h
Outdated
#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "paddle/operators/math/math_function.h"
It seems this header file is also not used.
Is #include "paddle/framework/op_registry.h" also unused? Isn't it needed for registration?
class TestMaxOut_Op(OpTest):
TestMaxOut_Op -> TestMaxOutOp
done
def maxout_forward_naive_2sweetsky(input, groups, num_channels):
If this function is not used, please remove it.
done
def test_check_grad(self):
    print self.inputs
    print self.outputs
    self.check_grad(['X'], 'Out', max_relative_error=0.5)
The gradient check for the MaxOut op is likely unstable: after a +/- delta perturbation, the max value may change, unless the input is specially constructed so that the max stays the same under the +/- delta perturbation.
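One illustrative way to construct such an input (the helper name and the min_gap value are hypothetical, not from the PR): draw every element from a grid whose spacing exceeds the finite-difference delta, so perturbing any single element cannot change which element is the group maximum.

  import numpy as np

  def make_stable_maxout_input(shape, min_gap=0.02):
      # All values are distinct multiples of min_gap, so a perturbation
      # smaller than min_gap / 2 can never flip the argmax of a group.
      n = int(np.prod(shape))
      vals = np.random.permutation(n).astype("float32") * min_gap
      return vals.reshape(shape)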
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
                  int groups, int num_channels, MaxOutProcess maxout_process) {
Please reorder the parameters. See Function_Parameter_Ordering.
For the ordering, I removed num_channels but left the other positions unchanged for now. Generally inputs come first and outputs after, but groups is a global attribute.
I think it is better to put all input-only parameters before any output parameters.
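As a sketch of the suggested ordering (a hypothetical signature, not what the PR finally adopted):

  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, int groups,
                  MaxOutProcess maxout_process, framework::Tensor* output);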
I adjusted the more inner ones, but I'm leaving this one for now; otherwise many places would have to change for little benefit. The ordering here is not strictly by input/output; input and output refer more specifically to the tensors.
paddle/operators/math/maxouting.cc
Outdated
framework::Tensor& input_grad,
const framework::Tensor& output,
const framework::Tensor& output_grad,
int groups, int num_channels) {
Same as above. Function_Parameter_Ordering
For the ordering, I removed num_channels but left the other positions unchanged for now. Generally inputs come first and outputs after, but groups is a global attribute.
paddle/operators/math/maxouting.cc
Outdated
input_grad_data[input_idx] += output_grad_data[output_idx];
stop = true;
} else {
  input_grad_data[input_idx] = 0;
Please remove the else {...} branch; you have already set the value on line 94.
done
paddle/operators/math/maxouting.cc
Outdated
template class MaxOutFunctor<platform::CPUPlace,
                             paddle::operators::math::MaxOut<float>, float>;
template class MaxOutFunctor<platform::CPUPlace,
                             paddle::operators::math::MaxOut<double>, double>;
paddle::operators:: can be removed, because MaxOutFunctor is in the paddle::operators namespace.
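Since these instantiations live inside the paddle::operators::math namespace (the file opens with namespace math {, as quoted later in this thread), they could presumably shorten to:

  template class MaxOutFunctor<platform::CPUPlace, MaxOut<float>, float>;
  template class MaxOutFunctor<platform::CPUPlace, MaxOut<double>, double>;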
done
paddle/operators/math/maxouting.cu
Outdated
const int input_height, const int input_width,
int groups, MaxOutProcess maxout_process) {
int size = input_height * input_width * channels / groups;
int featLen = input_height * input_width;
size and featLen should be const int. Please also keep variable names consistent with the naming convention: featLen -> feat_len.
done
paddle/operators/math/maxouting.cu
Outdated
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
     index += blockDim.x * gridDim.x) {
  int batch_idx = index / size;
  int i = index % size;
Please do not use i here, especially inside the loop. You can use a name like temp.
done
paddle/operators/math/maxouting.cu
Outdated
int data_idx =
    (batch_idx * size + channel_idx * featLen) * groups + feat_idx;
T ele = maxout_process.initial();
for (int g = 0; g < groups; g++) {
Using ++g is better.
done
Other parts of the code have similar usages; please correct them one by one.
done
paddle/operators/math/maxouting.cu
Outdated
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
     index += blockDim.x * gridDim.x) {
  int batch_idx = index / size;
  int i = index % size;
Same as above.
done
paddle/operators/math/maxouting.h
Outdated
namespace math {

#define FLT_MAX \
  __FLT_MAX__  // It might need to be placed in another file, but I'm still
#define FLT_MAX is also defined in pooling.h, and they are in the same namespace (paddle::operators::math).
I tried to remove it, but got a compile error, so I added it back.
Fixed now.
Please read this document first.
done thanks
paddle/operators/math/maxouting.cc
Outdated
 * All tensors are in NCHW format.
 * Ksize, strides, paddings are two elements. These two elements represent
 * height and width, respectively.
 */
done
paddle/operators/math/maxouting.cc
Outdated
class MaxOutFunctor<platform::CPUPlace, MaxOutProcess, T> {
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
                  int groups, int num_channels, MaxOutProcess maxout_process) {
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
                  int groups, int num_channels, MaxOutProcess maxout_process) {
For the ordering, I removed num_channels but left the other positions unchanged for now. Generally inputs come first and outputs after, but groups is a global attribute.
paddle/operators/math/maxouting.cc
Outdated
const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = num_channels/groups;
groups is passed in at the very beginning, and output is computed from it, so I just kept carrying groups along.
paddle/operators/maxout_op.h
Outdated
#pragma once

#include "paddle/framework/eigen.h"
done
paddle/operators/maxout_op.h
Outdated
#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "paddle/operators/math/math_function.h"
Is #include "paddle/framework/op_registry.h" also unused? Isn't it needed for registration?
def maxout_forward_naive_2sweetsky(input, groups, num_channels):
done
class TestMaxOut_Op(OpTest):
done
def test_check_grad(self):
    print self.inputs
    print self.outputs
    self.check_grad(['X'], 'Out', max_relative_error=0.5)
done
paddle/operators/math/maxouting.cu
Outdated
int data_idx =
    (batch_idx * size + channel_idx * featLen) * groups + feat_idx;
T ele = maxout_process.initial();
for (int g = 0; g < groups; g++) {
Other parts of the code have similar usages; please correct them one by one.
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
                  int groups, int num_channels, MaxOutProcess maxout_process) {
I think it is better to put all input-only parameters before any output parameters.
paddle/operators/math/maxouting.cu
Outdated
__global__ void KernelMaxOut(const int nthreads, const T* input_data,
                             T* output_data, const int channels,
                             const int input_height, const int input_width,
                             int groups, MaxOutProcess maxout_process) {
Please note the order of parameters.
done
def test_check_grad(self):
    print self.inputs
    print self.outputs
Please remove debug code.
done
paddle/operators/maxout_op.h
Outdated
in_x_grad->mutable_data<T>(context.GetPlace());
auto temp = framework::EigenVector<T>::Flatten(*in_x_grad);
temp.device(context.GetEigenDevice<Place>()) =
    temp.constant(static_cast<T>(0));
Please use this way to clear the in_x_grad memory.
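Judging from the revised code quoted later in this thread (zero(device_ctx, in_x_grad, static_cast<T>(0.0));), the suggested way is presumably the math::SetConstant functor instead of the Eigen expression, along these lines:

  math::SetConstant<Place, T> zero;
  in_x_grad->mutable_data<T>(context.GetPlace());
  zero(context.device_context(), in_x_grad, static_cast<T>(0.0));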
done
done thanks
def test_check_grad(self):
    print self.inputs
    print self.outputs
done
paddle/operators/math/maxouting.cu
Outdated
int data_idx =
    (batch_idx * size + channel_idx * featLen) * groups + feat_idx;
T ele = maxout_process.initial();
for (int g = 0; g < groups; g++) {
done
paddle/operators/math/maxouting.cu
Outdated
__global__ void KernelMaxOut(const int nthreads, const T* input_data,
                             T* output_data, const int channels,
                             const int input_height, const int input_width,
                             int groups, MaxOutProcess maxout_process) {
done
paddle/operators/maxout_op.h
Outdated
in_x_grad->mutable_data<T>(context.GetPlace());
auto temp = framework::EigenVector<T>::Flatten(*in_x_grad);
temp.device(context.GetEigenDevice<Place>()) =
    temp.constant(static_cast<T>(0));
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input, framework::Tensor& output,
                  int groups, int num_channels, MaxOutProcess maxout_process) {
I adjusted the more inner ones, but I'm leaving this one for now; otherwise many places would have to change for little benefit. The ordering here is not strictly by input/output; input and output refer more specifically to the tensors.
paddle/operators/math/maxouting.cc
Outdated
const framework::Tensor& input,
framework::Tensor * output,
int groups,
MaxOutProcess maxout_process) {
I still think there is no need to use MaxOutProcess here. The MaxOut it defines is used only once, right here, so why can't it be expanded inline? Implement it directly at line 52; adding MaxOutProcess actually makes things more complicated. I suggest removing MaxOutProcess.
done
paddle/operators/math/maxouting.cc
Outdated
const int output_channels = output->dims()[1];

int fea_size = input_height * input_width;
// c_size mean output one batch size
c_size means the output size of each sample.
done
paddle/operators/math/maxouting.cc
Outdated
maxout_process.compute(ele,
    input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f]);
}
output_data[(new_bindex+new_cindex+f)] = ele;
I suggest removing maxout_process and just comparing values directly here.
done
paddle/operators/maxout_op.cc
Outdated
:param layer_attr: Extra Layer attribute.
:type layer_attr: ExtraLayerAttribute
:return: LayerOutput object.
:rtype: LayerOutput
Remove line 61 - line 81; no other op currently has a similar usage example.
done
paddle/operators/math/maxouting.h
Outdated
limitations under the License. */

#pragma once
#include "paddle/framework/eigen.h"
remove the unused header file.
done
paddle/operators/math/maxouting.h
Outdated
#include "paddle/framework/eigen.h" | ||
#include "paddle/framework/tensor.h" | ||
#include "paddle/platform/device_context.h" | ||
#include "paddle/platform/hostdevice.h" |
If lines 26 - 48 are removed, please also remove this header.
done
paddle/operators/math/maxouting.cu
Outdated
int feat_idx = batch_offset % feat_len;
int data_idx =
    (batch_idx * size + channel_idx * feat_len) * groups + feat_idx;
int maxIndex = -1;
maxIndex -> max_index
done
paddle/operators/maxout_op.h
Outdated
int groups = context.template Attr<int>("groups");
Too many blank lines.
done
self.op_type = "maxout"
self.init_test_case()
input = np.random.random(self.shape).astype("float32")
output = self.MaxOut_forward_naive(input, self.groups,
self.MaxOut_forward_naive -> maxOut_forward_naive should be enough; then line 32 can be removed. Defining self.MaxOut_forward_naive does not seem very necessary.
I looked at some similar cases; they actually factor the computation out separately. That is why I extracted it like this.
paddle/operators/math/maxouting.cc
Outdated
bool stop = false;
int output_idx = blen + clen + f;
for (int g = 0; g < groups && !stop; ++g) {
  input_idx = (blen + clen) * groups + fea_size * g + f;
To reduce redundant computation:
- Change line 90 to int input_idx0 = (blen + clen) * groups + f; and line 94 to int input_idx = input_idx0 + fea_size * g;
- Change line 91 to a boolean flag initialized to true (note that continue itself is a C++ keyword, hence a name like continue_match), and update lines 93 and 98 accordingly, so the check on line 93 no longer needs an extra negation.
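Putting both suggestions together, the loop might read roughly as follows (a sketch assembled from the revised excerpts that appear later in this thread):

  int input_idx0 = (blen + clen) * groups + f;
  bool continue_match = true;
  int output_idx = blen + clen + f;
  for (int g = 0; g < groups && continue_match; ++g) {
    int input_idx = input_idx0 + fea_size * g;
    if (input_data[input_idx] == output_data[output_idx]) {
      input_grad_data[input_idx] += output_grad_data[output_idx];
      continue_match = false;
    }
  }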
done
paddle/operators/math/maxouting.cu
Outdated
MaxOutProcess maxout_process) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
The loop on line 30 should be restructured to avoid repeated computation:

  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int offset = blockDim.x * gridDim.x;
  for (int i = index; i < nthreads; i += offset)
    ...
done
paddle/operators/math/maxouting.cu
Outdated
const int input_height, const int input_width, int groups) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
Same as above.
done
paddle/operators/math/maxouting.cu
Outdated
(batch_idx * size + channel_idx * feat_len) * groups + feat_idx;
int maxIndex = -1;
bool stop = false;
for (int g = 0; g < groups && !stop; ++g) {
Change line 61 to a flag initialized to true, same as above (again, continue itself is a reserved keyword, so a name like continue_match).
done
paddle/operators/math/maxouting.h
Outdated
(excerpt: several blank lines)
Remove the extra blank lines.
done
paddle/operators/maxout_op.cc
Outdated
)DOC"); | ||
AddComment(R"DOC( | ||
- Input: NCHW. | ||
- Output: feature map size same as input. Channel is (input channel) / groups. |
This comment needs fixing:
- "feature map size same as input" has no predicate; it should be "The feature map size of output is the same as the input".
- "The output_channel is (input channel) / groups."
done
paddle/operators/maxout_op.cc
Outdated
- Input: NCHW.
- Output: feature map size same as input. Channel is (input channel) / groups.
So groups should be larger than 1, and the num of channels should be able
to devided by groups.
to be divided
done
paddle/operators/maxout_op.cc
Outdated
// check groups > 1
PADDLE_ENFORCE_GT(
    groups, 1,
    "in maxoutop groups should be larger than 1");
groups should be larger than 1 in maxout op
done
paddle/operators/maxout_op.h
Outdated
maxout_forward;
paddle::operators::math::MaxOut<T> maxout_process;
maxout_forward(context.device_context(), *in_x, out, groups,
               maxout_process);
Please install pre-commit; the formatting of this code at lines 36-41 is wrong.
(excerpt: several blank lines)
Remove the extra blank lines.
done
done thanks
paddle/operators/math/maxouting.cc
Outdated
const int output_channels = output->dims()[1];

int fea_size = input_height * input_width;
// c_size mean output one batch size
done
paddle/operators/math/maxouting.cc
Outdated
bool stop = false;
int output_idx = blen + clen + f;
for (int g = 0; g < groups && !stop; ++g) {
  input_idx = (blen + clen) * groups + fea_size * g + f;
done
paddle/operators/math/maxouting.cu
Outdated
MaxOutProcess maxout_process) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
done
paddle/operators/math/maxouting.cu
Outdated
const int input_height, const int input_width, int groups) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
done
paddle/operators/math/maxouting.cu
Outdated
(batch_idx * size + channel_idx * feat_len) * groups + feat_idx;
int maxIndex = -1;
bool stop = false;
for (int g = 0; g < groups && !stop; ++g) {
done
paddle/operators/maxout_op.h
Outdated
int groups = context.template Attr<int>("groups");
done
paddle/operators/maxout_op.cc
Outdated
So groups should be larger than 1, and the num of channels should be able
to devided by groups.

.. math::
done
paddle/operators/math/maxouting.h
Outdated
T scale) {
  dx += dy * (x == y);
}
};
done
paddle/operators/math/maxouting.cc
Outdated
maxout_process.compute(ele,
    input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f]);
}
output_data[(new_bindex+new_cindex+f)] = ele;
done
paddle/operators/math/maxouting.cc
Outdated
const framework::Tensor& input,
framework::Tensor * output,
int groups,
MaxOutProcess maxout_process) {
done
paddle/operators/math/maxouting.cc
Outdated
// T ele = maxout_process.initial();
T ele = static_cast<T>(-FLT_MAX);
for (int ph = 0; ph < groups; ++ph) {
  T x = input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f];
It seems you have not installed pre-commit; the code style is not right. Spaces are needed before and after + and *. Install pre-commit by running the following command in your PaddlePaddle root path:
pip install pre-commit
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input,
                  framework::Tensor& input_grad,
framework::Tensor& input_grad -> framework::Tensor* input_grad. It may be better to put the output as the last argument.
done
paddle/operators/math/maxouting.cc
Outdated
for (int c = 0; c < output_channels; ++c) {
  int new_cindex = fea_size * c;
  for (int f = 0; f < fea_size; ++f) {
    // T ele = maxout_process.initial();
remove this line.
done
paddle/operators/math/maxouting.cc
Outdated
/*
 * All tensors are in NCHW format.
 * groups mustbe > 1
 */
// All tensors are in NCHW format, and the groups must be greater than 1.
done
paddle/operators/maxout_op.cc
Outdated
"width of feature."); | ||
AddAttr<int>( | ||
"groups", | ||
R"DOC(The group number of input layer. |
Specifies how many groups the input tensor will be split into along the channel dimension. The number of output channels is the number of input channels divided by groups.
done
paddle/operators/maxout_op.h
Outdated
paddle::operators::math::MaxOutGradFunctor<Place, T>
    maxout_backward;
maxout_backward(context.device_context(), *in_x, *in_x_grad, *out,
                *out_grad, groups);
indent line 59.
done
from op_test import OpTest


def maxout_forward_naive(input, groups,num_channels):
The num_channels argument is not used; please remove it.
done
self.num_channels).astype("float32")

self.inputs = {'X': input}
self.attrs = {'groups': self.groups, 'num_channels': self.num_channels}
The attr num_channels has been removed in the C++ code; please remove it here as well.
done
self.MaxOut_forward_naive = maxout_forward_naive
self.shape = [100, 6, 2, 2]
self.groups=2
self.num_channels=6
remove this variable.
done
paddle/operators/maxout_op.cu
Outdated
See the License for the specific language governing permissions and
limitations under the License. */

#define EIGEN_USE_GPU
Remove this line. Also, since PR #5573, please change paddle/operators/maxout_op.cu to paddle/operators/maxout_op.cu.cc.
done
done
paddle/operators/math/maxouting.cc
Outdated
/*
 * All tensors are in NCHW format.
 * groups mustbe > 1
 */
done
paddle/operators/math/maxouting.cc
Outdated
for (int c = 0; c < output_channels; ++c) {
  int new_cindex = fea_size * c;
  for (int f = 0; f < fea_size; ++f) {
    // T ele = maxout_process.initial();
done
paddle/operators/math/maxouting.cc
Outdated
// T ele = maxout_process.initial();
T ele = static_cast<T>(-FLT_MAX);
for (int ph = 0; ph < groups; ++ph) {
  T x = input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f];
done
paddle/operators/math/maxouting.cc
Outdated
 public:
  void operator()(const platform::DeviceContext& context,
                  const framework::Tensor& input,
                  framework::Tensor& input_grad,
done
paddle/operators/maxout_op.cc
Outdated
"width of feature."); | ||
AddAttr<int>( | ||
"groups", | ||
R"DOC(The group number of input layer. |
done
paddle/operators/maxout_op.h
Outdated
in_x_grad->mutable_data<T>(context.GetPlace());
zero(device_ctx, in_x_grad, static_cast<T>(0.0));
paddle::operators::math::MaxOutGradFunctor<Place, T>
    maxout_backward;
done
paddle/operators/maxout_op.h
Outdated
paddle::operators::math::MaxOutGradFunctor<Place, T>
    maxout_backward;
maxout_backward(context.device_context(), *in_x, *in_x_grad, *out,
                *out_grad, groups);
done
from op_test import OpTest


def maxout_forward_naive(input, groups,num_channels):
done
self.num_channels).astype("float32")

self.inputs = {'X': input}
self.attrs = {'groups': self.groups, 'num_channels': self.num_channels}
done
self.MaxOut_forward_naive = maxout_forward_naive
self.shape = [100, 6, 2, 2]
self.groups=2
self.num_channels=6
done
paddle/operators/math/maxouting.cc
Outdated
int output_idx = blen + clen + f;
for (int g = 0; g < groups && continue_match; ++g) {
  int input_idx = input_idx0 + fea_size * g;
  input_grad_data[input_idx] = 0;
Please remove line 88.
Why remove an initialization to a known value? Memory often contains dirty data.
You can initialize it outside the loop like this.
input_idx changes inside the for loop; that would only work with an additional memset outside.
done
const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = output->dims()[1];
You should check whether output_channels and input.dims()[1] / groups are equal.
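For instance, a check along these lines (a sketch using PADDLE_ENFORCE_EQ, a sibling of the PADDLE_ENFORCE_GT call quoted earlier in this thread; the message text is illustrative):

  PADDLE_ENFORCE_EQ(output_channels, input.dims()[1] / groups,
                    "output_channels should equal input channels / groups.");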
It is initialized exactly this way on the outside, which is why I did not check it again here.
paddle/operators/math/maxouting.cc
Outdated
input_grad_data[input_idx] = 0;
if (input_data[input_idx] == output_data[output_idx]) {
  input_grad_data[input_idx] += output_grad_data[output_idx];
  continue_match = false;
- I don't think you should do a cumulative operation here: input_grad_data[input_idx] += output_grad_data[output_idx]
- You can replace continue_match = false; with break;.
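A sketch of the loop body with both suggestions applied (plain assignment instead of accumulation, and an early exit):

  if (input_data[input_idx] == output_data[output_idx]) {
    input_grad_data[input_idx] = output_grad_data[output_idx];
    break;
  }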
done
for (int g = 0; g < groups && continue_match; ++g) {
  if (input_data[data_idx + g * feat_len] == output_data[i]) {
    max_index = data_idx + g * feat_len;
    continue_match = false;
You can replace continue_match = false; with break;.
done
if (max_index != -1) {
  // atomic add
  platform::CudaAtomicAdd(input_grad + max_index, output_grad[index]);
}
I don't think the atomic operation is necessary.
I saw it used in pooling but couldn't find out why, so I carried it over; that code has been reviewed by many people and is still like this today.
Indeed, no atomic operation is needed here. I hadn't looked at this part carefully before, sorry...
done
float>);
REGISTER_OP_CPU_KERNEL(maxout_grad,
                       ops::MaxOutGradKernel<paddle::platform::CPUPlace,
                                             float>);
You also need to register a kernel for the double type.
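A sketch of what a registration covering both types might look like (assuming the REGISTER_OP_CPU_KERNEL macro accepts multiple kernel classes, as other operators in this codebase do):

  REGISTER_OP_CPU_KERNEL(
      maxout_grad,
      ops::MaxOutGradKernel<paddle::platform::CPUPlace, float>,
      ops::MaxOutGradKernel<paddle::platform::CPUPlace, double>);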
done
paddle/operators/maxout_op.cu.cc
Outdated
float>);
REGISTER_OP_GPU_KERNEL(maxout_grad,
                       ops::MaxOutGradKernel<paddle::platform::GPUPlace,
                                             float>);
The same as mentioned above.
done
done
paddle/operators/maxout_op.cu.cc
Outdated
float>);
REGISTER_OP_GPU_KERNEL(maxout_grad,
                       ops::MaxOutGradKernel<paddle::platform::GPUPlace,
                                             float>);
done
float>);
REGISTER_OP_CPU_KERNEL(maxout_grad,
                       ops::MaxOutGradKernel<paddle::platform::CPUPlace,
                                             float>);
done
if (max_index != -1) {
  // atomic add
  platform::CudaAtomicAdd(input_grad + max_index, output_grad[index]);
}
I saw it used in pooling but couldn't find out why, so I carried it over; that code has been reviewed by many people and is still like this today.
for (int g = 0; g < groups && continue_match; ++g) {
  if (input_data[data_idx + g * feat_len] == output_data[i]) {
    max_index = data_idx + g * feat_len;
    continue_match = false;
done
paddle/operators/math/maxouting.cc
Outdated
input_grad_data[input_idx] = 0;
if (input_data[input_idx] == output_data[output_idx]) {
  input_grad_data[input_idx] += output_grad_data[output_idx];
  continue_match = false;
done
done. I'd still prefer all issues to be raised in one review pass. Finding problems sporadically and then fixing and committing them separately eats a lot of time in compilation; finishing a round in half an hour is already considered fast!
const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = output->dims()[1];
It is initialized exactly this way on the outside, which is why I did not check it again here.
if (max_index != -1) {
  // atomic add
  platform::CudaAtomicAdd(input_grad + max_index, output_grad[index]);
}
done
This change adds the maxout op.
The CMakeLists file under operators was also modified so that the code under the op's math path gets compiled (following pool as a reference).
resolve #5570
resolve #3737