
Refactor operator python test framework and add sum operator #3882

Merged: 6 commits merged into PaddlePaddle:develop on Sep 11, 2017

Conversation

QiJune (Member) commented on Sep 5, 2017

wangkuiyi (Collaborator) left a comment:

A few comments. Otherwise LGTM!

@@ -311,9 +332,9 @@ class InferShapeContext {
}

template <typename T>
- std::vector<const T*> MultiOutput(const std::string& name) const {
+ std::vector<T*> MultiOutput(const std::string& name) const {

Collaborator:
I am not sure, should we name this DuplicableOutput?

QiJune (Member, Author):
Yes, I think DuplicableOutput is more accurate.

using Tensor = framework::Tensor;
template <typename T, int MajorType = Eigen::RowMajor,
typename IndexType = Eigen::DenseIndex>
using EigenVector = framework::EigenVector<T, MajorType, IndexType>;

Collaborator:
Let's not use "using" in header files. See https://google.github.io/styleguide/cppguide.html#Namespaces

I think we can move this using directive into class SumKernel.

QiJune (Member, Author):
Currently, all the operators have the same problem. I will open another PR to move the using declarations out of the header files for all operators.

return result;
}

const std::vector<std::string> OutputsNames() const {

reyoung (Collaborator) commented on Sep 6, 2017:
Is that the same as the OutputVars method?

QiJune (Member, Author):
It seems to be the same. I will remove this method.

auto ins = ctx.MultiInput<framework::Tensor>("X");
auto *out = ctx.Output<framework::Tensor>("Out");
int N = ins.size();

Collaborator:
This should check that all input dims are the same.

QiJune (Member, Author):
Done

result.device(place) = in;
for (int i = 1; i < N; i++) {
auto in = EigenVector<T>::Flatten(*(ins[i]));
result.device(place) = result + in;

Collaborator:
This implementation is very slow because it launches many GPU kernels. Maybe we could have a better implementation.

QiJune (Member, Author):
Yes, I will make a more efficient implementation in the next PR. This PR mainly focuses on supporting multiple inputs/outputs.

import paddle.v2.framework.core as core
from paddle.v2.framework.op import Operator


Collaborator:
Maybe "@Grad" should be a global variable.


for ins in Operator.get_op_inputs(op_type):
in_name = ins[0]
in_dup = ins[1]

reyoung (Collaborator) commented on Sep 6, 2017:
for in_name, in_dup in Operator.get_op_inputs(op_type):

QiJune (Member, Author):
Done

x0 = np.random.random((3, 4)).astype('float32')
x1 = np.random.random((3, 4)).astype('float32')
x2 = np.random.random((3, 4)).astype('float32')
self.inputs = {"X": {"x0": x0, "x1": x1, "x2": x2}}

Collaborator:
I think maybe the user does not want to specify names for x0, x1 and x2.

Just,

self.inputs = {
  "X": [x0, x1, x2]
}

is OK.

Contributor:
> Just,
>
> self.inputs = {
>   "X": [x0, x1, x2]
> }
>
> is OK.

If so, how do we generate the names for the multiple inputs "x0", "x1" and "x2"? Generate these names automatically according to 'X'?

QiJune (Member, Author):
@reyoung @qingqing01 In the gradient check process, we can choose a specific input, such as

self.check_grad(["x0"], "Out")

So explicitly setting a name is more flexible.
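
For context, here is a minimal plain-numpy sketch of the point above (the dict layout mirrors self.inputs from the test; everything else is illustrative, not the test framework's actual API): because each duplicable sub-input carries its own name, a gradient check can target exactly one of them, analogous to self.check_grad(["x0"], "Out").

import numpy as np

# Named sub-inputs under the duplicable key "X", mirroring self.inputs above.
inputs = {
    "X": {
        "x0": np.random.random((3, 4)).astype('float32'),
        "x1": np.random.random((3, 4)).astype('float32'),
        "x2": np.random.random((3, 4)).astype('float32'),
    }
}

# Forward pass of the sum operator: element-wise sum of all sub-inputs.
out = sum(inputs["X"].values())

# Because "x0" has an explicit name, a check can be limited to it alone;
# for an element-wise sum, d(Out)/d(x0) is 1 everywhere.
grad_x0 = np.ones_like(inputs["X"]["x0"])
print(out.shape, grad_x0.shape)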

tensor = var.get_tensor()
kwargs[out_name].append(out_name)

# for attr_name in Operator.get_op_attr_names(op_type):

Collaborator:
So where are the attributes?

QiJune (Member, Author):
Done



def remove_grad_var_name(var_name):
return var_name[0:-5]

Contributor:
The remove_grad_var_name is not used?

QiJune (Member, Author):
I will remove it.

kwargs[in_name].append(sub_in_name)
else:
var = scope.new_var(in_name)
tensor = var.get_tensor()

Contributor:
Remove tensor = var.get_tensor(); we only need to create the new var.

Also remove line 28.

QiJune (Member, Author):
Yes, I will remove it later

kwargs[out_name].append(sub_in_name)
else:
var = scope.new_var(out_name)
tensor = var.get_tensor()

Contributor:
remove line 44 and line 48

QiJune (Member, Author):
Done

return Operator(op_type, **kwargs)


def set_input(scope, op, inputs, place):

Contributor:
The set_input can be reused in create_op.
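
As a rough illustration of that suggestion, here is a self-contained sketch that uses a plain dict as a stand-in for the real Scope (the helper names and the dict-based scope are assumptions for illustration, not the framework's API): create_op would only declare variables, while a single set_input pass fills in the data and can be reused elsewhere.

import numpy as np

# Stand-in "scope": variable name -> array. The real framework stores
# core.Scope variables holding tensors instead of a plain dict.
def create_op_vars(scope, inputs):
    # Declare variables only; no data is copied here.
    for in_name, value in inputs.items():
        if isinstance(value, dict):        # duplicable input: named sub-inputs
            for sub_name in value:
                scope[sub_name] = None
        else:                              # ordinary input: the key is the variable name
            scope[in_name] = None

def set_input(scope, inputs):
    # Shared data-filling pass, reusable by create_op and by the checkers.
    for in_name, value in inputs.items():
        if isinstance(value, dict):
            for sub_name, arr in value.items():
                scope[sub_name] = np.asarray(arr)
        else:
            scope[in_name] = np.asarray(value)

scope = {}
inputs = {"X": {"x0": np.zeros((3, 4), dtype='float32'),
                "x1": np.ones((3, 4), dtype='float32')}}
create_op_vars(scope, inputs)
set_input(scope, inputs)
print(sorted(scope))  # ['x0', 'x1']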

# add delta to it, run op and then get the sum of the result tensor.
x_pos = origin + delta
tensor_to_check.set_float_element(i, x_pos)
y_pos = get_output()

Contributor:
After line 136, we need to add (@reyoung):

tensor_to_check.set_float_element(i, origin)

QiJune (Member, Author):
The gradient_check.py does not have this line.

Collaborator:
@qingqing01 It seems that it is not needed.
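
To make the perturbation logic concrete, here is a small self-contained numpy sketch of a central-difference gradient estimate in the spirit of the snippet above (the helper and its arguments are illustrative; the checker's exact scheme is not shown here). Because each perturbation is computed from the saved origin value rather than from the current tensor contents, no extra restore is required between the two evaluations; the element is written back once at the end of each iteration.

import numpy as np

# Central-difference estimate of d(sum(f(x)))/dx, one element at a time.
def numeric_gradient(f, x, delta=1e-2):
    grad = np.zeros_like(x)
    flat_x = x.reshape(-1)
    flat_g = grad.reshape(-1)
    for i in range(flat_x.size):
        origin = flat_x[i]
        flat_x[i] = origin + delta     # perturb upward, as in x_pos above
        y_pos = f(x).sum()             # run the op and sum the result tensor
        flat_x[i] = origin - delta     # perturb downward from the saved origin
        y_neg = f(x).sum()
        flat_x[i] = origin             # write the original value back
        flat_g[i] = (y_pos - y_neg) / (2.0 * delta)
    return grad

# Example: the gradient of sum(x * x) is 2 * x.
x = np.random.random((3, 4)).astype('float64')
approx = numeric_gradient(lambda t: t * t, x)
print(np.allclose(approx, 2 * x, atol=1e-3))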

var.get_tensor()
for output in backward_op.outputs_names():
var = scope.new_var(output)
var.get_tensor()

Contributor:
remove line 155 and line 158

QiJune (Member, Author):
Done


in_dup = ins[1]
if in_name in inputs:
kwargs[in_name] = []
if in_dup:

qingqing01 (Contributor) commented on Sep 7, 2017:
In the C++ operators, the storage format of inputs/outputs is the same whether or not they are duplicable:

using VariableNameMap = std::map<std::string, std::vector<std::string>>;

So the Python side could also use this unified format and not distinguish in_dup, which would mean:

  • A single input is written as: inputs = {'X': ['X']} (even one input has to be written as ['X'])
  • Multiple inputs (say, two) are written as: inputs = {'X': ['X1', 'X2']}

(assuming the key registered in the ProtoMaker is "X")

That way many places would no longer need the if in_dup: else branches.

QiJune (Member, Author) commented on Sep 8, 2017:

That is true. But most operators do not have duplicable inputs, so we can make the user's configuration a bit simpler.
The current logic is:
1. For an ordinary input/output, the user just provides a numpy array, and the test framework automatically uses the key as the variable name. The benefit is a slightly simpler configuration for the user.
2. For a duplicable input/output, the user provides a dict like {"name1": array1, "name2": array2}, and the test framework picks up each name. The benefit is that the user can specify exactly which input's gradient to check, for example how a change in name1 affects the output.
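
A short sketch contrasting the two configuration styles discussed in this thread (plain Python values; the keys and variable names are illustrative):

import numpy as np

# Unified list format suggested above, mirroring the C++ VariableNameMap
# (std::map<std::string, std::vector<std::string>>): always a list of names.
inputs_uniform_single = {"X": ["X"]}        # a single input is still wrapped in a list
inputs_uniform_multi = {"X": ["X1", "X2"]}  # a duplicable input with two variables

# Format kept in this PR: ordinary inputs are bare arrays (the key doubles as
# the variable name); duplicable inputs map explicit names to arrays, so a
# specific sub-input such as "x0" can be targeted by check_grad.
x0 = np.zeros((3, 4), dtype='float32')
x1 = np.ones((3, 4), dtype='float32')
inputs_plain = {"X": x0}
inputs_duplicable = {"X": {"x0": x0, "x1": x1}}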

reyoung (Collaborator) left a comment:

LGTM, except that the gradient of sum could be composed of many identity operators.

QiJune (Member, Author) commented on Sep 11, 2017:

@reyoung I will merge this PR first, and will refine both the forward and backward implementations of the sum operator in the next PR.

QiJune merged commit c169669 into PaddlePaddle:develop on Sep 11, 2017.