
Refactor operator python test framework and add sum operator #3882

Merged: 6 commits merged into PaddlePaddle:develop on Sep 11, 2017

Conversation

QiJune (Member) commented on Sep 5, 2017

wangkuiyi (Collaborator) left a comment:

A few comments. Otherwise LGTM!

@@ -311,9 +332,9 @@ class InferShapeContext {
}

template <typename T>
- std::vector<const T*> MultiOutput(const std::string& name) const {
+ std::vector<T*> MultiOutput(const std::string& name) const {

Collaborator:
I am not sure, should we name this DuplicableOutput?

QiJune (Member, Author):
Yes, I think DuplicableOutput is more accurate.

using Tensor = framework::Tensor;
template <typename T, int MajorType = Eigen::RowMajor,
typename IndexType = Eigen::DenseIndex>
using EigenVector = framework::EigenVector<T, MajorType, IndexType>;

Collaborator:
Let's not use "using" in header files. See https://google.github.io/styleguide/cppguide.html#Namespaces

I think we can move this using directive into class SumKernel.

QiJune (Member, Author):
Currently, all the operators have the same problem. I will open another PR to move the using declarations out of the header files for all operators.

return result;
}

const std::vector<std::string> OutputsNames() const {

reyoung (Collaborator) commented on Sep 6, 2017:
Is that the same as the OutputVars method?

QiJune (Member, Author):
It seems to be the same. I will remove this method.

auto ins = ctx.MultiInput<framework::Tensor>("X");
auto *out = ctx.Output<framework::Tensor>("Out");
int N = ins.size();

Collaborator:
This should check that all input dims are the same.

QiJune (Member, Author):
Done

result.device(place) = in;
for (int i = 1; i < N; i++) {
auto in = EigenVector<T>::Flatten(*(ins[i]));
result.device(place) = result + in;

Collaborator:
This implementation is very slow because it launches many GPU kernels. Maybe we could have a better implementation.

QiJune (Member, Author):
Yes, I will make a more efficient implementation in the next PR. This PR mainly focuses on supporting multiple inputs/outputs.

import paddle.v2.framework.core as core
from paddle.v2.framework.op import Operator


Collaborator:
Maybe "@Grad" should be a global variable.


for ins in Operator.get_op_inputs(op_type):
in_name = ins[0]
in_dup = ins[1]

reyoung (Collaborator) commented on Sep 6, 2017:
for in_name, in_dup in Operator.get_op_inputs(op_type):

QiJune (Member, Author):
Done

x0 = np.random.random((3, 4)).astype('float32')
x1 = np.random.random((3, 4)).astype('float32')
x2 = np.random.random((3, 4)).astype('float32')
self.inputs = {"X": {"x0": x0, "x1": x1, "x2": x2}}

Collaborator:
I think maybe the user does not want to specify names for x0, x1 and x2.

Just,

self.inputs = {
  "X": [x0, x1, x2]
}

is OK.

Contributor:
> Just,
>
> self.inputs = {
>   "X": [x0, x1, x2]
> }
>
> is OK.

If so, how do we generate the names for the multiple inputs "x0", "x1" and "x2"? Generate these names automatically according to 'X'?

QiJune (Member, Author):
@reyoung @qingqing01 In the gradient check process, we can choose a specific input, such as

self.check_grad(["x0"], "Out")

So explicitly setting a name is more flexible.
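
For context, here is a minimal plain-numpy sketch of the point above (the dict layout mirrors self.inputs from the test; everything else is illustrative, not the test framework's actual API): because each duplicable sub-input carries its own name, a gradient check can target exactly one of them, analogous to self.check_grad(["x0"], "Out").

import numpy as np

# Named sub-inputs under the duplicable key "X", mirroring self.inputs above.
inputs = {
    "X": {
        "x0": np.random.random((3, 4)).astype('float32'),
        "x1": np.random.random((3, 4)).astype('float32'),
        "x2": np.random.random((3, 4)).astype('float32'),
    }
}

# Forward pass of the sum operator: element-wise sum of all sub-inputs.
out = sum(inputs["X"].values())

# Because "x0" has an explicit name, a check can be limited to it alone;
# for an element-wise sum, d(Out)/d(x0) is 1 everywhere.
grad_x0 = np.ones_like(inputs["X"]["x0"])
print(out.shape, grad_x0.shape)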

tensor = var.get_tensor()
kwargs[out_name].append(out_name)

# for attr_name in Operator.get_op_attr_names(op_type):

Collaborator:
So where are the attributes?

QiJune (Member, Author):
Done



def remove_grad_var_name(var_name):
return var_name[0:-5]

Contributor:
The remove_grad_var_name is not used?

QiJune (Member, Author):
I will remove it.

kwargs[in_name].append(sub_in_name)
else:
var = scope.new_var(in_name)
tensor = var.get_tensor()

Contributor:
Remove tensor = var.get_tensor(); we only need to create the new var.

Also remove line 28.

QiJune (Member, Author):
Yes, I will remove it later

kwargs[out_name].append(sub_in_name)
else:
var = scope.new_var(out_name)
tensor = var.get_tensor()

Contributor:
remove line 44 and line 48

QiJune (Member, Author):
Done

return Operator(op_type, **kwargs)


def set_input(scope, op, inputs, place):

Contributor:
The set_input can be reused in create_op.
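
As a rough illustration of that suggestion, here is a self-contained sketch that uses a plain dict as a stand-in for the real Scope (the helper names and the dict-based scope are assumptions for illustration, not the framework's API): create_op would only declare variables, while a single set_input pass fills in the data and can be reused elsewhere.

import numpy as np

# Stand-in "scope": variable name -> array. The real framework stores
# core.Scope variables holding tensors instead of a plain dict.
def create_op_vars(scope, inputs):
    # Declare variables only; no data is copied here.
    for in_name, value in inputs.items():
        if isinstance(value, dict):        # duplicable input: named sub-inputs
            for sub_name in value:
                scope[sub_name] = None
        else:                              # ordinary input: the key is the variable name
            scope[in_name] = None

def set_input(scope, inputs):
    # Shared data-filling pass, reusable by create_op and by the checkers.
    for in_name, value in inputs.items():
        if isinstance(value, dict):
            for sub_name, arr in value.items():
                scope[sub_name] = np.asarray(arr)
        else:
            scope[in_name] = np.asarray(value)

scope = {}
inputs = {"X": {"x0": np.zeros((3, 4), dtype='float32'),
                "x1": np.ones((3, 4), dtype='float32')}}
create_op_vars(scope, inputs)
set_input(scope, inputs)
print(sorted(scope))  # ['x0', 'x1']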

# add delta to it, run op and then get the sum of the result tensor.
x_pos = origin + delta
tensor_to_check.set_float_element(i, x_pos)
y_pos = get_output()

Contributor:
After line 136, we need to add (@reyoung):

tensor_to_check.set_float_element(i, origin)

QiJune (Member, Author):
The gradient_check.py does not have this line.

Collaborator:
@qingqing01 It seems that it is not needed.
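
To make the perturbation logic concrete, here is a small self-contained numpy sketch of a central-difference gradient estimate in the spirit of the snippet above (the helper and its arguments are illustrative; the checker's exact scheme is not shown here). Because each perturbation is computed from the saved origin value rather than from the current tensor contents, no extra restore is required between the two evaluations; the element is written back once at the end of each iteration.

import numpy as np

# Central-difference estimate of d(sum(f(x)))/dx, one element at a time.
def numeric_gradient(f, x, delta=1e-2):
    grad = np.zeros_like(x)
    flat_x = x.reshape(-1)
    flat_g = grad.reshape(-1)
    for i in range(flat_x.size):
        origin = flat_x[i]
        flat_x[i] = origin + delta     # perturb upward, as in x_pos above
        y_pos = f(x).sum()             # run the op and sum the result tensor
        flat_x[i] = origin - delta     # perturb downward from the saved origin
        y_neg = f(x).sum()
        flat_x[i] = origin             # write the original value back
        flat_g[i] = (y_pos - y_neg) / (2.0 * delta)
    return grad

# Example: the gradient of sum(x * x) is 2 * x.
x = np.random.random((3, 4)).astype('float64')
approx = numeric_gradient(lambda t: t * t, x)
print(np.allclose(approx, 2 * x, atol=1e-3))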

var.get_tensor()
for output in backward_op.outputs_names():
var = scope.new_var(output)
var.get_tensor()

Contributor:
remove line 155 and line 158

QiJune (Member, Author):
Done


in_dup = ins[1]
if in_name in inputs:
kwargs[in_name] = []
if in_dup:

qingqing01 (Contributor) commented on Sep 7, 2017:
In the C++ operators, the storage format of inputs/outputs is the same whether or not they are duplicable:

using VariableNameMap = std::map<std::string, std::vector<std::string>>;

So the Python side could also use this unified format and not distinguish in_dup, which would mean:

  • A single input is written as: inputs = {'X': ['X']} (even one input has to be written as ['X'])
  • Multiple inputs (say, two) are written as: inputs = {'X': ['X1', 'X2']}

(assuming the key registered in the ProtoMaker is "X")

That way many places would no longer need the if in_dup: else branches.

QiJune (Member, Author) commented on Sep 8, 2017:

That is true. But most operators do not have duplicable inputs, so we can make the user's configuration a bit simpler.
The current logic is:
1. For an ordinary input/output, the user just provides a numpy array, and the test framework automatically uses the key as the variable name. The benefit is a slightly simpler configuration for the user.
2. For a duplicable input/output, the user provides a dict like {"name1": array1, "name2": array2}, and the test framework picks up each name. The benefit is that the user can specify exactly which input's gradient to check, for example how a change in name1 affects the output.
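
A short sketch contrasting the two configuration styles discussed in this thread (plain Python values; the keys and variable names are illustrative):

import numpy as np

# Unified list format suggested above, mirroring the C++ VariableNameMap
# (std::map<std::string, std::vector<std::string>>): always a list of names.
inputs_uniform_single = {"X": ["X"]}        # a single input is still wrapped in a list
inputs_uniform_multi = {"X": ["X1", "X2"]}  # a duplicable input with two variables

# Format kept in this PR: ordinary inputs are bare arrays (the key doubles as
# the variable name); duplicable inputs map explicit names to arrays, so a
# specific sub-input such as "x0" can be targeted by check_grad.
x0 = np.zeros((3, 4), dtype='float32')
x1 = np.ones((3, 4), dtype='float32')
inputs_plain = {"X": x0}
inputs_duplicable = {"X": {"x0": x0, "x1": x1}}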

reyoung (Collaborator) left a comment:

LGTM, except that the gradient of sum could be composed of many identity operators.

QiJune (Member, Author) commented on Sep 11, 2017:

@reyoung I will merge this PR first, and will refine both the forward and backward implementations of the sum operator in the next PR.

QiJune merged commit c169669 into PaddlePaddle:develop on Sep 11, 2017.