Refactor operator python test framework and add sum operator #3882
Conversation
QiJune commented on Sep 5, 2017 (edited)
- Fix #3886 (operator python test needs to support duplicate inputs/outputs): support duplicate inputs/outputs.
- Implement a sum operator to exercise the refactored operator python test framework.
- Reduce code and simplify the user's configuration (a sketch of the resulting test configuration follows below).
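For context, here is a minimal self-contained sketch of the test data this PR's sum operator works with, using plain numpy (the framework's actual test class and metaclass are omitted) and assuming "X" is a duplicable input with three named sub-inputs, as in the snippets quoted in the review below:

```python
import numpy as np

# Three duplicate inputs under the single key "X", each with an explicit name.
x0 = np.random.random((3, 4)).astype('float32')
x1 = np.random.random((3, 4)).astype('float32')
x2 = np.random.random((3, 4)).astype('float32')
inputs = {"X": {"x0": x0, "x1": x1, "x2": x2}}

# The sum operator adds all sub-inputs element-wise.
expected = {"Out": x0 + x1 + x2}
np.testing.assert_allclose(sum(inputs["X"].values()), expected["Out"])
```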
A few comments. Otherwise LGTM!
@@ -311,9 +332,9 @@ class InferShapeContext {
  }

  template <typename T>
- std::vector<const T*> MultiOutput(const std::string& name) const {
+ std::vector<T*> MultiOutput(const std::string& name) const {
I am not sure. Should we name this DuplicableOutput?
Yes, I think DuplicableOutput is more accurate.
using Tensor = framework::Tensor;
template <typename T, int MajorType = Eigen::RowMajor,
          typename IndexType = Eigen::DenseIndex>
using EigenVector = framework::EigenVector<T, MajorType, IndexType>;
Let's not use `using` at namespace scope in header files: https://google.github.io/styleguide/cppguide.html#Namespaces
I think we can move this using declaration into class SumKernel.
Currently, all the operators have the same problem. I will open another PR to move the using declarations out of the header files of all operators.
paddle/framework/operator.h (Outdated)
return result;
}

const std::vector<std::string> OutputsNames() const {
Is that the same as the OutputVars method?
It seems the same. I will remove this method
auto ins = ctx.MultiInput<framework::Tensor>("X");
auto *out = ctx.Output<framework::Tensor>("Out");
int N = ins.size();
This should check that the dims of all inputs are the same.
Done
result.device(place) = in;
for (int i = 1; i < N; i++) {
  auto in = EigenVector<T>::Flatten(*(ins[i]));
  result.device(place) = result + in;
This implementation is very slow because it launches many GPU kernels. Maybe we could have a better implementation.
Yes, I will provide a more efficient implementation in the next PR. This PR mainly focuses on supporting multiple inputs/outputs.
import paddle.v2.framework.core as core
from paddle.v2.framework.op import Operator
Maybe "@Grad" should be a global variable.
for ins in Operator.get_op_inputs(op_type):
    in_name = ins[0]
    in_dup = ins[1]
for in_name, in_dup in Operator.get_op_inputs(op_type):
Done
x0 = np.random.random((3, 4)).astype('float32')
x1 = np.random.random((3, 4)).astype('float32')
x2 = np.random.random((3, 4)).astype('float32')
self.inputs = {"X": {"x0": x0, "x1": x1, "x2": x2}}
I think maybe the user does not want to specify names for x0, x1, x2. Just
self.inputs = {"X": [x0, x1, x2]}
is OK.
> Just, self.inputs = {"X": [x0, x1, x2]} is OK.

If so, how do we generate the names "x0", "x1" and "x2" for the multiple inputs? Should these names be generated automatically from 'X'?
@reyoung @qingqing01 In the gradient-check process, we can choose a specific input, such as
self.check_grad(["x0"], "Out")
So explicitly setting a name is more flexible.
tensor = var.get_tensor()
kwargs[out_name].append(out_name)

# for attr_name in Operator.get_op_attr_names(op_type):
So where are the attributes?
Done
def remove_grad_var_name(var_name):
    return var_name[0:-5]
It looks like remove_grad_var_name is not used?
I will remove it.
    kwargs[in_name].append(sub_in_name)
else:
    var = scope.new_var(in_name)
    tensor = var.get_tensor()
Remove tensor = var.get_tensor(); we only need to create the new var. Also remove line 28.
Yes, I will remove it later
    kwargs[out_name].append(sun_in_name)
else:
    var = scope.new_var(out_name)
    tensor = var.get_tensor()
remove line 44 and line 48
Done
return Operator(op_type, **kwargs)


def set_input(scope, op, inputs, place):
The set_input function can be reused in create_op.
# add delta to it, run op and then get the sum of the result tensor.
x_pos = origin + delta
tensor_to_check.set_float_element(i, x_pos)
y_pos = get_output()
@reyoung After line 136, we need to add:
tensor_to_check.set_float_element(i, origin)
gradient_check.py does not have this line.
@qingqing01 It seems that it is not needed.
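To make this thread concrete, here is a self-contained sketch of a finite-difference gradient check, including the restore step the reviewer suggested; it uses a plain numpy array in place of the framework tensor, and the helper names are illustrative:

```python
import numpy as np

def numeric_gradient(f, x, delta=1e-4):
    """Central-difference gradient of a scalar-valued f at x."""
    x = x.astype('float64').copy()
    flat = x.ravel()                 # view into x
    grad = np.zeros_like(flat)
    for i in range(flat.size):
        origin = flat[i]
        flat[i] = origin + delta     # x_pos
        y_pos = f(x)
        flat[i] = origin - delta     # x_neg
        y_neg = f(x)
        flat[i] = origin             # restore the element; without this, later
                                     # iterations perturb an already-modified value
        grad[i] = (y_pos - y_neg) / (2 * delta)
    return grad.reshape(x.shape)

# d/dx sum(x^2) = 2x
x = np.random.random((3, 4))
assert np.allclose(numeric_gradient(lambda v: (v ** 2).sum(), x), 2 * x)
```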
var.get_tensor()
for output in backward_op.outputs_names():
    var = scope.new_var(output)
    var.get_tensor()
remove line 155 and line 158
Done
in_dup = ins[1]
if in_name in inputs:
    kwargs[in_name] = []
    if in_dup:
In the C++ operators, inputs/outputs are stored in the same format regardless of whether they are duplicable:
using VariableNameMap = std::map<std::string, std::vector<std::string>>;
So Python could use the same unified format and drop the in_dup distinction, but that would mean:
- a single input is written as: inputs = {'X': ['X']} (even one input has to be written as ['X'])
- multiple inputs (say two) are written as: inputs = {'X': ['X1', 'X2']}
(assuming the key registered in the ProtoMaker is "X")
That way many places would no longer need the if in_dup: else branching, right?
That is true. But most operators do not have duplicable inputs, so we can make the user's configuration a bit simpler.
The current handling logic is:
1. For ordinary inputs/outputs, the user just provides a numpy array, and the test framework automatically uses the key as the variable name. The benefit is a slightly simpler user configuration.
2. For duplicable inputs/outputs, the user provides a dict such as {"name1": array1, "name2": array2}, and the test framework picks up each name. The benefit is that the user can specify exactly which input's gradient to check, e.g. how a change in name1's value changes the output.
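An illustrative sketch (not the actual framework code) of the two accepted styles described above, normalized toward the C++ VariableNameMap layout:

```python
import numpy as np

def normalize_inputs(inputs):
    """Map each op input key to an ordered list of (var_name, array) pairs."""
    normalized = {}
    for key, value in inputs.items():
        if isinstance(value, dict):
            # Duplicable input: the user names each sub-variable explicitly,
            # e.g. {"X": {"x0": a0, "x1": a1}} -> X holds ["x0", "x1"].
            normalized[key] = list(value.items())
        else:
            # Ordinary input: the key itself is the variable name,
            # e.g. {"X": a} -> X holds ["X"].
            normalized[key] = [(key, value)]
    return normalized

x = np.ones((3, 4), dtype='float32')
print([name for name, _ in normalize_inputs({"X": {"x0": x, "x1": x}})["X"]])
# ['x0', 'x1']
print([name for name, _ in normalize_inputs({"X": x})["X"]])
# ['X']
```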
LGTM, except that the gradient of sum could be composed of many identity operators.
@reyoung I will merge this PR first, and will refine both the forward and backward implementations of the sum operator in the next PR.
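A tiny numeric illustration of that closing remark: since d(sum)/d(x_i) is the identity for every input, the backward pass can simply copy the output gradient to each input gradient (one identity op per input):

```python
import numpy as np

d_out = np.random.random((3, 4)).astype('float32')          # gradient of "Out"
d_x = {name: d_out.copy() for name in ("x0", "x1", "x2")}   # identity per input
assert all(np.array_equal(g, d_out) for g in d_x.values())
```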