[PaddlePaddle Hackathon 2] Task 8: Add a nanmean API to Paddle #40472
Thanks for your contribution!
The PR format check passed. Your PR will be reviewed by Paddle experts and the open-source community; please keep an eye on its status.
python/paddle/tensor/math.py
Outdated
@@ -967,6 +967,81 @@ def nansum(x, axis=None, dtype=None, keepdim=False, name=None):
    return sum(tmp_tensor, axis, dtype, keepdim, name)


def nanmean(x,axis=None,keepdim=None,name=None):
Would keepdim=False be a better default, to stay consistent with the other APIs?
Thank you for your suggestion! Staying consistent with the other APIs is the better way. I will update this.
axis = [axis]
if axis == None:
    return paddle.mean(x[~paddle.isnan(x)], keepdim=keepdim,name=name)
check_variable_and_dtype(x, 'x/input',
Should this be just 'x' instead of 'x/input'?
This follows the code of paddle.mean.
'x/input' seems to be used only as the input name when a type/dtype error is raised.
if axis == None:
    return paddle.mean(x[~paddle.isnan(x)], keepdim=keepdim,name=name)
check_variable_and_dtype(x, 'x/input',
                         ['uint16', 'float16', 'float32', 'float64'],
The checked dtypes should match the description of x above. For example, 'uint16' does not appear in "x (Tensor): The input Tensor with data type float32, float64."
This follows the code of paddle.mean.
Although its documentation requires the input tensor to have data type float32 or float64, its dtype check also allows 'uint16' and 'float16' (the code is here).
Because the issue (described here) states that paddle.nanmean extends the functionality of the paddle.mean API, I simply followed the code of paddle.mean.
python/paddle/tensor/math.py
Outdated
if axis == None:
    return paddle.mean(x[~paddle.isnan(x)], keepdim=keepdim,name=name)
Can the logic at L1041~L1042 below cover this branch? Do we need to handle the 'axis == None' case separately?
Thank you for your suggestion!
The logic at L1041~L1042 below does cover this branch, so we don't need to handle this case separately.
I will take this advice and fix it.
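The point above can be illustrated with a minimal, framework-independent sketch. L1041~L1042 is not shown in this thread, so the formulation here (NaN-aware sum divided by the count of non-NaN entries) is an assumption, and `nanmean_2d` is a hypothetical helper for 2-D lists of rows, not Paddle code. It shows how normalizing `axis=None` to "reduce over every axis" lets one code path serve all cases.

```python
import math


def nanmean_2d(x, axis=None, keepdim=False):
    """Hypothetical sketch: NaN-aware mean of a 2-D list of rows.

    axis=None is normalized to "reduce over both axes", so it shares the
    sum/count code path with axis=0 and axis=1 instead of needing a branch.
    """
    rows, cols = len(x), len(x[0])
    axes = (0, 1) if axis is None else (axis,)

    # Output position of element (i, j): reduced axes collapse to index 0.
    def key(i, j):
        return (0 if 0 in axes else i, 0 if 1 in axes else j)

    total, count = {}, {}
    for i in range(rows):
        for j in range(cols):
            k = key(i, j)
            total.setdefault(k, 0.0)
            count.setdefault(k, 0)
            if not math.isnan(x[i][j]):  # NaN entries add nothing to sum or count
                total[k] += x[i][j]
                count[k] += 1

    out_rows = 1 if 0 in axes else rows
    out_cols = 1 if 1 in axes else cols
    # Assumption of this sketch: a slice with no valid entries yields NaN.
    out = [[total[(i, j)] / count[(i, j)] if count[(i, j)] else math.nan
            for j in range(out_cols)] for i in range(out_rows)]

    if keepdim:
        return out
    if axis is None:
        return out[0][0]
    return out[0] if axis == 0 else [row[0] for row in out]


x = [[math.nan, 1.0, 2.0],
     [3.0, 4.0, 5.0]]
overall = nanmean_2d(x)             # no special-case branch needed
per_column = nanmean_2d(x, axis=0)
per_row = nanmean_2d(x, axis=1)
```

The only place `axis is None` matters after normalization is when squeezing the output shape, which is why a separate early return adds no behavior.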

    def setUp(self):
        self.x_shape = [2, 3, 4, 5]
        self.x = np.random.uniform(-1, 1, self.x_shape).astype(np.float32)
self.x does not contain any NaN values; the tests should cover all the test cases listed in the RFC.
They should also include a gradient check.
Thanks for your suggestion! I will cover all the test cases in the next update.
However, I am unsure how to check the gradient, because I can't find an example of a gradient check in test_mean_op.py, the test file for paddle.mean.
I would appreciate it if you could give me an example.
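As a rough illustration of what a gradient check for a NaN-skipping mean could verify, here is a framework-independent sketch. It does not use Paddle's own test utilities; `nanmean`, `analytic_grad`, and `numeric_grad` are hypothetical helpers over flat Python lists. It assumes the gradient is 1/n at each of the n non-NaN positions and 0 at NaN positions, and compares that against central finite differences on an input that, unlike `self.x` above, actually contains NaN.

```python
import math
import random


def nanmean(values):
    """Mean over the non-NaN entries of a flat list."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


def analytic_grad(values):
    """Assumed gradient: 1/n at the n non-NaN positions, 0 at NaN positions."""
    n = sum(1 for v in values if not math.isnan(v))
    return [0.0 if math.isnan(v) else 1.0 / n for v in values]


def numeric_grad(values, eps=1e-6):
    """Central finite differences; NaN positions are skipped and reported as 0."""
    grad = []
    for i, v in enumerate(values):
        if math.isnan(v):
            grad.append(0.0)
            continue
        plus = values[:i] + [v + eps] + values[i + 1:]
        minus = values[:i] + [v - eps] + values[i + 1:]
        grad.append((nanmean(plus) - nanmean(minus)) / (2 * eps))
    return grad


random.seed(0)
# 2 * 3 * 4 * 5 elements, matching the size of x_shape in setUp, flattened.
x = [random.uniform(-1, 1) for _ in range(2 * 3 * 4 * 5)]
for i in random.sample(range(len(x)), 10):  # plant NaN at 10 distinct positions
    x[i] = math.nan

expected = analytic_grad(x)
measured = numeric_grad(x)
max_err = max(abs(a - b) for a, b in zip(expected, measured))
```

A real test in the Paddle suite would compare against the framework's computed gradient rather than a hand-written one; the sketch only shows which values such a comparison should agree on.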
Hi~ a few details need attention @Li-fAngyU
@Ligoml Done!
Updated the documentation for nanmean's axis parameter.
LGTM for docs
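Congratulations, your PR has passed testing and will be included in PaddlePaddle's release plan. Thank you for taking part in the PaddlePaddle developer community.
Your PR has new feedback; please revise it promptly.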
update nanmean's sample code (:name: code-example1)
LGTM
Fix errors in nanmean's example code
update example code
update example code of nanmean
LGTM for docs
LGTM
PR types
New features
PR changes
APIs
Describe
Issue link: #40327
RFC PR link: PaddlePaddle/community#48
Chinese documentation PR link: PaddlePaddle/docs#4294
Add a nanmean API to Paddle
paddle.nanmean extends the functionality of the paddle.mean API. If the input Tensor contains NaN values, paddle.mean sets every result that involves a NaN to NaN, whereas paddle.nanmean skips the NaN values. For example, given the input x = [[nan, 1. , 2. ], [3. , 4. , 5. ]], x.mean() yields [nan], x.mean(0) yields [nan, 2.5, 3.5], x.nanmean() yields [3.], and x.nanmean(0) yields [3., 2.5, 3.5]. The API must support two call paths: paddle.nanmean and Tensor.nanmean.
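The numbers in the description can be reproduced with a small pure-Python sketch. This is not the Paddle implementation; `mean` and `nanmean` here are hypothetical helpers over flat lists, and returning NaN for an all-NaN input is an assumption of the sketch.

```python
import math


def mean(values):
    """Plain arithmetic mean; any NaN in `values` makes the result NaN."""
    return sum(values) / len(values)


def nanmean(values):
    """Mean over the non-NaN entries (assumption: NaN if every entry is NaN)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


nan = math.nan
x = [[nan, 1.0, 2.0],
     [3.0, 4.0, 5.0]]

flat = [v for row in x for v in row]  # reduce over all elements (no axis given)
columns = list(zip(*x))               # reduce along axis 0

mean_all = mean(flat)                          # NaN propagates
mean_axis0 = [mean(c) for c in columns]        # first column is NaN
nanmean_all = nanmean(flat)                    # (1 + 2 + 3 + 4 + 5) / 5
nanmean_axis0 = [nanmean(c) for c in columns]  # first column averages only 3.0
```

In the first column only the value 3.0 is valid, so its NaN-aware mean is 3.0 rather than NaN; the other two columns are unaffected.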