[Hackathon No.50] Add float16 data type support for the Paddle lerp operator #50925
Conversation
Your PR has been submitted successfully. Thank you for contributing to the open-source project!
y = np.full(600, 1.0).astype(np.float16).reshape([50, 2, 2, 3])
w = np.random.random([3]).astype(np.float16)
self.inputs = {'X': x, 'Y': y, 'Weight': w}
self.outputs = {'Out': x + w * (y - x)}
- Of these 3 unit tests, keeping 1 is enough, since the kernel implementation does not handle different shapes differently.
- The code can also be simplified: since the class inherits TestLerp, the fp16 case only needs to implement init_dtype and init_shape.
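The simplification suggested above can be sketched as follows. This is a minimal, self-contained stand-in (not Paddle's actual OpTest-based class): the `TestLerp` base here is hypothetical and only mirrors the `init_dtype`/`init_shape` hook pattern the reviewer describes, so the fp16 case reduces to two one-line overrides.

```python
import numpy as np

# Hypothetical minimal stand-in for the OpTest-based TestLerp base class;
# only the init_dtype/init_shape hook pattern matters here.
class TestLerp:
    def setUp(self):
        self.init_dtype()
        self.init_shape()
        x = np.arange(1.0, 101.0).astype(self.dtype).reshape(self.shape)
        y = np.full(100, 10.0).astype(self.dtype).reshape(self.shape)
        w = np.asarray([0.5]).astype(self.dtype)
        self.inputs = {'X': x, 'Y': y, 'Weight': w}
        self.outputs = {'Out': x + w * (y - x)}  # lerp: x + w * (y - x)

    def init_dtype(self):
        self.dtype = np.float64

    def init_shape(self):
        self.shape = [100]

# The fp16 case then only overrides the two hooks:
class TestLerpFP16(TestLerp):
    def init_dtype(self):
        self.dtype = np.float16

    def init_shape(self):
        self.shape = [2, 50]
```

The forward reference result stays in float16 throughout, since every operand is cast to `self.dtype` before the arithmetic.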
Hello,
- Of the three unit tests I kept two; those two take different paths in the backward kernel implementation.
- The test code has been simplified as requested.
The 5 cases above and TestLerpWithFp16BroadWToXY seem to differ only in shape? x and y share the same shape and w.shape=[1], so the test content is somewhat redundant. Of those 5 cases plus TestLerpWithFp16BroadWToXY, keeping just one would be enough.
Also, couldn't TestLerpWihFp16BroadXY and TestLerpWithFp16BroadWToXY likewise be simplified by just overriding init_shape and init_dtype?
The five cases above and TestLerpWithFp16BroadWToXY are actually different: in TestLerpWithFp16BroadWToXY, w.shape=[3], which takes the backward branch that broadcasts w, while in the five cases above w.shape=[1], which takes the branch where w is a scalar.
Also, rewriting TestLerpWihFp16BroadXY and TestLerpWithFp16BroadWToXY via init_shape and init_dtype is awkward: in TestLerpWihFp16BroadXY the shapes of X and Y differ, and in TestLerpWithFp16BroadWToXY w is a random vector of shape [3] rather than the base class's scalar of shape [1]. So I chose to override the setUp function directly; is that acceptable?
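The distinction between the two backward branches can be illustrated in plain numpy (a sketch only, not the Paddle kernel code): when w has shape [3] it broadcasts along the last axis, so the gradient with respect to Weight must be reduced over the broadcast axes; the scalar case reduces over everything.

```python
import numpy as np

# Sketch of why w.shape == [3] and w.shape == [1] exercise different
# backward branches in the lerp grad kernel.
x = np.random.random([50, 2, 2, 3]).astype(np.float16)
y = np.random.random([50, 2, 2, 3]).astype(np.float16)

w_vec = np.random.random([3]).astype(np.float16)     # broadcasts over the last axis
w_scalar = np.random.random([1]).astype(np.float16)  # broadcasts as a scalar

out_vec = x + w_vec * (y - x)        # w broadcast up to [50, 2, 2, 3]
out_scalar = x + w_scalar * (y - x)

# d(Out)/d(Weight) = (y - x); the gradient must be reduced back to
# Weight's own shape, and the reduction axes differ per branch:
grad_w_vec = (y - x).sum(axis=(0, 1, 2))   # broadcast-w branch, shape [3]
grad_w_scalar = (y - x).sum()              # scalar-w branch, rank 0
```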
Understood; TestLerpWihFp16BroadXY and TestLerpWithFp16BroadWToXY can stay, then.
Actually, you could still inherit the existing TestLerpOp, but extend init_shape with the ability to set w's shape, with a small modification to TestLerpOp. That would be more general and would also simplify the code.
Perhaps only 3 cases need to be kept: TestLerpWihFp16BroadXY, TestLerpWithFp16BroadWToXY, and one where x and y share the same shape with w.shape=[1]?
OK, I have made the changes as described. I added init_wshape and init_xyshape to the base class; init_xyshape handles initialization when the shapes of x and y differ. I also simplified the original FP32 tests TestLerpBroadXY and TestLerpBroadWToXY.
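The refactored hook structure described above might look roughly like the following. This is a self-contained sketch, not the actual PR code: the base-class body and shape values are illustrative stand-ins; only the hook names init_dtype/init_shape/init_xyshape/init_wshape come from the discussion.

```python
import numpy as np

# Hypothetical sketch of the refactored base class: init_wshape sets
# Weight's shape, init_xyshape covers the case where X and Y differ.
class TestLerpBase:
    def setUp(self):
        self.init_dtype()
        self.init_shape()
        self.init_xyshape()
        self.init_wshape()
        x = np.random.random(self.xshape).astype(self.dtype)
        y = np.random.random(self.yshape).astype(self.dtype)
        w = np.random.random(self.wshape).astype(self.dtype)
        self.inputs = {'X': x, 'Y': y, 'Weight': w}
        self.outputs = {'Out': x + w * (y - x)}

    def init_dtype(self):
        self.dtype = np.float64

    def init_shape(self):
        self.shape = [100]

    def init_xyshape(self):
        # default: X and Y share the same shape
        self.xshape = self.shape
        self.yshape = self.shape

    def init_wshape(self):
        self.wshape = [1]

# A broadcast-w case then only overrides two hooks:
class TestLerpBroadWToXY(TestLerpBase):
    def init_shape(self):
        self.shape = [50, 2, 2, 3]

    def init_wshape(self):
        self.wshape = [3]
```

With this layout, an fp16 variant of any case is one further `init_dtype` override, which is the simplification the reviewer asked for.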
Remember to submit the Chinese documentation change as well, and reference it in the PR description.
@denglianbin please prioritize addressing the issues in this PR.
self.shape = [2, 1, 2, 5, 1, 5]

def init_dtype(self):
    self.dtype = np.float16
The 5 cases above differ only in dimensionality; the kernel they invoke should be the same. x and y have identical shapes, and the element count is even 100 in every case. From a test-coverage standpoint, keeping only the largest of these identical tests is enough. The original fp32 unit tests were written too verbosely; there is no need to mirror the same number of cases.
These all go through the same LerpFunction. The reason there are so many branches is the Eigen implementation: the template parameter must be a compile-time dim value. But they all call the same function, and there is no difference in how the computation is performed.
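The rank-independence claimed above is easy to check in numpy (an illustration only; the real dispatch over Eigen template dims is C++): the element-wise lerp result is identical whatever the tensor rank.

```python
import numpy as np

# Illustration: lerp is purely element-wise, so the result does not
# depend on rank; only Eigen's compile-time dims differ per branch.
x = np.arange(100, dtype=np.float32)
y = np.full(100, 2.0, dtype=np.float32)
w = np.float32(0.25)

flat = x + w * (y - x)                     # rank-1 computation
x3, y3 = x.reshape(2, 5, 10), y.reshape(2, 5, 10)
reshaped = x3 + w * (y3 - x3)              # same data, rank-3 computation
```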
done
self.res_ref = self.x + self.w * (self.y - self.x)
self.place = []
if core.is_compiled_with_cuda():
    self.place.append(paddle.CUDAPlace(0))
What is TestLerpAPIFp16 meant to test? It inherits unittest.TestCase, so it will not run the op.
That class was left in by mistake; it has now been deleted.
LGTM
PR types
Others
PR changes
Others
Describe
Performance data (op benchmark)