[PaddlePaddle Hackathon 3 No.33] Optimize the GPU performance of the erfinv op for Paddle #45057

thunder95 · 2022-08-10T14:31:20Z

PR types

Performance optimization

PR changes

OPs

Describe

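Paddle's current GPU implementation of the erfinv operator is composed from Eigen expressions and has no dedicated GPU kernel, so its performance falls short. Building on Paddle's existing KPS API can deliver a substantial speedup.
Design doc: PaddlePaddle/community#199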

Development environment:

Device: RTX 2070s
Environment: CUDA 10.2, cuDNN 7

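Optimization methods
1. (Scheme A) Following Eigen, first implement the ndtri function in the CUDA operator, then build erfinv on top of it.
2. (Scheme B) Develop directly on the built-in API functions that CUDA provides.

   Built on the elementwise kernel already implemented by the Paddle team, this gives a clear performance improvement.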
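The functor-plus-elementwise-kernel structure behind Scheme B can be sketched on the host in plain C++. This is only a CPU analogue under stated assumptions, not the code in this PR: `ErfinvNewton` and `ElementwiseApply` are hypothetical names, and the Newton iteration on `std::erf` merely stands in for the CUDA built-in that the real device functor would call.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical host stand-in for the device built-in: solves erf(x) = y
// with Newton's method, starting from x = 0.
template <typename T>
T ErfinvNewton(T y) {
  assert(y > T(-1) && y < T(1));  // erfinv is finite only on (-1, 1)
  const T kTwoOverSqrtPi = T(1.1283791670955126);  // erf'(x) = 2/sqrt(pi) * exp(-x^2)
  const T tol = T(4) * std::numeric_limits<T>::epsilon();
  T x = T(0);
  for (int iter = 0; iter < 100; ++iter) {
    const T step = (std::erf(x) - y) / (kTwoOverSqrtPi * std::exp(-x * x));
    x -= step;
    if (std::fabs(step) <= tol * std::max(T(1), std::fabs(x))) break;
  }
  return x;
}

// Element-wise functor: one input value in, one output value out.
// No explicit default constructor is declared; the implicit one suffices.
template <typename T>
struct ErfinvFunctor {
  T operator()(const T x) const { return ErfinvNewton(x); }
};

// Hypothetical CPU analogue of an element-wise kernel launch: applies the
// functor independently to every element.
template <typename T, typename Functor>
std::vector<T> ElementwiseApply(const std::vector<T>& in, Functor f) {
  std::vector<T> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] = f(in[i]);
  return out;
}
```

Calling `ElementwiseApply(values, ErfinvFunctor<double>())` maps every element through the functor. Because each output depends only on its own input, the same functor shape suits a one-thread-per-element GPU launch.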

Forward inference performance of the optimized Paddle compared with Paddle before optimization:

Scheme	Case No.	input_shape	paddle Perf(ms)	old_paddle Perf(ms)	ratio
A	0	[16, 204800]	0.1556	0.1302	0.8368 x
A	1	[10, 20, 30, 40, 5, 6]	8.6268	7.9096	0.9169 x
B	0	[16, 204800]	0.067831	0.1302	1.9195 x
B	1	[10, 20, 30, 40, 5, 6]	2.76477	7.9096	2.8609 x

Forward inference performance of the optimized Paddle compared with PyTorch:

Scheme	Case No.	input_shape	paddle Perf(ms)	pytorch Perf(ms)	ratio
A	0	[16, 204800]	0.1556	0.0832	0.5347 x
A	1	[10, 20, 30, 40, 5, 6]	8.6268	2.7898	0.3234 x
B	0	[16, 204800]	0.067831	0.0832	1.2266 x
B	1	[10, 20, 30, 40, 5, 6]	2.76477	2.7898	1.0091 x
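The ratio column in the PyTorch comparison above appears to be the PyTorch time divided by the Paddle time, so values above 1 mean Paddle is faster. That reading is inferred from the numbers, since the PR does not state the formula. A minimal sketch, with `SpeedupRatio` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Ratio of a baseline time to the optimized time, both in milliseconds.
// A result above 1 means the optimized kernel is faster than the baseline.
double SpeedupRatio(double baseline_ms, double optimized_ms) {
  assert(baseline_ms > 0.0 && optimized_ms > 0.0);
  return baseline_ms / optimized_ms;
}
```

For Scheme B, case 0, `SpeedupRatio(0.0832, 0.067831)` gives roughly 1.2266, matching the table. For Scheme A, case 1, `SpeedupRatio(2.7898, 8.6268)` gives roughly 0.3234.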

Scheme A is more complex to implement and actually lowers performance, so this PR adopts Scheme B.

paddle-bot · 2022-08-10T14:31:27Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

ZzSean · 2022-08-17T03:44:33Z

paddle/phi/kernels/gpu/erfinv_kernel.cu

+namespace phi {
+
+template <typename T>
+struct ErfinvCUDAFunctor {


Just name it ErfinvFunctor.

thunder95 · 2022-08-17

OK, changed.

ZzSean · 2022-08-17T03:48:30Z

paddle/phi/kernels/gpu/erfinv_kernel.cu

+
+template <typename T>
+struct ErfinvCUDAFunctor {
+  HOSTDEVICE inline ErfinvCUDAFunctor() {}


An empty default constructor can be omitted.

thunder95 · 2022-08-17

Thanks for the suggestion; removed.

@ZzSean Could you please take another look? Thanks.

ZzSean

LGTM

erfinv

308fe8a

paddle-bot bot added contributor External developers status: proposed labels Aug 10, 2022

thunder95 mentioned this pull request Aug 10, 2022

[PaddlePaddle Hackathon Phase 3] Task Overview #43938

Closed

luotao1 assigned luotao1 and JamesLim-sy Aug 11, 2022

luotao1 added the PaddlePaddle Hackathon label Aug 11, 2022

luotao1 assigned Ligoml and ZzSean Aug 11, 2022

ZzSean reviewed Aug 17, 2022

thunder95 added 3 commits August 17, 2022 05:45

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

e4ff8ff

… erfinv

fix some tiny issues

56da149

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

fb76623

… erfinv

ZzSean approved these changes Aug 19, 2022

ZzSean merged commit 0e384ad into PaddlePaddle:develop Aug 23, 2022
