
Unify the block/grid strategy and implementation of ReduceLastDim and ReduceAny. #34436

Merged: 1 commit merged into PaddlePaddle:develop from opt_reduce on Aug 2, 2021

Conversation

@ZzSean (Contributor) commented on Jul 28, 2021

PR types

Performance optimization

PR changes

OPs

Describe

Unify the block/grid strategy and implementation of ReduceLastDim and ReduceAny
This fixes the poor performance of the broadcast backward pass when the last dimension being reduced is very small, and also optimizes the implementation of ReduceLastDim, improving performance for some cases. Benchmark results are below, followed by an illustrative sketch of the strategy.

| case               | axis   | PyTorch  | before opt. | after opt. | after opt. vs PyTorch | speedup (before/after) |
|--------------------|--------|----------|-------------|------------|-----------------------|------------------------|
| [4, 2048, 64, 128] | [2, 3] | 305.15us | 328.35us    | 303.54us   | on par (0.53%)        | 1.08                   |
| [16, 2048, 7, 7]   | [2, 3] | 21.738us | 49.712us    | 16.133us   | faster (34.74%)       | 3.08                   |
| [512, 896, 4, 12]  | [3]    | 214.61us | 2.5947ms    | 166.01us   | faster (29.28%)       | 15.63                  |
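As context for the unified strategy, here is a minimal, hypothetical CUDA sketch (not the PR's actual kernel; the name `ReduceLastDimSum` and the fixed sum reducer are illustrative) of the shared idea: a 2-D thread block in which each threadIdx.y row owns one output element while the threads along x cooperatively reduce the last dimension, so a small last dimension still keeps many output rows in flight:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Butterfly sum within one warp using shuffles.
__device__ float WarpReduceSum(float val) {
  for (int offset = 16; offset > 0; offset >>= 1) {
    val += __shfl_xor_sync(0xffffffffu, val, offset);
  }
  return val;
}

// Each threadIdx.y row of the block reduces one row of x along the last dim.
__global__ void ReduceLastDimSum(const float* x, float* y, int num_rows,
                                 int reduce_len) {
  int row = blockIdx.x * blockDim.y + threadIdx.y;
  if (row >= num_rows) return;

  // Thread-local partials: stride over the (possibly very small) last dim.
  float sum = 0.f;
  for (int i = threadIdx.x; i < reduce_len; i += blockDim.x) {
    sum += x[row * reduce_len + i];
  }

  // blockDim.x is exactly one warp here, so a single warp reduce finishes
  // the row; the real kernel generalizes this step with BlockXReduce.
  sum = WarpReduceSum(sum);
  if (threadIdx.x == 0) y[row] = sum;
}

int main() {
  const int num_rows = 1024, reduce_len = 7;  // small last dim, as in the PR
  std::vector<float> h_x(num_rows * reduce_len, 1.f);
  float *d_x = nullptr, *d_y = nullptr;
  cudaMalloc(&d_x, h_x.size() * sizeof(float));
  cudaMalloc(&d_y, num_rows * sizeof(float));
  cudaMemcpy(d_x, h_x.data(), h_x.size() * sizeof(float),
             cudaMemcpyHostToDevice);

  dim3 block(32, 8);  // one warp per row along x, 8 rows per block along y
  dim3 grid((num_rows + block.y - 1) / block.y);
  ReduceLastDimSum<<<grid, block>>>(d_x, d_y, num_rows, reduce_len);

  std::vector<float> h_y(num_rows);
  cudaMemcpy(h_y.data(), d_y, num_rows * sizeof(float),
             cudaMemcpyDeviceToHost);
  printf("y[0] = %.1f (expected %d)\n", h_y[0], reduce_len);
  cudaFree(d_x);
  cudaFree(d_y);
  return 0;
}
```

The payoff on small last dimensions comes from blockDim.y: instead of one output element per block, each block finishes blockDim.y rows at once, which matches cases like [16, 2048, 7, 7] above.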

@paddle-bot-old (bot) commented on Jul 28, 2021

✅ This PR's description meets the template requirements!
Please wait for other CI results.

@paddle-bot-old (bot) commented

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@AnnaTrainingG (Contributor) commented

LGTM

@Xreki changed the title from "Unify the block/grid strategy and implementation of ReduceLastDim and…" to "Unify the block/grid strategy and implementation of ReduceLastDim and ReduceAny." on Aug 2, 2021
@Xreki (Contributor) left a comment

LGTM

```diff
@@ -524,18 +531,20 @@ static __device__ T WarpReduce(T val, ReduceOp reducer) {
 template <typename T, typename ReduceOp>
 static __device__ T BlockXReduce(T val, ReduceOp reducer) {
   using detail::kWarpSize;
-  __shared__ T shared[kWarpSize];
+  __shared__ T shared[2 * kWarpSize];
```
@Xreki (Contributor) commented on Aug 2, 2021
Why 2 * kWarpSize? As a follow-up, we could consider optimizing this function further so that it becomes a basic building block that also fits the use cases of operators such as softmax and batch_norm.
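The thread does not answer the question, but one plausible reading (an assumption, not confirmed by the PR) is that the unified strategy allows 2-D blocks, and each threadIdx.y row needs its own kWarpSize-slot scratch region for per-warp partials; with blockDim.y up to 2, that requires 2 * kWarpSize slots. A minimal sketch of a block-wide x-reduction under that assumption (`BlockXReduceSketch` is hypothetical, not Paddle's implementation; blockDim.x is assumed to be a power-of-two multiple of the warp size):

```cuda
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;

// Hypothetical sketch: reduce `val` across the x dimension of a 2-D block
// (blockDim.y <= 2). Each y-row reserves its own kWarpSize-slot region of
// the scratch array, which is why it is sized 2 * kWarpSize.
template <typename T, typename ReduceOp>
__device__ T BlockXReduceSketch(T val, ReduceOp reducer) {
  __shared__ T shared[2 * kWarpSize];  // one kWarpSize region per y-row

  int lane = threadIdx.x % kWarpSize;         // lane within the warp
  int warp_in_row = threadIdx.x / kWarpSize;  // warp index along x
  int num_warps = blockDim.x / kWarpSize;     // assumed a power of two, >= 1

  // Stage 1: reduce within each warp via shuffles.
  for (int offset = kWarpSize / 2; offset > 0; offset >>= 1) {
    val = reducer(val, __shfl_xor_sync(0xffffffffu, val, offset));
  }

  // Lane 0 of each warp publishes its partial into this row's region.
  if (lane == 0) shared[threadIdx.y * kWarpSize + warp_in_row] = val;
  __syncthreads();

  // Stage 2: the first warp of each row folds the per-warp partials.
  // Lanes >= num_warps hold garbage but never contaminate lane 0, because
  // every shuffle offset used below is smaller than num_warps.
  if (warp_in_row == 0) {
    val = (lane < num_warps) ? shared[threadIdx.y * kWarpSize + lane] : T();
    for (int offset = num_warps / 2; offset > 0; offset >>= 1) {
      val = reducer(val, __shfl_down_sync(0xffffffffu, val, offset));
    }
  }
  return val;  // lane 0 of each row's first warp holds the row result
}
```

Under this indexing scheme, allowing blockDim.y to grow past 2 would require blockDim.y * kWarpSize scratch slots, which may be what the reviewer's question about generalizing the function is probing at.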

@Xreki merged commit c7cc5ac into PaddlePaddle:develop on Aug 2, 2021
@ZzSean deleted the opt_reduce branch on September 3, 2021 02:15