
fix FlashAttnOpInferSymbolicShape and FlashAttnInferMeta #63816

Merged 5 commits into PaddlePaddle:develop on Apr 26, 2024

Conversation

Hongqing-work (Contributor)

PR Category: Others
PR Types: Bug fixes

Description

Pcard-67164
This PR fixes FlashAttnOpInferSymbolicShape and FlashAttnInferMeta by adding shape inference for the softmax, softmax_lse, and seed_offset outputs.
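For context, a minimal sketch of the kind of shape computation this involves, assuming the CUDA flash-attention convention of rounding seqlen_q up to a multiple of 128 and a query layout of [batch_size, seqlen_q, num_heads, head_dim]; the helper names and exact output shapes below are illustrative, not taken verbatim from this PR:

#include <cstdint>
#include <vector>

// Round x up to the nearest multiple of 128 (CUDA kernel convention).
inline int64_t RoundUp128(int64_t x) { return (x + 127) / 128 * 128; }

// Plausible shape of softmax_lse: [batch_size, num_heads, seqlen_q_rounded],
// assuming q_dims is [batch_size, seqlen_q, num_heads, head_dim].
inline std::vector<int64_t> SoftmaxLseShape(const std::vector<int64_t>& q_dims) {
  return {q_dims[0], q_dims[2], RoundUp128(q_dims[1])};
}

// seed_offset holds the RNG seed and offset, so its shape is just [2].
inline std::vector<int64_t> SeedOffsetShape() { return {2}; }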

paddle-bot bot commented Apr 24, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Comment on lines 388 to 390
  auto batch_size = q.dims()[0];
  auto num_heads = q.dims()[2];
  auto seqlen_q_rounded = round_multiple(q.dims()[1]);
Contributor

For a simple int like this, it's better to use the concrete data type. auto is not explicit enough here, which raises the reading cost and makes mistakes more likely.

Contributor Author

Done
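The revised lines presumably look something like this (a sketch only; the commit message below says just "use int for simple value", and in Paddle q.dims() indexes to int64_t):

// Post-review sketch: concrete integer types instead of auto.
const int64_t batch_size = q.dims()[0];
const int64_t num_heads = q.dims()[2];
const int64_t seqlen_q_rounded = round_multiple(q.dims()[1]);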

zyfncg previously approved these changes Apr 25, 2024
@@ -287,6 +290,35 @@ bool FlashAttnOpInferSymbolicShape(

  shape_analysis->SetShapeOrDataForValue(
      op->result(0), symbol::TensorShapeOrDataDimExprs(out_shape));

  auto round_multiple = [](symbol::DimExpr x) {
    auto m = symbol::DimExpr{128};
Collaborator

Why was 128 chosen here? Is it the 128 hard-coded in the kernel?

Contributor Author

The CUDA kernel rounds to multiples of 128, but the XPU kernel does not. For now this stays consistent with the CUDA kernel, and a comment has been added.

    auto m_minus_one = symbol::DimExpr{127};
    return (x + m_minus_one) / m * m;
  };
  auto batch_size_expr = q.shape()[0];
Collaborator

Could you add a check on the size of q.shape to guard against malformed inputs? As written, this would core dump directly.

Contributor Author

Done
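Taken together, the two review threads on this hunk amount to a round-up helper plus a defensive rank check. A standalone sketch with plain integers (the real code operates on symbol::DimExpr, and the assert stands in for whatever enforcement macro Paddle actually uses):

#include <cassert>
#include <cstdint>
#include <iostream>
#include <vector>

// Round x up to the nearest multiple of m with integer arithmetic;
// (x + m - 1) / m * m is the same pattern as the lambda in the diff.
int64_t RoundMultiple(int64_t x, int64_t m = 128) {
  return (x + m - 1) / m * m;
}

int main() {
  // The rank check requested in review: q must be 4-D
  // ([batch_size, seqlen_q, num_heads, head_dim]) before dims 0-2 are read.
  std::vector<int64_t> q_dims = {8, 1000, 16, 64};
  assert(q_dims.size() == 4 && "flash_attn expects a 4-D query tensor");

  // Worked example: seqlen_q = 1000 rounds up to (1000 + 127) / 128 * 128 = 1024.
  std::cout << RoundMultiple(q_dims[1]) << std::endl;
  return 0;
}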

phlrain self-requested a review April 26, 2024 03:25
zyfncg merged commit 52db8e4 into PaddlePaddle:develop Apr 26, 2024
28 of 30 checks passed
runzhech pushed a commit to runzhech/Paddle that referenced this pull request Apr 30, 2024
…e#63816)

* fix FlashAttnOpInferSymbolicShape and FlashAttnInferMeta

* use int for simple value

* add check and constraint

* fix

* add shape constraint for attention_mask
Hongqing-work deleted the fix-flash-attn-infer-shape branch May 10, 2024 06:22
co63oc pushed a commit to co63oc/Paddle that referenced this pull request May 10, 2024
…e#63816)

* fix FlashAttnOpInferSymbolicShape and FlashAttnInferMeta

* use int for simple value

* add check and constraint

* fix

* add shape constraint for attention_mask