fix FlashAttnOpInferSymbolicShape and FlashAttnInferMeta #63816
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
paddle/phi/infermeta/ternary.cc
Outdated
auto batch_size = q.dims()[0];
auto num_heads = q.dims()[2];
auto seqlen_q_rounded = round_multiple(q.dims()[1]);
For these simple int values, it is better to use the concrete data type. `auto` is not intuitive here; it raises the cost of reading the code and increases the chance of mistakes.
Done
@@ -287,6 +290,35 @@ bool FlashAttnOpInferSymbolicShape(

  shape_analysis->SetShapeOrDataForValue(
      op->result(0), symbol::TensorShapeOrDataDimExprs(out_shape));

  auto round_multiple = [](symbol::DimExpr x) {
    auto m = symbol::DimExpr{128};
Why was 128 chosen here? Is it the 128 hard-coded in the kernel?
The CUDA kernel uses rounding based on 128, but the XPU kernel does not. For now this stays consistent with the CUDA kernel, and a comment has been added.
    auto m_minus_one = symbol::DimExpr{127};
    return (x + m_minus_one) / m * m;
  };
  auto batch_size_expr = q.shape()[0];
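The lambda above applies the standard integer round-up idiom `(x + m - 1) / m * m` to symbolic dimension expressions. A minimal plain-integer sketch of the same arithmetic (function name is illustrative, not from the PR):

```cpp
#include <cassert>

// Round x up to the nearest multiple of m using truncating integer
// division, mirroring the (x + m_minus_one) / m * m pattern used for
// seqlen_q_rounded in the symbolic-shape code.
long RoundMultiple(long x, long m) {
  return (x + m - 1) / m * m;
}
```

With m = 128 this maps 1..128 to 128, 129..256 to 256, and so on, matching the CUDA kernel's 128-based rounding.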
Could you add a check on the size of q.shape to guard against malformed input? As written, a bad input would core dump here directly.
Done
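The check the reviewer asked for amounts to validating the tensor's rank before indexing into its shape. A hedged standalone sketch (the function name and the assumed 4-D layout `[batch, seqlen, num_heads, head_dim]` are illustrative, not taken from the PR's actual diff):

```cpp
#include <cstddef>
#include <vector>

// Return true only if q has the 4-D rank FlashAttention expects,
// so that q.shape()[0..3] accesses are known to be in range.
// Assumed layout: [batch_size, seqlen_q, num_heads, head_dim].
bool HasExpectedFlashAttnRank(const std::vector<long>& q_shape) {
  constexpr std::size_t kExpectedRank = 4;
  return q_shape.size() == kExpectedRank;
}
```

In Paddle itself such a precondition would typically be enforced with an error-raising check rather than a boolean, so that the op fails with a readable message instead of an out-of-range access.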
…e#63816) * fix FlashAttnOpInferSymbolicShape and FlashAttnInferMeta * use int for simple value * add check and constraint * fix * add shape constraint for attention_mask
PR Category
Others
PR Types
Bug fixes
Description
Pcard-67164
This PR fixes FlashAttnOpInferSymbolicShape and FlashAttnInferMeta by adding shape inference for softmax, softmax_lse, and seed_offset.