Add support bias is none for fused_attention op. #37411
Conversation
Thanks for your contribution!
Please update the description with the specific Op definitions that were modified, and state whether each Op is used in model inference.
For example:
The modified Op definitions are as follows:
1. xx OP: change xx attr to ..., which affects inference
2. xx Op: add xxx attr, which does not affect inference
Overall conclusion: the modified Ops [will/will not] affect inference
Impact of these changes on inference:
The overall impact on inference from the fused ops added on the training side will be handled separately in a follow-up defusion PR.
LGTM
Recently, due to the use of fusion ops in dygraph programs, the jit module exposes fusion ops directly in the inference model, which makes the operator granularity of the inference model representation inconsistent.
This requires new functionality such as defusion in the jit module, and other members will continue working on this.
Considering that this PR is just a bugfix, it is approved.
Add support for bias is None for the fused_attention op.
PR types
Bug fixes
PR changes
OPs
Describe
Fix bugs in the fused_attention op.
1. Add support for bias being None, along with a corresponding unit test.
2. Add an assert on the input shape limitations to guard against API misuse: `num_head * head_dim` should equal `embed_dim`; a reference sketch of both behaviors is shown below.
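To make the two changes concrete, here is a minimal NumPy sketch of the attention core that a fused attention op computes: the QKV bias is optional (may be None), and the input-shape constraint is asserted up front. This is only an illustrative reference, not the Paddle kernel or its Python API; the function name and argument layout are hypothetical.

```python
import numpy as np

def reference_fused_attention(x, qkv_weight, qkv_bias=None, num_heads=4):
    """Reference (unfused) attention core with an optional fused QKV bias."""
    batch, seq_len, embed_dim = x.shape
    # Shape limitation this PR asserts: num_head * head_dim should equal embed_dim.
    assert embed_dim % num_heads == 0, "num_head * head_dim should equal embed_dim"
    head_dim = embed_dim // num_heads

    # Fused QKV projection: qkv_weight has shape [3 * embed_dim, embed_dim].
    qkv = x @ qkv_weight.T
    if qkv_bias is not None:  # bias may legally be None after this PR
        qkv = qkv + qkv_bias
    q, k, v = np.split(qkv, 3, axis=-1)

    def split_heads(t):
        return t.reshape(batch, seq_len, num_heads, head_dim).transpose(0, 2, 1, 3)

    q, k, v = map(split_heads, (q, k, v))
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim)
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    out = probs @ v  # [batch, num_heads, seq_len, head_dim]
    return out.transpose(0, 2, 1, 3).reshape(batch, seq_len, embed_dim)

# Both calls succeed; the second exercises the bias-is-None path.
x = np.random.rand(2, 4, 16).astype("float32")
w = np.random.rand(3 * 16, 16).astype("float32")
b = np.random.rand(3 * 16).astype("float32")
reference_fused_attention(x, w, qkv_bias=b, num_heads=4)
reference_fused_attention(x, w, qkv_bias=None, num_heads=4)
```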
Results of related unit tests: