PR Category
Operator Mechanism
PR Types
New features
Description
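Adds the operator paddle.incubate.nn.functional.fused_partial_rope(x, cos, sin).
This is the "partial RoPE" operator used by DeepSeek-V3: for an input x of shape [bs, seq_len, num_heads, head_dim], head_dim is first split into nope_head_dim + pe_head_dim, RoPE is applied only to the pe_head_dim part, and the two parts are then concatenated back together.
Note that the operator takes a single input x, so q and k have to be processed separately. This mirrors the implementation seen in the DSV3 timeline: q is much larger than k there, so handling them separately is easier to implement.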
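For reviewers who want a reference to compare against, the split / rotate / concatenate steps described above can be sketched in plain NumPy. This is only an illustrative sketch and not the fused kernel: the helper names (partial_rope_reference, rotate_half), the "rotate half" RoPE convention, the placement of the pe part at the end of head_dim, and the cos/sin shape [1, seq_len, 1, pe_head_dim] are all assumptions that are not confirmed by this PR.

```python
import numpy as np


def rotate_half(t):
    # Assumed "rotate half" convention: (a, b) -> (-b, a) on the two halves.
    half = t.shape[-1] // 2
    return np.concatenate([-t[..., half:], t[..., :half]], axis=-1)


def partial_rope_reference(x, cos, sin):
    # Hypothetical reference helper, NOT the fused operator.
    # x:        [bs, seq_len, num_heads, nope_head_dim + pe_head_dim]
    # cos, sin: [1, seq_len, 1, pe_head_dim] (assumed layout)
    pe_head_dim = cos.shape[-1]
    nope_head_dim = x.shape[-1] - pe_head_dim
    x_nope = x[..., :nope_head_dim]   # left untouched
    x_pe = x[..., nope_head_dim:]     # only this slice is rotated
    x_pe = x_pe * cos + rotate_half(x_pe) * sin
    return np.concatenate([x_nope, x_pe], axis=-1)


# Illustrative sizes only; q has more heads than k, so each is processed separately.
bs, seq_len, nope_dim, pe_dim = 2, 4, 8, 4
rng = np.random.default_rng(0)
q = rng.standard_normal((bs, seq_len, 3, nope_dim + pe_dim))
k = rng.standard_normal((bs, seq_len, 1, nope_dim + pe_dim))

# Standard RoPE angles with an assumed base of 10000.
pos = np.arange(seq_len)[:, None]
inv_freq = 1.0 / (10000.0 ** (np.arange(0, pe_dim, 2) / pe_dim))
angles = np.concatenate([pos * inv_freq, pos * inv_freq], axis=-1)
cos = np.cos(angles)[None, :, None, :]
sin = np.sin(angles)[None, :, None, :]

q_out = partial_rope_reference(q, cos, sin)
k_out = partial_rope_reference(k, cos, sin)
```

Because the function takes a single tensor, q and k go through two separate calls, which matches the single-input design noted above.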
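The corresponding PaddleNLP change has been merged: PaddlePaddle/PaddleNLP#10942
(It is implemented as a branch: older Paddle versions still run the old branch, while newer Paddle versions automatically use the fused operator.)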
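One way such a version-dependent branch could be written is a simple feature check at import time. The sketch below is an assumption about the general pattern, not the code from PaddlePaddle/PaddleNLP#10942; resolve_fused_partial_rope and use_fused are hypothetical names.

```python
import importlib


def resolve_fused_partial_rope():
    # Hypothetical helper: return the fused operator if this Paddle build
    # provides it, otherwise None so the caller can take the old branch.
    try:
        functional = importlib.import_module("paddle.incubate.nn.functional")
    except ImportError:
        return None
    return getattr(functional, "fused_partial_rope", None)


fused_partial_rope = resolve_fused_partial_rope()
# True on Paddle builds that ship the operator, False on older ones.
use_fused = fused_partial_rope is not None
```

Calling code can then branch on use_fused and keep the unfused path for older Paddle versions.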
Pcard-85711