RoPE implementation details #47
Labels
bug
Comments
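Take a look at how qw2 is implemented: the negative terms and the positive terms are kept separate. It differs from formula (13) in the blog only in ordering; the overall result is the same.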
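To make the two layouts concrete, here is a minimal pure-Python sketch (no PyTorch required). It compares an interleaved rotation, which pairs dimensions (0, 1), (2, 3), ..., with a half-split rotation that keeps the negated terms and the positive terms in separate halves. The function names, the base of 10000, and the sample vectors are illustrative assumptions, not code from this repository. It suggests the two layouts give the same attention score only when the inputs and the cos/sin layout are reordered consistently.

```python
import math

def angles(pos, dim, base=10000.0):
    # Assumed standard sinusoidal frequencies: pos * base^(-2i/dim).
    return [pos * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def apply_rotation(x, x_rot, ang):
    # Elementwise x * cos + x_rot * sin.
    return [a * math.cos(t) + b * math.sin(t) for a, b, t in zip(x, x_rot, ang)]

def rope_interleaved(x, pos):
    # Rotated copy [-x1, x0, -x3, x2, ...]; angles laid out as [t0, t0, t1, t1, ...].
    x_rot = []
    for even, odd in zip(x[0::2], x[1::2]):
        x_rot += [-odd, even]
    ang = [t for t in angles(pos, len(x)) for _ in range(2)]
    return apply_rotation(x, x_rot, ang)

def rope_half_split(x, pos):
    # Rotated copy [-second half, first half]; angles laid out as [t0, t1, ..., t0, t1, ...].
    half = len(x) // 2
    x_rot = [-v for v in x[half:]] + x[:half]
    ang = angles(pos, len(x)) * 2
    return apply_rotation(x, x_rot, ang)

def to_half_split(x):
    # Reorder [x0, x1, x2, x3, ...] into [x0, x2, ..., x1, x3, ...].
    return x[0::2] + x[1::2]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

q = [0.3, -1.2, 0.7, 2.0]   # arbitrary sample query vector
k = [1.1, 0.4, -0.5, 0.9]   # arbitrary sample key vector
m, n = 3, 1                 # token positions of q and k

score_interleaved = dot(rope_interleaved(q, m), rope_interleaved(k, n))
score_half_split = dot(rope_half_split(to_half_split(q), m),
                       rope_half_split(to_half_split(k), n))
print(abs(score_interleaved - score_half_split) < 1e-9)
```

The key point in this sketch is `to_half_split`: the half-split rotation reproduces the interleaved score only because q and k are reordered the same way as the cos/sin layout.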
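    # RoPE encoding
    if self.RoPE:
        pos = SinusoidalPositionEmbedding(self.head_size, 'zero')(inputs)
        cos_pos = pos[..., 1::2].repeat(1, 1, 2)
        sin_pos = pos[..., ::2].repeat(1, 1, 2)
        qw2 = torch.stack([-qw[..., 1::2], qw[..., ::2]], 3)
        qw2 = torch.reshape(qw2, qw.shape)
        qw = qw * cos_pos + qw2 * sin_pos

After the reshape, doesn't qw2 end up alternating one negative term and one positive term? If the negative and positive terms were kept separate instead, wouldn't the qw on the left-hand side of formula (13) also need its odd and even components separated, so that every position lines up and the overall inner product is unchanged?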
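The alignment concern can be checked numerically. The sketch below is a pure-Python stand-in for the tensor code, under stated assumptions: plain lists replace tensors, `rotate_pairs` mimics stacking on a new last axis and then reshaping, the `"tile"` layout mimics `repeat(1, 1, 2)`, and the base of 10000 and the sample vectors are arbitrary. All names are hypothetical. It tests whether the score between a query at position m and a key at position n depends only on m - n, which is the property RoPE is meant to provide.

```python
import math

def sinusoid_angles(pos, dim, base=10000.0):
    # Assumed standard sinusoidal frequencies: pos * base^(-2i/dim).
    return [pos * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def rotate_pairs(x):
    # Interleaved result [-x1, x0, -x3, x2, ...], as produced by stacking
    # [-odd, even] on a new last axis and flattening it back.
    out = []
    for even, odd in zip(x[0::2], x[1::2]):
        out += [-odd, even]
    return out

def rope(x, pos, layout):
    # layout="interleave": angles as [t0, t0, t1, t1, ...]
    # layout="tile":       angles as [t0, t1, ..., t0, t1, ...]
    ang = sinusoid_angles(pos, len(x))
    if layout == "interleave":
        ang = [t for t in ang for _ in range(2)]
    else:
        ang = ang + ang
    x_rot = rotate_pairs(x)
    return [a * math.cos(t) + b * math.sin(t) for a, b, t in zip(x, x_rot, ang)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

q = [0.3, -1.2, 0.7, 2.0]   # arbitrary sample query vector
k = [1.1, 0.4, -0.5, 0.9]   # arbitrary sample key vector

def score(m, n, layout):
    return dot(rope(q, m, layout), rope(k, n, layout))

# Both position pairs have the same offset m - n = 2.
gap_interleave = abs(score(3, 1, "interleave") - score(7, 5, "interleave"))
gap_tile = abs(score(3, 1, "tile") - score(7, 5, "tile"))
print(gap_interleave, gap_tile)
```

With these sample values the interleaved layout gives matching scores for both position pairs, while the tiled layout does not, which is consistent with the concern raised in the question. In PyTorch, `Tensor.repeat_interleave(2, dim=-1)` produces the interleaved layout; whether that is the fix the maintainers adopted is not stated in this thread.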
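After some preliminary testing, the problem you describe does exist. Thank you for pointing it out; we will make the change once we have run a full round of tests.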
We will fix this bug in the next release and thank you again in the commit.
Hello, is there a problem with your RoPE implementation? According to Su Jianlin's blog, it should be the modified code above, right?