I am using YaRN while implementing DeepSeek V2 models, and the current YaRN implementation does not look right to me.
@cebtenzzre Could you take a look at this? Correct me if I am wrong.
The scaling `theta_base *= freq_scale` is done again later in `rope_yarn`: https://github.com/ggerganov/ggml/blob/0cbb7c0e053f5419cfbebb46fbf4d4ed60182cf5/src/ggml.c#L14077

In a basic case (`ext_factor` is 0), the `theta` used for `cos/sin` is scaled by `freq_scale * freq_scale`. I think this is wrong and this line should be deleted.

The value passed to `int64_t i0` is wrong (the data type does not match, either): https://github.com/ggerganov/ggml/blob/0cbb7c0e053f5419cfbebb46fbf4d4ed60182cf5/src/ggml.c#L14082-L14088

I think it should be `ic` here.

(Confirmed when implementing DeepSeek V2 models)
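
To make the double scaling concrete, here is a minimal sketch in C. It is not the ggml code: `yarn_theta_basic`, `theta_with_prescale`, and `theta_without_prescale` are hypothetical names, and the helper only models the basic case (`ext_factor` is 0) under the assumption that `rope_yarn` itself multiplies the incoming angle by `freq_scale`.

```c
/* Hypothetical stand-in for rope_yarn in the basic case (ext_factor == 0).
 * Assumption: the helper itself applies freq_scale to the angle it receives. */
static float yarn_theta_basic(float theta_extrap, float freq_scale) {
    return freq_scale * theta_extrap;
}

/* Caller that pre-scales theta_base before calling the helper.
 * freq_scale ends up applied twice: once here, once inside the helper. */
static float theta_with_prescale(float pos, float freq_scale) {
    float theta_base = pos;
    theta_base *= freq_scale; /* the line the issue proposes to delete */
    return yarn_theta_basic(theta_base, freq_scale);
}

/* Caller without the pre-scaling: freq_scale is applied exactly once. */
static float theta_without_prescale(float pos, float freq_scale) {
    float theta_base = pos;
    return yarn_theta_basic(theta_base, freq_scale);
}
```

With `pos = 8` and `freq_scale = 0.25`, the pre-scaled caller yields `8 * 0.25 * 0.25 = 0.5`, while the caller without the extra line yields `8 * 0.25 = 2.0`, which is the single scaling one would expect.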