Rotary Embeddings in Encoder #52
Unanswered
conceptofmind asked this question in Q&A
Replies: 0 comments
Hi @Sengxian,
How did you handle length extrapolation for rotary embeddings, and for relative positional encoding more generally, in the encoder? My understanding is that rotary embeddings only work properly in the decoder, since they do not generalize well to sequence lengths beyond those seen in training.
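
For reference, here is a minimal NumPy sketch of rotary embeddings. It is not taken from this repository: the names `rotary_embed` and `score` and the base of 10000 are illustrative assumptions. It only shows that the attention score depends on the relative offset between query and key positions, which is the sense in which rotary embeddings act as a relative positional encoding. It says nothing about how well a trained model extrapolates to longer sequences.

```python
import numpy as np

def rotary_embed(x, positions, base=10000.0):
    """Rotate each (even, odd) channel pair of x by a position-dependent angle.

    x: array of shape (seq_len, dim) with dim even.
    positions: array of shape (seq_len,) holding token positions.
    """
    dim = x.shape[-1]
    # One rotation frequency per channel pair (hypothetical default base).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)      # (dim/2,)
    angles = positions[:, None] * inv_freq[None, :]       # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 8))  # a single query vector
k = rng.standard_normal((1, 8))  # a single key vector

def score(q_pos, k_pos):
    """Dot-product attention score with q at q_pos and k at k_pos."""
    q_rot = rotary_embed(q, np.array([q_pos], dtype=np.float64))
    k_rot = rotary_embed(k, np.array([k_pos], dtype=np.float64))
    return float(q_rot[0] @ k_rot[0])

# The same offset (3) at very different absolute positions gives the same score.
same_offset = bool(np.isclose(score(5, 2), score(105, 102)))
print(same_offset)
```

Nothing in the computation distinguishes a causal decoder from a bidirectional encoder: the offset may be positive or negative. Whether the learned attention patterns hold up at unseen offsets is a separate, empirical question.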
Thank you,
Enrico