Replies: 1 comment
-
Hi, do you know how the attention_mask is passed to the middle pipeline-parallel (PP) stage now? I am facing this problem.
-
I know the hidden_states are the output of the previous stage, but I don't understand how the attention_mask is passed to the next transformer block.
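A minimal sketch of one common pattern (used, for example, in Megatron-LM-style pipeline parallelism): only hidden_states cross the pipeline boundary, while each stage rebuilds the attention_mask locally from the input batch, which is replicated on every rank. The function and stage names here are hypothetical, for illustration only, and numpy stands in for the actual send/recv between ranks:

```python
import numpy as np

def build_attention_mask(input_ids, pad_id=0):
    # Hypothetical helper: each pipeline stage can rebuild the mask
    # locally from the (replicated) input batch instead of receiving
    # it over the wire from the previous stage.
    return (input_ids != pad_id).astype(np.float32)

def stage0_forward(input_ids, embed):
    # First stage: embed tokens; the result is the hidden_states
    # tensor that gets sent to the next stage.
    return embed[input_ids]

def stage1_forward(hidden_states, input_ids):
    # Middle stage: hidden_states arrive from the previous stage,
    # but the attention_mask is recomputed here, not transmitted.
    mask = build_attention_mask(input_ids)
    # Zeroing padded positions is a stand-in for masked attention.
    return hidden_states * mask[..., None]

# Toy batch: two sequences of length 3; token id 0 is padding.
input_ids = np.array([[5, 7, 0], [3, 2, 9]])
embed = np.arange(40, dtype=np.float32).reshape(10, 4) + 1.0

h = stage0_forward(input_ids, embed)   # "sent" to the next stage
out = stage1_forward(h, input_ids)     # mask rebuilt locally
```

Under this assumption, the middle stage never needs the mask in its receive buffer; it only needs the same batch metadata (here, input_ids) that the first stage saw.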