I am implementing attention in the following manner, but I get sparse attention weights: only 1-2 elements in the activation vector are non-zero. Is anyone facing a similar issue, or can someone help me figure out the problem in my approach? Thanks!
PS - I am reducing the dimensionality of the global vector g to match L (384 or 256).
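Since the original snippet isn't shown here, below is a minimal sketch of what I understand the setup to be: a global vector g projected down to the dimensionality of the local features L, with attention weights from a softmax over dot-product scores. All shapes, layer choices, and the use of PyTorch are my assumptions, not the actual implementation. One common cause of near-one-hot attention weights in this kind of setup is unscaled dot-product logits, whose magnitude grows with the feature dimension and saturates the softmax; scaling by sqrt(d) usually keeps the distribution smooth.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes for illustration: L holds local features
# (batch, regions, d); g is a global vector (batch, d_g) that gets
# projected down to d so the two can be compared.
batch, regions, d, d_g = 2, 17, 384, 1024

L = torch.randn(batch, regions, d)
g = torch.randn(batch, d_g)

# Random projection standing in for whatever layer reduces g to d dims.
W = torch.randn(d_g, d) / d_g ** 0.5
g_proj = g @ W                                          # (batch, d)

# Attention logits: dot product of projected g with every local feature.
logits = torch.bmm(L, g_proj.unsqueeze(2)).squeeze(2)   # (batch, regions)

# Unscaled logits have a standard deviation on the order of sqrt(d),
# so the softmax saturates and the weights come out nearly one-hot.
w_unscaled = F.softmax(logits, dim=1)

# Dividing by sqrt(d) (as in scaled dot-product attention) keeps the
# logits in a range where the softmax stays smooth.
w_scaled = F.softmax(logits / d ** 0.5, dim=1)

print("largest weight per example, unscaled:", w_unscaled.max(dim=1).values)
print("largest weight per example, scaled:  ", w_scaled.max(dim=1).values)
```

If your scores are computed differently (e.g. additive attention, or a tanh/ReLU before the softmax), the same saturation effect can still appear whenever the pre-softmax values are large in magnitude.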
@raghavsi @vineetm