I am implementing attention in the following manner, but I get sparse attention weights: only 1-2 elements in the activation vector are non-zero. Is anyone facing a similar issue, or can someone help me figure out the problem in my approach? Thanks!
PS - I am reducing the dimensionality of the global vector g to match L (384 or 256).
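Since the original snippet isn't shown here, below is a minimal sketch of what I understand the setup to be: a global vector g projected down to the dimensionality of the local features L, with attention weights from a softmax over dot-product scores. All shapes, layer choices, and the use of PyTorch are my assumptions, not the actual implementation. One common cause of near-one-hot attention weights in this kind of setup is unscaled dot-product logits, whose magnitude grows with the feature dimension and saturates the softmax; scaling by sqrt(d) usually keeps the distribution smooth.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes for illustration: L holds local features
# (batch, regions, d); g is a global vector (batch, d_g) that gets
# projected down to d so the two can be compared.
batch, regions, d, d_g = 2, 17, 384, 1024

L = torch.randn(batch, regions, d)
g = torch.randn(batch, d_g)

# Random projection standing in for whatever layer reduces g to d dims.
W = torch.randn(d_g, d) / d_g ** 0.5
g_proj = g @ W                                          # (batch, d)

# Attention logits: dot product of projected g with every local feature.
logits = torch.bmm(L, g_proj.unsqueeze(2)).squeeze(2)   # (batch, regions)

# Unscaled logits have a standard deviation on the order of sqrt(d),
# so the softmax saturates and the weights come out nearly one-hot.
w_unscaled = F.softmax(logits, dim=1)

# Dividing by sqrt(d) (as in scaled dot-product attention) keeps the
# logits in a range where the softmax stays smooth.
w_scaled = F.softmax(logits / d ** 0.5, dim=1)

print("largest weight per example, unscaled:", w_unscaled.max(dim=1).values)
print("largest weight per example, scaled:  ", w_scaled.max(dim=1).values)
```

If your scores are computed differently (e.g. additive attention, or a tanh/ReLU before the softmax), the same saturation effect can still appear whenever the pre-softmax values are large in magnitude.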
@raghavsi @vineetm