Penalty Term Frobenius Norm Squared #4
In the code below, the Frobenius norm of the matrix is calculated as `ret` and averaged over the batch dimension. However, in the original paper the norm is squared as the penalty term. Is that intended? Or does it not matter too much, I wonder. Thanks!
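Judging from the replies below, the penalty code in question was roughly of the following shape — a reconstruction with illustrative names (`A`, `identity`), not the verbatim snippet, and it includes the axis issue discussed in the comments:

```python
import theano.tensor as T

A = T.tensor3('A')  # (batch, r, n) annotation matrix from the attention layer
# AA^T for every sample in the batch: shape (batch, r, r)
AAT = T.batched_dot(A, A.dimshuffle(0, 2, 1))
identity = T.eye(AAT.shape[1])  # broadcast against the batch axis

# Frobenius norm (note: not squared). The inner sum over axis 1 already
# reduces the result to 2-D, so the outer sum over axis 2 is out of range
# -- this is the dimension mismatch raised in the comments below.
ret = T.sum(T.sum((AAT - identity) ** 2, axis=1), axis=2) ** 0.5
penalty = T.mean(ret)  # averaged over the batch dimension
```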
Comments

The Frobenius norm has a square root in its definition; the penalty in the paper uses the squared norm, which simply drops that square root.
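For reference, the quantities being discussed (the standard norm definition and the paper's penalty term):

$$\|M\|_F = \sqrt{\sum_{i,j} m_{ij}^2}, \qquad P = \left\|AA^\top - I\right\|_F^2$$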
Another issue I run into with this code is that the first sum operation reduces the number of dimensions to 2, but the outer sum is then taken over the no-longer-existing dimension 2. So either the order of the two sums should be reversed (inner sum over axis 2, outer sum over axis 1), or perhaps both axes should be reduced in a single sum. Also, are you saying the sqrt can be removed as an optimization?
Sorry for the late reply. You are right: the code in the first post should raise a dimension-mismatch error. And yes, I think the sqrt can be removed as an optimization.
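Putting both fixes together — reduce over both non-batch axes and drop the sqrt, since the paper penalizes the squared norm — a minimal Theano sketch (variable names are illustrative):

```python
import theano.tensor as T

A = T.tensor3('A')                             # (batch, r, n)
AAT = T.batched_dot(A, A.dimshuffle(0, 2, 1))  # (batch, r, r)
identity = T.eye(AAT.shape[1])

# Squared Frobenius norm per sample: a single reduction over both
# non-batch axes, and no square root.
penalty = T.mean(T.sum((AAT - identity) ** 2, axis=[1, 2]))
```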
In the appendix of your paper you mention a method called batched_dot. Can you show me the details of batched_dot?
The "batched_dot" is just the batched_dot() function in Theano.
Please refer to this part if you want to look into implementation details: https://github.com/hantek/SelfAttentiveSentEmbed/blob/master/util_layers.py#L353-L356 For the reason why not directly multiply |
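For anyone unfamiliar with it, Theano's batched_dot multiplies corresponding matrices along the leading axis; a small shape sketch (names illustrative):

```python
import theano
import theano.tensor as T

A = T.tensor3('A')  # (batch, r, n)
B = T.tensor3('B')  # (batch, n, m)
# out[i] = A[i].dot(B[i]) for every sample i, so out is (batch, r, m).
# A plain T.dot of two 3-D tensors would instead follow numpy's dot
# semantics and produce a 4-D tensor.
out = T.batched_dot(A, B)

f = theano.function([A, B], out)
```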
May I recommend:
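Presumably the recommendation is a reusable helper along these lines — a sketch under that assumption (hypothetical function name, not from the repo):

```python
import theano.tensor as T

def squared_frobenius(M):
    """Squared Frobenius norm of each matrix in a (batch, r, r) stack,
    averaged over the batch. Hypothetical helper, not from the repo."""
    return T.mean(T.sqr(M).sum(axis=[1, 2]))
```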