Add support for lecun normal weight initialization #2290
Comments
Could probably use the existing
would be interested in solving this if it's still open. Please elaborate if it is.
In
Hi @vortex73, would you be working on this?
@chiral-carbon Please go ahead and open a PR if you are willing to tackle this.
@darsnack thanks, will open a PR soon
@RohitRathore1 were you working on this? I had a PR in the works but will stop @darsnack
Hi @chiral-carbon, @darsnack. Is this issue still unclaimed? Can I start?
@Bhavay-2001 I had claimed this, but then a PR was opened soon after by someone else, so I'm not sure about the status. If this issue opens up again for a new PR, I would like to work on it.
Hi @chiral-carbon, I don't know. I opened a PR but haven't received any comments on it, and when I now check the logs of one of the GitHub Actions runs, the logs are no longer available. I will have to review it again.
I don't recall why that PR didn't get comments. Maybe someone was waiting for tests to be in place? Anyhow, that would be my feedback now. We can continue on the PR thread :) |
Motivation and description
LeCun normal initialization is needed (as far as I understand) to properly build self-normalizing neural networks.
Since Flux already provides the selu activation function and alpha dropout, it would be nice to have LeCun normal initialization built in as well.
Possible Implementation
Draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in), where fan_in is the number of input units in the weight tensor. (That description is from the TensorFlow documentation.)
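To make the description above concrete, here is a minimal sketch in plain Python. This is not Flux's actual implementation; the function name `lecun_normal` and the ±2σ truncation bound are assumptions, following the common convention used by TensorFlow's truncated-normal initializers:

```python
import math
import random

def lecun_normal(fan_in, fan_out, rng=random):
    """Sample a fan_out x fan_in weight matrix from a truncated normal
    distribution with mean 0 and stddev sqrt(1 / fan_in).

    Samples are truncated at +/- 2 standard deviations (an assumed
    convention; TensorFlow's truncated-normal initializers do the same)
    via simple rejection sampling.
    """
    std = math.sqrt(1.0 / fan_in)

    def draw():
        # Redraw until the sample falls within 2 stddevs of the mean.
        while True:
            x = rng.gauss(0.0, std)
            if abs(x) <= 2.0 * std:
                return x

    return [[draw() for _ in range(fan_in)] for _ in range(fan_out)]

# Example: weights for a layer with 100 inputs and 50 outputs.
W = lecun_normal(100, 50)
```

Because stddev shrinks as 1 / sqrt(fan_in), the variance of each layer's pre-activations stays roughly constant regardless of layer width, which is what the SELU self-normalization argument relies on.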