Replies: 2 comments 1 reply
-
It does seem an interesting idea in principle, though I haven't read about any canonical way of doing it.
-
Is it very complicated to just clip gradients in the standard update function, e.g.,
|
-
Hi,
I'm looking for a way to clip gradients based on their distribution (within a minibatch) at each time step, lowering the norm of only the extreme ones. Basically, I want to cap those gradients that lie outside some percentile at a given value, i.e. winsorizing (https://en.wikipedia.org/wiki/Winsorizing). I'm surprised this doesn't exist in most frameworks. Is there a principled way to do this, or should I write it myself? The function https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html has not been implemented in JAX yet.
Thanks in advance,
Mathis
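For illustration, the percentile-based clipping described in the question could be sketched in JAX roughly as follows. This is only a sketch under assumptions, not an existing JAX function: the names `winsorize` and `winsorize_grads`, the `lower`/`upper` percentile parameters, and the choice to compute the percentiles separately for each gradient array are all hypothetical. With `axis=None` the percentiles are taken over every entry of an array; with per-example gradients stacked along a leading batch axis, `axis=0` would take them across the minibatch instead.

```python
import jax
import jax.numpy as jnp


def winsorize(x, lower=5.0, upper=95.0, axis=None):
    """Clamp entries of x below/above the given percentiles (hypothetical helper)."""
    # keepdims=True so the bounds broadcast against x for any choice of axis.
    lo = jnp.percentile(x, lower, axis=axis, keepdims=True)
    hi = jnp.percentile(x, upper, axis=axis, keepdims=True)
    # Values inside [lo, hi] pass through unchanged; extremes are set to the bound.
    return jnp.clip(x, lo, hi)


def winsorize_grads(grads, lower=5.0, upper=95.0, axis=None):
    """Apply winsorize to every array in a gradient pytree (hypothetical helper)."""
    return jax.tree_util.tree_map(
        lambda g: winsorize(g, lower, upper, axis), grads
    )


# Toy gradient pytree with two extreme entries.
grads = {"w": jnp.array([-100.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 100.0])}
clipped = winsorize_grads(grads, lower=10.0, upper=90.0)
```

Something like this could be applied to the output of `jax.grad` before the optimizer update. Note that it clamps individual entries rather than rescaling whole gradient vectors, so it changes the gradient direction, which may or may not be what is wanted here.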