Currently our implemented constraints are:

- L2 (`L2` option on a layer)
- Some exotic things on activations (`darc1`, `spatial_smoothing`)

We already have the possibility to decouple the constraints from the normal loss computation, via `decouple_constraints`. In #1206, this behavior will change a bit: it will then decouple only the data-independent constraints, which currently means only L2.

L2 is equivalent to weight decay when SGD is used. With the new decoupled constraints code (#1206), it explicitly does:
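As a rough illustration of why L2 and weight decay coincide under plain SGD, here is a minimal sketch in pure Python. It is not the code from #1206. The function names, the `l2 * sum(w**2)` scaling of the penalty, and the choice to compute the decay from the pre-update weights are all assumptions made for this example.

```python
def sgd_step_with_l2_in_loss(w, grad, lr, l2):
    """SGD on loss + l2 * sum(w**2) (assumed scaling).

    The L2 term contributes 2 * l2 * w to the gradient.
    """
    return [wi - lr * (gi + 2.0 * l2 * wi) for wi, gi in zip(w, grad)]


def sgd_step_with_decoupled_l2(w, grad, lr, l2):
    """Plain SGD step on the data loss, then a separate decay update.

    The decay is computed from the weights before the gradient step
    (an assumption for this sketch), which makes both variants identical.
    """
    after_grad = [wi - lr * gi for wi, gi in zip(w, grad)]
    return [ai - lr * 2.0 * l2 * wi for ai, wi in zip(after_grad, w)]


# Hypothetical toy values, only for illustration.
w = [0.5, -1.0, 2.0]
grad = [0.1, 0.2, -0.3]
coupled = sgd_step_with_l2_in_loss(w, grad, lr=0.1, l2=0.01)
decoupled = sgd_step_with_decoupled_l2(w, grad, lr=0.1, l2=0.01)
```

With an adaptive optimizer such as Adam, the two variants would no longer match, because the L2 gradient would then pass through the optimizer's rescaling while the decoupled decay would not.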
We can generalize such updates, and allow the user to perform some generic post updates on parameters.
For example, in rwth-i6/returnn_common#241 it was suggested to extend L2 with a `decay_center` option. But instead of adding such an L2-specific option, we can allow the user to perform any custom post updates, similar to the code above. Then the user could easily implement such `decay_center` logic, and many other things as well.

Also related: rwth-i6/returnn_common#90
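To make the idea of custom post updates more concrete, the following is a minimal sketch in pure Python of how a `decay_center`-style update could be written as a user-supplied function. Nothing here is existing RETURNN API: `PostUpdate`, `make_decay_to_center`, `train_step`, and the `post_updates` mapping are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

Param = List[float]
# Hypothetical hook signature: (param after the optimizer step, lr) -> new param.
PostUpdate = Callable[[Param, float], Param]


def make_decay_to_center(decay: float, center: Param) -> PostUpdate:
    """Build a post update that pulls the parameter towards `center`.

    With center == 0 everywhere, this reduces to ordinary weight decay.
    """
    def post_update(param: Param, lr: float) -> Param:
        return [p - lr * decay * (p - c) for p, c in zip(param, center)]
    return post_update


def train_step(params: Dict[str, Param], grads: Dict[str, Param], lr: float,
               post_updates: Dict[str, PostUpdate]) -> Dict[str, Param]:
    """Plain SGD step, followed by the optional per-parameter post update."""
    new_params = {}
    for name, param in params.items():
        updated = [p - lr * g for p, g in zip(param, grads[name])]
        hook = post_updates.get(name)
        if hook is not None:
            updated = hook(updated, lr)
        new_params[name] = updated
    return new_params


# Hypothetical toy values: zero gradients, so only the post update acts.
params = {"w": [1.0, 3.0]}
grads = {"w": [0.0, 0.0]}
hooks = {"w": make_decay_to_center(decay=0.5, center=[1.0, 1.0])}
stepped = train_step(params, grads, lr=0.1, post_updates=hooks)
```

In this toy run, the first weight already sits at its center and stays put, while the second one moves from 3.0 towards 1.0. The same hook mechanism would cover plain weight decay, clipping, or renormalization without any L2-specific options.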
What would the API look like on the RETURNN side? It's maybe also OK to only do this for the `VariableLayer`.