Generalized constraints: post update hooks #1214

albertz · 2022-11-14T09:49:10Z

Currently our implemented constraints are:

L2 on weights (L2 option on a layer)
Some exotic things on activations (darc1, spatial_smoothing)

We can already decouple the constraints from the normal loss computation via decouple_constraints. In #1206, this behavior will change a bit: it will then decouple only the data-independent constraints, which currently means only L2.

L2 is equivalent to weight decay when SGD is used. With the new decoupled constraints code (#1206), it explicitly does:

                return var.assign_sub(var * (l2 * 2.), use_locking=self.use_locking, read_value=False)
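To make the equivalence concrete, here is a minimal sketch in plain Python on a single scalar weight. It is not the RETURNN code. The function names and the explicit `lr` factor are assumptions for illustration, since the snippet above does not show whether the learning rate is already folded into `l2`.

```python
def sgd_step_with_l2_loss(w, grad, lr, l2):
    """One SGD step on loss + l2 * w**2. The L2 term adds 2 * l2 * w to the gradient."""
    return w - lr * (grad + 2.0 * l2 * w)


def sgd_step_then_decay(w, grad, lr, l2):
    """Plain SGD step on the loss alone, plus a separate decay update.

    The decay is computed from the value before the gradient step
    (assumption made here so that both variants match exactly).
    """
    decay = w * (lr * l2 * 2.0)
    return (w - lr * grad) - decay


w, grad, lr, l2 = 1.5, 0.3, 0.1, 0.01
a = sgd_step_with_l2_loss(w, grad, lr, l2)
b = sgd_step_then_decay(w, grad, lr, l2)
print(abs(a - b) < 1e-12)
```

If the decay is instead computed from the already updated value, the two variants differ by a term proportional to lr squared, which is usually small but not zero.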

We can generalize such updates, and allow the user to perform some generic post updates on parameters.

For example, in rwth-i6/returnn_common#241 it was suggested to extend L2 with some decay_center option. But instead of adding such an L2-specific option, we can allow the user to perform any custom post updates, similar to the code above. Then the user could easily implement such decay_center logic, and many other things as well.
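As a rough sketch of what such a user-defined post update could look like, the following plain-Python example decays a parameter toward a center value instead of toward zero. Everything here is hypothetical: `make_center_decay`, `apply_post_updates`, the `hooks` dict, and the reading of decay_center as "shrink toward this value" are assumptions for illustration, not an existing RETURNN API.

```python
from typing import Callable, Dict, List

Param = List[float]
PostUpdate = Callable[[Param], Param]  # hypothetical hook type: old values -> new values


def make_center_decay(center: float, rate: float) -> PostUpdate:
    """Build a post update that moves each value toward `center` by a fraction `rate`."""

    def post_update(values: Param) -> Param:
        # With center = 0 this reduces to ordinary weight decay: v - v * rate.
        return [v - (v - center) * rate for v in values]

    return post_update


def apply_post_updates(params: Dict[str, Param], hooks: Dict[str, PostUpdate]) -> Dict[str, Param]:
    """Run the registered hook for each parameter; parameters without a hook are left unchanged."""
    return {name: hooks[name](values) if name in hooks else values for name, values in params.items()}


# Would run after the optimizer step, once per update.
params = {"scale": [1.2, 0.8], "bias": [0.5]}
hooks = {"scale": make_center_decay(center=1.0, rate=0.1)}
params = apply_post_updates(params, hooks)
print(params)
```

Because the hook is just a function of the parameter, the same mechanism could cover other post updates, such as clipping or renormalization, without adding a dedicated option for each.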

Also related: rwth-i6/returnn_common#90

What would the API look like on the RETURNN side? It may also be OK to support this only for the VariableLayer.


This was referenced Nov 14, 2022

How to have custom updates for parameters rwth-i6/returnn_common#90

Open

Weight decay API maybe unintuitive rwth-i6/returnn_common#241

Open
