This repository contains the source code for the ICLR 2022 paper *How many degrees of freedom do we need to train deep networks: a loss landscape perspective* by Brett W. Larsen, Stanislav Fort, Nic Becker, and Surya Ganguli (arXiv version).
This code was developed and tested using JAX v0.1.74, JAXlib v0.1.52, and Flax v0.2.0. The authors intend to update the repository in the future with additional versions of the scripts that work with the `flax.linen` module.
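If you want to reproduce this environment, pinning those versions is the most reliable option, e.g. something along the lines of `pip install jax==0.1.74 jaxlib==0.1.52 flax==0.2.0` (exact wheel availability, particularly GPU builds of `jaxlib`, may depend on your platform and CUDA version).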
- `burn_in_subspace.py`: Script for random affine subspace and burn-in affine subspace experiments. To use random affine subspaces, set the parameter `init_iters` to 0 (see the sketch below the file listing).
- `lottery_subspace.py`: Script for lottery subspace experiments
- `lottery_ticket.py`: Script for lottery ticket experiments

- `architectures.py`: Model files
- `data_utils.py`: Functions for saving out data
- `generate_data.py`: Functions to set up datasets for training
- `logging_tools.py`: Setup for the logger; generates an automatic experiment name with a timestamp
- `training_utils.py`: Functions related to projecting to and training in a subspace
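The subspace experiments all rest on the same affine reparameterization: the full weight vector is written as theta = theta_0 + P w, where P is a fixed projection matrix (random in the affine-subspace experiments) and only the d-dimensional coordinates w are trained; for burn-in affine subspaces, theta_0 is taken after `init_iters` steps of ordinary full-parameter training, and `init_iters = 0` recovers the random affine subspace case. The snippet below is a minimal, self-contained JAX sketch of this idea on a toy regression problem; it is illustrative only, all names in it are hypothetical, and it does not reproduce the repository's `training_utils.py`.

```python
# Minimal, illustrative sketch of training in a d-dimensional affine subspace:
#   theta = theta_0 + P @ w,   with only w optimized.
# All names here are hypothetical; this is not the repository's implementation.
import jax
import jax.numpy as jnp

D, d = 100, 10                     # full parameter count, subspace dimension
LR = 1e-2

key = jax.random.PRNGKey(0)
k_theta, k_proj, k_data, k_target = jax.random.split(key, 4)

theta_0 = jax.random.normal(k_theta, (D,))            # anchor point (after burn-in, if any)
P = jax.random.normal(k_proj, (D, d)) / jnp.sqrt(D)   # fixed random directions, ~unit-norm columns
X = jax.random.normal(k_data, (32, D))                # toy regression inputs
y = X @ jax.random.normal(k_target, (D,))             # toy regression targets

def loss_fn(w):
    # Map the d subspace coordinates back to full weights, then evaluate
    # a toy linear-regression loss at those weights.
    theta = theta_0 + P @ w
    return jnp.mean((X @ theta - y) ** 2)

@jax.jit
def update(w):
    # Gradients are taken with respect to the d coordinates w only;
    # theta_0 and P stay fixed throughout training.
    return w - LR * jax.grad(loss_fn)(w)

w = jnp.zeros(d)                   # w = 0 starts training exactly at theta_0
for _ in range(500):
    w = update(w)

print("subspace-trained loss:", loss_fn(w))
```

In the repository's experiments the analogous quantities live in the flattened parameter space of an actual network with a real dataset, and the burn-in variant obtains theta_0 by training the full parameters for `init_iters` steps before switching to the subspace parameterization.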
    @inproceedings{LaFoBeGa22,
      title={How many degrees of freedom do we need to train deep networks: a loss landscape perspective},
      author={Brett W. Larsen and Stanislav Fort and Nic Becker and Surya Ganguli},
      booktitle={International Conference on Learning Representations},
      year={2022},
      url={https://openreview.net/forum?id=ChMLTGRjFcU}
    }