# neural-net-regression

This repository contains the code used to optimize the weights of an under-parameterized neural network. Training is done via gradient flow using MLPGradientFlow.jl (https://arxiv.org/abs/2301.10638). In this repo, we release the code to train, and to visualize the results of training, for the erf activation function and standard Gaussian input data (https://arxiv.org/abs/2311.01644).
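For background, erf units with standard Gaussian inputs are analytically convenient because pairwise correlations of erf features have a classical closed form,

$$\mathbb{E}_{x\sim\mathcal{N}(0,I_d)}\!\left[\operatorname{erf}(a^\top x)\,\operatorname{erf}(b^\top x)\right]=\frac{2}{\pi}\arcsin\!\left(\frac{2\,a^\top b}{\sqrt{(1+2\|a\|^2)(1+2\|b\|^2)}}\right),$$

which makes the population loss of such networks exactly computable.

To make the setup concrete, here is a minimal NumPy sketch of a teacher-student erf regression trained by an Euler discretization of gradient flow. This is an illustration only, not the repo's MLPGradientFlow.jl pipeline: the unit second-layer weights, step size, batch size, and initialization scale are all assumptions of the sketch.

```python
# Illustrative teacher-student erf regression (NOT the repo's pipeline):
# a width-n erf student fit to a width-k erf teacher on standard Gaussian
# inputs, with gradient flow approximated by small-step gradient descent
# on a fresh-batch empirical loss.
import numpy as np
from math import erf, pi

erf_v = np.vectorize(erf)  # elementwise erf on arrays

def forward(W, X):
    # Network output sum_i erf(w_i . x) for each row x of X
    # (second-layer weights fixed to 1, an assumption of this sketch).
    return erf_v(X @ W.T).sum(axis=1)

def grad(W, X, residual):
    # Gradient of 0.5 * mean(residual^2) w.r.t. the rows of W,
    # using erf'(z) = (2 / sqrt(pi)) * exp(-z^2).
    Z = X @ W.T                              # (batch, n) preactivations
    G = (2.0 / np.sqrt(pi)) * np.exp(-Z**2)  # erf' at each preactivation
    return (G * residual[:, None]).T @ X / len(X)

rng = np.random.default_rng(0)
d, n, k = 10, 25, 50                         # input dim, student width, teacher width
W_teacher = rng.standard_normal((k, d)) / np.sqrt(d)
W = rng.standard_normal((n, d)) / np.sqrt(d)  # the initialization direction matters

lr, batch = 1e-2, 1024
for step in range(5001):
    X = rng.standard_normal((batch, d))      # fresh standard Gaussian inputs
    residual = forward(W, X) - forward(W_teacher, X)
    W -= lr * grad(W, X, residual)           # one Euler step of gradient flow
    if step % 1000 == 0:
        print(step, 0.5 * np.mean(residual**2))
```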

## Files

- This file: `README.md`
- Simulation file: `erf50/erf_sims.jl`
- Script to plot the loss curves and gradient norms: `plot-training.py`
- Script to plot the summary of training for all widths: `plot-training-summary.py`
- Script to visualize the weights at convergence: `plot-results.py`
- Helper functions: `helper.py`

## Dependencies

To visualize the results using Python, as done in this repo, you need to install the following (e.g., via `pip`, as shown below):

- `juliacall`
- `numpy`
- `matplotlib`
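For example: `pip install juliacall numpy matplotlib`. Note that `juliacall` bridges to a Julia installation, so the Julia side presumably also needs MLPGradientFlow.jl available for the scripts that load training results.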

## Results

We find that gradient flow converges to one of two minima, depending on the direction of initialization, when the student width is about one half of the teacher width.

We plot the results for student width $n=25$ and teacher width $k=50$ below.

*Figure: weight configurations at convergence (configs).*

*Figure: loss curves, 25 students / 50 teachers (loss_curves_25stud_50teach).*

*Figure: gradient norms, 25 students / 50 teachers (grnoms_25stud_50teach).*
