# neural-net-regression

This repository contains the code used to optimize the weights of an under-parameterized neural network. Training is done via gradient flow using MLPGradientFlow.jl (https://arxiv.org/abs/2301.10638). In this repo, we release the code to train, and to visualize the results of training, for the erf activation function and standard Gaussian input data (https://arxiv.org/abs/2311.01644).
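For background, erf units with standard Gaussian inputs are analytically convenient because pairwise correlations of erf features have a classical closed form,

$$\mathbb{E}_{x\sim\mathcal{N}(0,I_d)}\!\left[\operatorname{erf}(a^\top x)\,\operatorname{erf}(b^\top x)\right]=\frac{2}{\pi}\arcsin\!\left(\frac{2\,a^\top b}{\sqrt{(1+2\|a\|^2)(1+2\|b\|^2)}}\right),$$

which makes the population loss of such networks exactly computable.

To make the setup concrete, here is a minimal NumPy sketch of a teacher-student erf regression trained by an Euler discretization of gradient flow. This is an illustration only, not the repo's MLPGradientFlow.jl pipeline: the unit second-layer weights, step size, batch size, and initialization scale are all assumptions of the sketch.

```python
# Illustrative teacher-student erf regression (NOT the repo's pipeline):
# a width-n erf student fit to a width-k erf teacher on standard Gaussian
# inputs, with gradient flow approximated by small-step gradient descent
# on a fresh-batch empirical loss.
import numpy as np
from math import erf, pi

erf_v = np.vectorize(erf)  # elementwise erf on arrays

def forward(W, X):
    # Network output sum_i erf(w_i . x) for each row x of X
    # (second-layer weights fixed to 1, an assumption of this sketch).
    return erf_v(X @ W.T).sum(axis=1)

def grad(W, X, residual):
    # Gradient of 0.5 * mean(residual^2) w.r.t. the rows of W,
    # using erf'(z) = (2 / sqrt(pi)) * exp(-z^2).
    Z = X @ W.T                              # (batch, n) preactivations
    G = (2.0 / np.sqrt(pi)) * np.exp(-Z**2)  # erf' at each preactivation
    return (G * residual[:, None]).T @ X / len(X)

rng = np.random.default_rng(0)
d, n, k = 10, 25, 50                         # input dim, student width, teacher width
W_teacher = rng.standard_normal((k, d)) / np.sqrt(d)
W = rng.standard_normal((n, d)) / np.sqrt(d)  # the initialization direction matters

lr, batch = 1e-2, 1024
for step in range(5001):
    X = rng.standard_normal((batch, d))      # fresh standard Gaussian inputs
    residual = forward(W, X) - forward(W_teacher, X)
    W -= lr * grad(W, X, residual)           # one Euler step of gradient flow
    if step % 1000 == 0:
        print(step, 0.5 * np.mean(residual**2))
```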

## Files

- This file: `README.md`
- Simulation file: `erf50/erf_sims.jl`
- Script to plot the loss curves and gradient norms: `plot-training.py`
- Script to plot the summary of training for all widths: `plot-training-summary.py`
- Script to visualize the weights at convergence: `plot-results.py`
- Helper functions: `helper.py`

## Dependencies

To visualize the results using Python, as done in this repo, you need to install the following (e.g., via `pip`, as shown below):

- `juliacall`
- `numpy`
- `matplotlib`
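For example: `pip install juliacall numpy matplotlib`. Note that `juliacall` bridges to a Julia installation, so the Julia side presumably also needs MLPGradientFlow.jl available for the scripts that load training results.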

## Results

We find that gradient flow converges to one of two minima, depending on the direction of initialization, when the student width is about one half of the teacher width.

We plot the results for student width $n=25$ and teacher width $k=50$ below.

*Figure: weight configurations at convergence (configs).*

*Figure: loss curves, 25 students / 50 teachers (loss_curves_25stud_50teach).*

*Figure: gradient norms, 25 students / 50 teachers (grnoms_25stud_50teach).*
