Double Descent (MLX)


Double Descent: a phenomenon in machine learning where test error first falls, then rises as the model approaches the interpolation threshold (where the number of parameters approaches the number of samples), and then falls again, improving generalization, in the highly overparameterized regime. The term comes from Belkin et al. (2019), and this repo is inspired by Schaeffer et al. (2023).

Background

We consider the case of modeling $f: x \mapsto y$ where $y = 2x + \cos \left(25x / \sin x\right)$ using polynomial regression. While we could fit this with gradient descent (SGD or Adam), we observe double descent consistently with the Moore–Penrose pseudoinverse solution to ordinary least squares.
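The snippet below is a minimal, self-contained sketch of that setup in NumPy (the repo itself uses MLX): it fits polynomials of increasing degree to a small noisy sample of the target function using the pseudoinverse. The sample count, noise level, and degrees are assumptions made for illustration, not the repo's defaults.

```python
import numpy as np

def target(x):
    # Ground-truth function from the Background section.
    return 2 * x + np.cos(25 * x / np.sin(x))

def poly_features(x, degree):
    # Design matrix with columns [1, x, x^2, ..., x^degree].
    return np.stack([x**k for k in range(degree + 1)], axis=1)

rng = np.random.default_rng(0)
x_train = rng.uniform(0.1, 3.0, size=20)   # 20 training samples (assumed count)
y_train = target(x_train) + rng.normal(scale=0.1, size=x_train.shape)
x_test = rng.uniform(0.1, 3.0, size=200)
y_test = target(x_test)

for degree in (3, 10, 19, 30):             # degree 19 ~ interpolation threshold here
    X = poly_features(x_train, degree)
    # Minimum-norm ordinary-least-squares fit via the Moore–Penrose pseudoinverse.
    w = np.linalg.pinv(X) @ y_train
    test_mse = np.mean((poly_features(x_test, degree) @ w - y_test) ** 2)
    print(f"degree={degree:3d}  test MSE={test_mse:.3e}")
```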

Running

Run with default params and save the result in media/polynomial_*.png:

python polynomial.py

  • polynomial.py: training and evaluation loops
  • optimizers.py: sgd, adam, ols (pseudoinverse), ...
  • metrics.py: mse, rmse, r2, ... (a sketch follows below)
  • data.py: generate the dataset
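For reference, here is a hedged sketch of what the mse, rmse, and r2 metrics could look like in MLX; the actual metrics.py may be organized differently.

```python
import mlx.core as mx

def mse(y_true, y_pred):
    # Mean squared error.
    return mx.mean(mx.square(y_true - y_pred))

def rmse(y_true, y_pred):
    # Root mean squared error.
    return mx.sqrt(mse(y_true, y_pred))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = mx.sum(mx.square(y_true - y_pred))
    ss_tot = mx.sum(mx.square(y_true - mx.mean(y_true)))
    return 1.0 - ss_res / ss_tot

y_true = mx.array([1.0, 2.0, 3.0])
y_pred = mx.array([1.1, 1.9, 3.2])
print(mse(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```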

Dependencies

Install the dependencies (optimized for Apple silicon; yay for MLX!):

pip install -r requirements.txt