Double descent: a phenomenon in machine learning where test error first falls, then rises as the number of model parameters approaches the number of training samples (the interpolation threshold), and then falls again, with generalization improving in the highly overparameterized regime. The term was coined by Belkin et al. (2019); this repo is inspired by Schaeffer et al. (2023).
We consider the case of modeling noisy data with polynomial regression.
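To make the phenomenon concrete, here is a minimal self-contained sketch (not this repo's code): the target function, noise level, seed, and degree grid are all made up for illustration, and NumPy's minimum-norm least squares stands in for the `ols` optimizer described below. Test error typically spikes where degree + 1 ≈ number of training samples and falls again past it:

```python
# Minimal double-descent sketch (illustrative only; not this repo's code).
import numpy as np

rng = np.random.default_rng(0)     # hypothetical seed
n_train = 10                       # interpolation threshold: degree + 1 == 10

def target(x):
    return 2 * x * np.cos(3 * x)   # made-up ground-truth function

x_train = rng.uniform(-1, 1, n_train)
y_train = target(x_train) + rng.normal(0, 0.1, n_train)
x_test = np.linspace(-1, 1, 200)
y_test = target(x_test)

for degree in [1, 3, 6, 9, 12, 20, 50, 100]:
    # Legendre features keep the design matrix well conditioned on [-1, 1].
    A_train = np.polynomial.legendre.legvander(x_train, degree)
    A_test = np.polynomial.legendre.legvander(x_test, degree)
    # lstsq returns the minimum-norm solution once the system is
    # underdetermined (degree + 1 > n_train), i.e. past the threshold.
    w, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    test_mse = float(np.mean((A_test @ w - y_test) ** 2))
    print(f"degree={degree:4d}  test mse={test_mse:.4g}")
```

Legendre features are used here instead of raw monomials so the design matrix stays numerically tractable at high degree on [-1, 1].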
Run with default params and save the result in media/polynomial_*.png:

python polynomial.py
- polynomial.py: training and evaluation loops
- optimizers.py: sgd, adam, ols (pseudoinverse), ... (see the sketch after this list)
- metrics.py: mse, rmse, r2, ...
- data.py: generate the dataset
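For orientation, here is a guess at the shape of the `ols` optimizer and the metrics: NumPy stand-ins only, since the repo itself targets MLX, and the actual signatures may differ.

```python
# Hypothetical NumPy stand-ins for optimizers.py's ols and metrics.py's
# mse/rmse/r2 -- the repo uses MLX, and its exact implementations may differ.
import numpy as np

def ols(A: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Closed-form least squares via the Moore-Penrose pseudoinverse.

    When A has more columns than rows (overparameterized), this returns
    the minimum-norm interpolating solution.
    """
    return np.linalg.pinv(A) @ y

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.sqrt(mse(y_true, y_pred)))

def r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Coefficient of determination: 1 - residual SS / total SS."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

The minimum-norm property of the pseudoinverse solution is what makes the second descent possible: among all interpolating fits, `ols` picks the smallest-norm one.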
Install the dependencies (optimized for Apple silicon; yay for MLX!):
pip install -r requirements.txt