The High-Dimensional Geometry of Binary Neural Networks

This repository contains code to reproduce the results in the following paper:

Alexander G. Anderson, Cory P. Berg

Abstract

Recent research has shown that one can train a neural network with binary weights and activations at train time by augmenting the weights with a high-precision continuous latent variable that accumulates small changes from stochastic gradient descent. However, there is a dearth of work to explain why one can effectively capture the features in data with binary weights and activations. Our main result is that the neural networks with binary weights and activations trained using the method of Courbariaux, Hubara et al. (2016) work because of the high-dimensional geometry of binary vectors. In particular, the ideal continuous vectors that extract out features in the intermediate representations of these BNNs are well-approximated by binary vectors in the sense that dot products are approximately preserved. Compared to previous research that demonstrated good classification performance with BNNs, our work explains why these BNNs work in terms of HD geometry. Furthermore, the results and analysis used on BNNs are shown to generalize to neural networks with ternary weights and activations. Our theory serves as a foundation for understanding not only BNNs but a variety of methods that seek to compress traditional neural networks. Furthermore, a better understanding of multilayer binary neural networks serves as a starting point for generalizing BNNs to other neural network architectures such as recurrent neural networks.

Quickstart:

Download the relevant datasets. From the data directory, run:

python download_data.py --data_set MNIST

python download_data.py --data_set cifar-10

(Optional) Run a quick test: python bnn_test.py

Run a longer test: python run_bnn.py --debug (trains a network on a small fraction of CIFAR10)

Train a full network: python run_bnn.py
Visualize a trained network: jupyter notebook gen_plots.ipynb
Analyze quantization of HD vectors assuming a Gaussian distribution: juypter notebook hd_figures.ipynb

Be sure to choose the appropriate path for loading the model.

[Note if the parameters are changed in run_bnn.py, they need to be changed in the visualization script as well.]

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
data		data
.gitignore		.gitignore
README.md		README.md
bnn.py		bnn.py
bnn_test.py		bnn_test.py
gen_plots.ipynb		gen_plots.ipynb
graph_ops.py		graph_ops.py
hd_figures.ipynb		hd_figures.ipynb
plotting.py		plotting.py
run_bnn.py		run_bnn.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

The High-Dimensional Geometry of Binary Neural Networks

Quickstart:

About

Releases

Packages

Languages

alexanderganderson/bnn_vis

Folders and files

Latest commit

History

Repository files navigation

The High-Dimensional Geometry of Binary Neural Networks

Quickstart:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages