SGD and LSTM #105

Open
wants to merge 129 commits into master

129 commits
577e439
The world was created in one day.
maciejkula Dec 26, 2017
4b8d2ed
Update manifest.
maciejkula Dec 26, 2017
af83d51
Update reamde.
maciejkula Dec 26, 2017
caf6fd9
Spelling.
maciejkula Dec 26, 2017
ea6ca06
More spelling.
maciejkula Dec 26, 2017
ff17b0e
Hide ParameterNode fields.
maciejkula Dec 26, 2017
a1b72b8
Spelling.
maciejkula Dec 26, 2017
9b91786
Add scalar sum and ln nodes.
maciejkula Dec 30, 2017
5dbbe69
Add node reuse benchmark.
maciejkula Dec 31, 2017
85f3a29
Easier tests, numerically.
maciejkula Jan 1, 2018
984306c
Merge pull request #1 from maciejkula/ci
maciejkula Jan 1, 2018
1e39796
Start not repeating work.
maciejkula Dec 30, 2017
149f3a0
Finish adding pass marking.
maciejkula Dec 31, 2017
9919338
Add node reuse benchmark.
maciejkula Dec 31, 2017
5e20898
Rebase on master and fix new nodes.
maciejkula Jan 1, 2018
00b66fb
Add instructions on using BLAS.
maciejkula Jan 2, 2018
16688ca
Merge pull request #2 from maciejkula/lazy
maciejkula Jan 2, 2018
99c3a11
Ignore code section in docs.
maciejkula Jan 2, 2018
bbc3a54
Merge pull request #3 from maciejkula/blas_optim
maciejkula Jan 2, 2018
b3d3636
Bump version to 0.3.0.
maciejkula Jan 2, 2018
78cb887
Fix travis badge in Cargo.toml.
maciejkula Jan 2, 2018
5a75d4c
Push 0.3.1.
maciejkula Jan 3, 2018
e64983b
Extract parameters of a graph from its root.
maciejkula Jan 6, 2018
4cf2410
Add stack/concat node.
maciejkula Jan 6, 2018
e710cb5
Add tanh node.
maciejkula Jan 6, 2018
d533bc3
Start adding LSTM cells.
maciejkula Jan 6, 2018
c7599cc
Initial LSTM cell tests.
maciejkula Jan 6, 2018
414330c
Add pi prediction test.
maciejkula Jan 6, 2018
d496c44
Try to fuse the LSTM nodes.
maciejkula Jan 7, 2018
13ece9f
Box nodes to manage type complexity.
maciejkula Jan 7, 2018
ca7e780
Update tests.
maciejkula Jan 7, 2018
e586ffd
Solve problems with zeroing gradients in deep graphs.
maciejkula Jan 8, 2018
d24fa4a
Merge pull request #4 from maciejkula/boxed_nodes
maciejkula Jan 8, 2018
f2c4cf8
Add docs for LSTM layers.
maciejkula Jan 10, 2018
2272edf
Head off potential UB in HogwildParameters.
maciejkula Jan 10, 2018
362d8ea
Merge pull request #5 from maciejkula/lstm_docs
maciejkula Jan 10, 2018
43c6312
Bump to 0.4.0.
maciejkula Jan 10, 2018
6a3078f
Speed up softmax.
maciejkula Jan 13, 2018
d3f980f
Move layers to nn
maciejkula Jan 13, 2018
c4799a9
Add log-softmax node.
maciejkula Jan 13, 2018
dd96662
Add sparse cross entropy loss.
maciejkula Jan 13, 2018
bd5095a
Merge pull request #6 from maciejkula/cross_entropy
maciejkula Jan 13, 2018
485d687
Bump to 0.5.0.
maciejkula Jan 13, 2018
79a0cfc
Improve LSTM layer ergonomics.
maciejkula Jan 13, 2018
2b0914f
Add gemv/gemm dispatch optimizations.
maciejkula Jan 14, 2018
9762e44
Merge pull request #7 from maciejkula/layer_ergonomics
maciejkula Jan 14, 2018
fa9fdb9
Merge pull request #8 from maciejkula/mat_vec_optimizations
maciejkula Jan 14, 2018
6f9f1c5
Bump to 0.6.0.
maciejkula Jan 15, 2018
3014ac9
Add fast-math option to enable fast approximations for transcendental…
maciejkula Jan 17, 2018
69bb643
Merge pull request #9 from maciejkula/fast-math
maciejkula Jan 17, 2018
3ff6944
Bump version.
maciejkula Jan 17, 2018
62a884a
Use more robust approximate log function.
maciejkula Jan 17, 2018
56b44a6
Bump version to 0.7.1.
maciejkula Jan 17, 2018
98bdb4b
Optimize elementwise op nodes.
maciejkula Jan 18, 2018
b99e300
More accurate fastexp function.
maciejkula Jan 19, 2018
c9796ad
Remove unused code.
maciejkula Jan 19, 2018
487d7aa
Merge pull request #10 from maciejkula/optim_elemwise
maciejkula Jan 20, 2018
87246ee
Accumulate gradients in DotNodes.
maciejkula Jan 20, 2018
1999e54
Merge pull request #11 from maciejkula/accumulate_gradients_in_dot
maciejkula Jan 20, 2018
69533de
Bump version.
maciejkula Jan 21, 2018
c4a44cf
Add an adagrad implementation.
maciejkula Jan 25, 2018
81da851
Merge pull request #12 from maciejkula/adagrad
maciejkula Jan 26, 2018
c83e60b
Don't use fast numerics for pow and sigmoid.
maciejkula Feb 26, 2018
6871e45
Fix hogwild param safety.
maciejkula Feb 28, 2018
6df6ce6
Merge pull request #13 from maciejkula/fix_hogwild_safety
maciejkula Mar 1, 2018
d06cb35
Sensibly clone LSTM parameter objects.
maciejkula Mar 4, 2018
2a33542
Merge pull request #14 from maciejkula/clone_for_params
maciejkula Mar 4, 2018
1d318ca
Start removing simd to make things run on stable.
maciejkula Apr 19, 2018
c91a6df
Start working on moving benchmarks out.
maciejkula Apr 19, 2018
e1ae5a8
Continue with benches, add relu, add arith ops for float constants.
maciejkula Apr 21, 2018
2b375eb
Add optional gradient clipping.
maciejkula Apr 22, 2018
bb2060c
Conform to standard agagrad impls.
maciejkula Apr 24, 2018
5563b70
Fix L2 regularization.
maciejkula Apr 25, 2018
d88c597
Fix sub operations.
maciejkula Apr 27, 2018
a48b5d2
Speed up tanh layer.
maciejkula Apr 29, 2018
5b499a0
Fix L2 reg in adagrad.
maciejkula Apr 29, 2018
dc4507d
Switch LSTM init to uniform.
maciejkula Apr 29, 2018
d0919c6
Add synchronization barrier.
maciejkula Apr 30, 2018
178aa50
Integrate Adagrad barrier lock into step.
maciejkula Apr 30, 2018
b9b01df
Add Adam. Fix backprop for sub and add. Add synchronous training.
maciejkula May 4, 2018
8fd7ebd
Add sparse gradient finite difference checking.
maciejkula May 5, 2018
7adb671
Make finite diff tolerance tighter.
maciejkula May 5, 2018
ac9e7ae
Decrease eps for adagrad.
maciejkula May 10, 2018
361b36a
Number of updates in adam now global.
maciejkula May 12, 2018
7fac766
Bump to rand 0.5.
maciejkula May 13, 2018
8b06d56
Make sure to use slice-based ops on arrays.
maciejkula May 13, 2018
42f0812
Get rid of homegrown vec-mat mul impl.
maciejkula May 14, 2018
1eeb298
Modify sparse gradient store.
maciejkula May 14, 2018
2977490
Use get_or_insert_with for dense gradients and grad weights in backward.
maciejkula May 15, 2018
3ad42dd
Add faster array slicing.
maciejkula May 15, 2018
4721f1d
Barrier on entry and exit to parameter update section.
maciejkula May 17, 2018
6bc7203
Move all optimizers into the `optim` module.
maciejkula May 20, 2018
2cd882a
Bump version.
maciejkula May 20, 2018
6c89cf3
Remove mentions of nightly compiler from readme.
maciejkula May 20, 2018
87c7e60
Fix all instances of non-full backprop.
maciejkula May 20, 2018
7ae96e2
Merge pull request #15 from maciejkula/no_simd
maciejkula May 20, 2018
fdfae27
Incorporate clippy suggestions.
maciejkula May 21, 2018
bad114b
Bump rand to 0.5.0.
maciejkula May 22, 2018
bc0099a
Merge pull request #16 from maciejkula/clippy
maciejkula May 22, 2018
6e2dcff
Add slice nodes.
maciejkula May 27, 2018
d11a5c6
Fix compile on 1.25.0.
maciejkula May 27, 2018
3ffe603
Merge pull request #17 from maciejkula/slice_ops
maciejkula May 27, 2018
465eeb3
Bump version to 0.8.1.
maciejkula May 27, 2018
fcfdc78
Make sure sparse indices are order-independent.
maciejkula May 28, 2018
ab17d2d
Major overhaul of optimizers to support true sync parallelism.
maciejkula May 28, 2018
6e13189
Ergonomics fixes for synchronized optimizers.
maciejkula May 29, 2018
239c378
Hashmap-based sparse accumulators.
maciejkula May 29, 2018
6c2e0a9
Add hibitset accumulator.
maciejkula May 30, 2018
220bb17
Use hierarchical bitsets for sparse gradient accumulation.
maciejkula May 30, 2018
cb77c9a
Slight cleanup.
maciejkula May 31, 2018
c9cd44d
First step to removing gradient zeroing.
maciejkula May 31, 2018
6845d08
Eliminate manual zero_gradient calls.
maciejkula May 31, 2018
7ee4bfa
Working deterministic update order.
maciejkula May 31, 2018
43b7817
Clean up optimizer implementation.
maciejkula May 31, 2018
629f94d
Add default impls for optimizers. Add deny_missing_docs.
maciejkula Jun 1, 2018
751ab7e
Add coupled update-forget gates to LSTM.
maciejkula Jun 1, 2018
7bb1b47
Better gradient accumulation in dot nodes.
maciejkula Jun 1, 2018
dcc9e74
Merge pull request #18 from maciejkula/optimizer-overhaul
maciejkula Jun 2, 2018
b2ac5a0
Bump to v0.9.0.
maciejkula Jun 2, 2018
6272ad8
Fix warnings and bump to 0.9.1.
maciejkula Jun 2, 2018
fed37e9
Add missing debug impls.
maciejkula Jun 21, 2018
954b86a
Merge pull request #22 from maciejkula/missing_debug_impls
maciejkula Jun 21, 2018
ec8cf1f
move into amadeus-ml
alecmocatta Aug 7, 2020
23b9bcd
Merge 'wyrm/master'
alecmocatta Aug 7, 2020
6334be0
update deps & clippy clean
alecmocatta Aug 8, 2020
28e3c2e
fix flaky tests
alecmocatta Aug 8, 2020
1e11c78
Merge 'origin/master' into ml
alecmocatta Aug 8, 2020
2e626fd
add ml to CI
alecmocatta Aug 8, 2020
45e319c
fix a stack overflow in CI on windows
alecmocatta Aug 8, 2020
7 changes: 5 additions & 2 deletions Cargo.toml
@@ -30,17 +30,20 @@ parquet = ["amadeus-parquet", "amadeus-derive/parquet"]
 postgres = ["amadeus-postgres", "amadeus-derive/postgres"]
 csv = ["amadeus-serde", "amadeus-derive/serde"]
 json = ["amadeus-serde", "amadeus-derive/serde"]
+ml = ["amadeus-ml"]
+ml-openblas = ["amadeus-ml/openblas"]
 bench = ["serde-csv", "once_cell", "arrow-parquet", "rayon"]

 [package.metadata.docs.rs]
-features = ["constellation", "aws", "commoncrawl", "parquet", "postgres", "csv", "json"]
+features = ["constellation", "aws", "commoncrawl", "parquet", "postgres", "csv", "json", "ml"]

 [dependencies]
 amadeus-core = { version = "=0.4.1", path = "amadeus-core" }
 amadeus-derive = { version = "=0.4.1", path = "amadeus-derive" }
 amadeus-types = { version = "=0.4.1", path = "amadeus-types" }
 amadeus-aws = { version = "=0.4.1", path = "amadeus-aws", optional = true }
 amadeus-commoncrawl = { version = "=0.4.1", path = "amadeus-commoncrawl", optional = true }
+amadeus-ml = { version = "=0.4.1", path = "amadeus-ml", optional = true }
 amadeus-parquet = { version = "=0.4.1", path = "amadeus-parquet", optional = true }
 amadeus-postgres = { version = "=0.4.1", path = "amadeus-postgres", optional = true }
 amadeus-serde = { version = "=0.4.1", path = "amadeus-serde", optional = true }
@@ -49,7 +52,7 @@ async-channel = "1.1"
 bincode = { version = "1.3", optional = true }
 constellation-rs = { version = "0.2.0-alpha.2", default-features = false, optional = true }
 derive-new = "0.5"
-event-listener = "=2.3.1" # https://github.com/stjepang/event-listener/issues/9
+event-listener = "2.3.3"
 futures = "0.3"
 num_cpus = "1.13"
 pin-project = "0.4"
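Since `amadeus-ml` is an optional dependency gated behind the new `ml` feature, downstream crates pick it up explicitly. A minimal sketch of a consumer's Cargo.toml (the exact version requirement here is illustrative):

```toml
[dependencies]
# "ml" enables the amadeus-ml module; "ml-openblas" would additionally
# build and statically link OpenBLAS for faster linear algebra.
amadeus = { version = "=0.4.1", features = ["ml"] }
```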
46 changes: 46 additions & 0 deletions amadeus-ml/Cargo.toml
@@ -0,0 +1,46 @@
[package]
name = "amadeus-ml"
version = "0.4.1"
license = "Apache-2.0"
authors = ["Alec Mocatta <alec@mocatta.net>", "Maciej Kula <maciej.kula@gmail.com>"]
categories = ["data-structures", "algorithms", "science"]
keywords = ["streaming-algorithm", "probabilistic", "sketch", "data-structure", "hyperloglog"]
description = """
A low-overhead, define-by-run autodifferentiation library.
"""
repository = "https://github.com/constellation-rs/amadeus"
homepage = "https://github.com/constellation-rs/amadeus"
documentation = "https://docs.rs/amadeus"
readme = "README.md"
edition = "2018"

[badges]
azure-devops = { project = "alecmocatta/amadeus", pipeline = "tests", build = "26" }
maintenance = { status = "actively-developed" }

[features]
fast-math = []
openblas = ["ndarray/blas", "blas-src/openblas", "openblas-src/static"]

[dependencies]
approx = "0.3"
blas-src = { version = "0.2", optional = true }
hibitset = "0.6"
itertools = "0.9"
ndarray = { version = "0.13", features = ["approx", "serde"] }
openblas-src = { version = "0.6", default-features = false, optional = true }
rand = { version = "0.7", features = ["serde1"] }
rand_distr = "0.2"
serde = { version = "1", features = ["derive", "rc"] }
smallvec = { version = "1.4", features = ["serde"] }

[build-dependencies]
rustversion = "1.0"

[dev-dependencies]
criterion = "0.3"
rayon = "1.3"

[[bench]]
name = "benchmark"
harness = false
10 changes: 10 additions & 0 deletions amadeus-ml/NOTICE.txt
@@ -0,0 +1,10 @@
This product includes software from the wyrm project (MIT)
https://github.com/maciejkula/wyrm

Copyright 2017 Maciej Kula

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
241 changes: 241 additions & 0 deletions amadeus-ml/benches/benchmark.rs
@@ -0,0 +1,241 @@
#![allow(clippy::many_single_char_names)]

use criterion::{criterion_group, criterion_main, Criterion};
use rayon::prelude::*;
use std::sync::Arc;

use amadeus_ml::{
    nn::{lstm, xavier_normal}, optim::{Optimizer, SGD}, DataInput, HogwildParameter, ParameterNode,
};

fn bench_node_reuse(c: &mut Criterion) {
    c.bench_function("node_reuse", |b| {
        let dim = 128;

        let x = ParameterNode::new(xavier_normal(1, dim));
        let y = ParameterNode::new(xavier_normal(dim, 10));
        let v = x.dot(&y);
        // The same node appears four times in this expression; with pass
        // marking it should be evaluated only once per forward pass.
        let z = v.clone() + v.clone() + v.clone() + v;

        b.iter(|| {
            z.forward();
            z.zero_gradient();
        })
    });
}

fn bench_matrix_multiply(c: &mut Criterion) {
    c.bench_function("bench_matrix_multiply", |b| {
        let dim = 64;
        let num_epochs = 20;

        // Parameters are shared Hogwild-style: each worker thread builds its
        // own graph over the same underlying storage.
        let x_data = Arc::new(HogwildParameter::new(xavier_normal(1, dim)));
        let y_data = Arc::new(HogwildParameter::new(xavier_normal(dim, 10)));

        b.iter(|| {
            (0..rayon::current_num_threads())
                .into_par_iter()
                .for_each(|_| {
                    let x = ParameterNode::shared(x_data.clone());
                    let y = ParameterNode::shared(y_data.clone());

                    let v = x.dot(&y);

                    for _ in 0..num_epochs {
                        v.forward();
                        v.zero_gradient();
                    }
                });
        })
    });
}

// fn bench_sofmax_exp_sum(b: &mut Criterion) {
// c.bench_function("bench_softmax_exp_sum", |b| {
// let x = vec![1.0; 32];
// let max = 1.0;

// b.iter(|| x.iter().map(|&x| amadeus_ml::exp(x - max)).sum::<f32>().ln())
// })
// }

// #[bench]
// fn bench_sofmax_exp_sum_unrolled(b: &mut Criterion) {
// let x = vec![1.0; 32];
// let max = 1.0;

// b.iter(|| softmax_exp_sum(&x, max).ln())
// }

// fn bench_exp(b: &mut Criterion) {
// let x: Vec<f32> = vec![1.0; 32];

// let mut v = 0.0;

// b.iter(|| x.iter().for_each(|&y| v += y.exp()));
// }

// fn bench_fastexp(b: &mut Criterion) {
// let x: Vec<f32> = vec![1.0; 32];

// let mut v = 0.0;

// b.iter(|| x.iter().for_each(|&y| v += fastexp(y)));
// }

// fn bench_dot(b: &mut Criterion) {
// let xs = vec![0.0; 256];
// let ys = vec![0.0; 256];

// b.iter(|| dot(&xs[..], &ys[..]));
// }

// fn bench_unrolled_dot(b: &mut Criterion) {
// let xs = vec![0.0; 256];
// let ys = vec![0.0; 256];

// b.iter(|| unrolled_dot(&xs[..], &ys[..]));
// }

// fn bench_simd_dot(b: &mut Criterion) {
// let xs = vec![0.0; 256];
// let ys = vec![0.0; 256];

// b.iter(|| simd_dot(&xs[..], &ys[..]));
// }

// fn bench_array_scaled_assign(b: &mut Criterion) {
// let mut xs = random_matrix(256, 1);
// let ys = random_matrix(256, 1);

// b.iter(|| array_scaled_assign(&mut xs, &ys, 3.5));
// }

// fn bench_slice_scaled_assign(b: &mut Criterion) {
// let mut xs = random_matrix(256, 1);
// let ys = random_matrix(256, 1);

// b.iter(|| scaled_assign(&mut xs, &ys, 3.5));
// }

// fn bench_array_assign(b: &mut Criterion) {
// let mut xs = random_matrix(256, 1);
// let ys = random_matrix(256, 1);

// b.iter(|| array_assign(&mut xs, &ys));
// }

// fn bench_slice_assign(b: &mut Criterion) {
// let mut xs = random_matrix(256, 1);
// let ys = random_matrix(256, 1);

// b.iter(|| assign(&mut xs, &ys));
// }

// fn dot_node_specializations_mm(b: &mut Criterion) {
// let x = random_matrix(64, 64);
// let y = random_matrix(64, 64);
// let mut z = random_matrix(64, 64);

// b.iter(|| mat_mul(1.0, &x, &y, 0.0, &mut z));
// }

// fn dot_node_general_vm(b: &mut Criterion) {
// let x = random_matrix(1, 64);
// let y = random_matrix(64, 64);
// let mut z = random_matrix(1, 64);

// b.iter(|| general_mat_mul(1.0, &x, &y, 0.0, &mut z));
// }

// fn dot_node_specializations_vm(b: &mut Criterion) {
// let x = random_matrix(1, 64);
// let y = random_matrix(64, 64);
// let mut z = random_matrix(1, 64);

// b.iter(|| mat_mul(1.0, &x, &y, 0.0, &mut z));
// }

// fn dot_node_specializations_mv(b: &mut Criterion) {
// let x = random_matrix(64, 64);
// let y = random_matrix(64, 1);
// let mut z = random_matrix(64, 1);

// b.iter(|| mat_mul(1.0, &x, &y, 0.0, &mut z));
// }

// fn dot_node_general_mv(b: &mut Criterion) {
// let x = random_matrix(64, 64);
// let y = random_matrix(64, 1);
// let mut z = random_matrix(64, 1);

// b.iter(|| general_mat_mul(1.0, &x, &y, 0.0, &mut z));
// }

/// Read the first `num` digits of pi from the text file bundled with the crate.
fn pi_digits(num: usize) -> Vec<usize> {
    let pi_str = include_str!("../src/nn/pi.txt");
    pi_str
        .chars()
        .filter_map(|x| x.to_digit(10))
        .map(|x| x as usize)
        .take(num)
        .collect()
}

fn bench_lstm(c: &mut Criterion) {
    c.bench_function("bench_lstm", |b| {
        let sequence_length = 4;
        let num_digits = 10;
        let input_dim = 16;
        let hidden_dim = 32;

        let lstm_params = lstm::Parameters::new(input_dim, hidden_dim, &mut rand::thread_rng());
        let lstm = lstm_params.build();

        let final_layer = amadeus_ml::ParameterNode::new(xavier_normal(hidden_dim, num_digits));
        let embeddings = amadeus_ml::ParameterNode::new(xavier_normal(num_digits, input_dim));
        let y = amadeus_ml::IndexInputNode::new(&[0]);

        let inputs: Vec<_> = (0..sequence_length)
            .map(|_| amadeus_ml::IndexInputNode::new(&[0]))
            .collect();
        let embeddings: Vec<_> = inputs
            .iter()
            .map(|input| embeddings.index(&input))
            .collect();

        let hidden_states = lstm.forward(&embeddings);
        let hidden = hidden_states.last().unwrap();

        let prediction = hidden.dot(&final_layer);
        let mut loss = amadeus_ml::nn::losses::sparse_categorical_crossentropy(&prediction, &y);
        let optimizer = SGD::new();

        let digits = pi_digits(100);

        b.iter(|| {
            // Slide a window over the digits: the first `sequence_length`
            // digits are the inputs, the digit after them is the target.
            for i in 0..(digits.len() - sequence_length - 1) {
                let digit_chunk = &digits[i..(i + sequence_length + 1)];
                if digit_chunk.len() < sequence_length + 1 {
                    break;
                }

                for (&digit, input) in digit_chunk[..digit_chunk.len() - 1].iter().zip(&inputs) {
                    input.set_value(digit);
                }

                let target_digit = *digit_chunk.last().unwrap();
                y.set_value(target_digit);

                loss.forward();
                loss.backward(1.0);

                optimizer.step(loss.parameters());
                loss.zero_gradient();
            }
        })
    });
}

criterion_group!(benches, bench_node_reuse, bench_matrix_multiply, bench_lstm);
criterion_main!(benches);
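The LSTM benchmark above doubles as a usage example for the ported API. As a distilled sketch, here is the same forward/backward/step/zero cycle on the smallest possible graph, using only calls that appear in these benchmarks (the dimensions and iteration count are arbitrary):

```rust
use amadeus_ml::{nn::xavier_normal, optim::{Optimizer, SGD}, ParameterNode};

fn main() {
    // Build the graph by writing the expression: a (1 x 16) by (16 x 1) product.
    let x = ParameterNode::new(xavier_normal(1, 16));
    let y = ParameterNode::new(xavier_normal(16, 1));
    let mut loss = x.dot(&y);

    let optimizer = SGD::new();
    for _ in 0..10 {
        loss.forward();                    // evaluate the graph
        loss.backward(1.0);                // backpropagate from the root
        optimizer.step(loss.parameters()); // apply accumulated gradients
        loss.zero_gradient();              // reset accumulators for the next pass
    }
}
```

Because the library is define-by-run, the expression `x.dot(&y)` is itself the model; there is no separate graph-compilation step.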