
Refactoring #22

Merged
merged 25 commits into from
Mar 7, 2016

Conversation

@dfdx (Owner) commented Feb 29, 2016

Main changes:

  1. The fitting process is now split into two clear parts: gradient calculation and its application to the parameters. By default, the functions `gradient_classic()` and `update_classic!()` are used for these.
  2. The classic updater is split into a basic `update_weights!` and a number of gradient-modifying functions, including ones for weight decay (L1 and L2), sparsity, learning rate, etc. All of these share the same signature, so they can easily be combined in arbitrary ways.
  3. Everything is configurable. Most functions take a "context" (`ctx`), a dictionary holding configuration and buffers. Instead of passing 3-5 options plus buffers to each function, they are now all bundled into the context.
  4. In addition to configurable gradient and updater functions, the user can now provide (via the context) a scorer for measuring model "fitness" (pseudo-likelihood by default) and a reporter for reporting progress during training (by default `TextReporter`, which simply prints the current epoch, score and elapsed time).
  5. The package now supports both Float64 and Float32. Support for other real data types can be added, provided that the corresponding BLAS functions are defined for them.
  6. The API is mostly unchanged, so little to no effort should be needed to switch to the new version.
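The workflow described above can be sketched roughly as follows. `gradient_classic`, `update_classic!` and `TextReporter` are named in this PR; the context keys, `pseudo_likelihood`, and the `fit` call shape are illustrative assumptions, not the package's exact API:

```julia
# Hedged sketch of the context-driven fitting API (key names are assumptions).
ctx = Dict{Any,Any}(
    :gradient => gradient_classic,   # how the gradient is computed
    :update   => update_classic!,    # how it is applied to the parameters
    :scorer   => pseudo_likelihood,  # model "fitness" measure (the default)
    :reporter => TextReporter(),     # prints current epoch, score, elapsed time
    :lr       => 0.1,                # plus any options/buffers the functions need
)
fit(rbm, X, ctx)
```

Because every gradient modifier shares one signature, composing, say, L2 decay with a sparsity penalty is just a matter of calling them in sequence on the same gradient buffer.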

Two things were left out of this refactoring. One is the EMF approximation by @sphinxteam, which seems like a perfect option for gradient calculation. The corresponding paper has a little too much physics for me, but if the team finds a way to bring their work into this repository, it will be a great advantage for the whole Julia stats/deep-learning community.

Another thing I unsuccessfully tried to borrow is the visual monitor from the same fork. Unfortunately, in its current state it is closely bound to the TAP free energy, which makes it quite hard to use as is. Still, it's quite possible I will include something similar later.

Comments and criticism are highly welcome.

CC: @Rory-Finnegan @eric-tramel

@dfdx mentioned this pull request Feb 29, 2016

function split_vis(crbm::ConditionalRBM, vis::Mat{Float64})
steps=5, sigma=0.01)
ConditionalRBM(Float64, Bernoulli, Bernoulli, n_vis, n_hid, steps;
Contributor:

I think you want this to be `ConditionalRBM(Float64, V, H, n_vis, n_hid, ...`? If so, I can change that when I add my generalization changes.

Owner Author:

Correct, thanks!

rofinn and others added 2 commits February 29, 2016 21:26
Rather than assuming the conditioned patterns will be the visible units at previous time steps, we just take a size to condition on and provide a constructor that takes the number of steps as a feature.

With the above change the `split_vis` method has changed, and all references to `history` or `hist` have been changed to `cond`.

Also fixed an issue on line 51, which was passing `Bernoulli` instead of the `V` and `H` type parameters to the more general constructor.
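The reworked `split_vis` described in this commit message can be sketched as below; the field names (`n_vis`) and exact slicing are assumptions for illustration, not the package's actual code:

```julia
# Illustrative sketch: the conditional RBM stores a conditioning size rather
# than assuming the condition is the visible units at previous time steps,
# so the input matrix is simply split at that boundary.
function split_vis(crbm, X::AbstractMatrix)
    vis  = X[1:crbm.n_vis, :]         # visible part
    cond = X[(crbm.n_vis + 1):end, :] # conditioning part (formerly `hist`)
    return vis, cond
end
```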
end
return samples
end


- function sample_hiddens{V,H}(rbm::AbstractRBM{V, H}, vis::Mat{Float64})
+ function sample_hiddens{T,V,H}(rbm::AbstractRBM{T,V,H}, vis::Mat{T})
Contributor:

Might want to accept a `vis` matrix of type `F<:AbstractFloat` and do a `convert(Array{T}, vis)` if `F != T`.

Owner Author:

If we agree on implicit conversion, I don't see a reason to limit `T` to be an instance of `AbstractFloat` - one may be interested in passing a matrix of `Int`s or even `Bool`s, which is very natural for a Bernoulli RBM.
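The implicit conversion being discussed could look roughly like this; `ensure_type` is a hypothetical helper name, not part of the package:

```julia
# Sketch of implicit conversion: accept any element type and convert to the
# RBM's storage type T. This also admits Int or Bool matrices, which are
# natural inputs for a Bernoulli RBM.
function ensure_type{T}(::Type{T}, vis::AbstractMatrix)
    eltype(vis) == T ? vis : convert(Array{T}, vis)
end

v_bool = [true false; false true]
v64 = ensure_type(Float64, v_bool)  # 2x2 Array{Float64,2}
```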

dfdx and others added 6 commits March 4, 2016 02:10

Fixed sparsity calculation for conditional RBM and several precision issues:

1. In the sparsity calculation for conditional RBMs, `hid_means` takes
   just the visible input provided by `split_vis(rbm, X)`.
2. The `gemm!` calls in `gradient_classic` for the conditional weights
   weren't using the appropriate precision (i.e. Float32 with sparsity).
3. The `free_energy` function often produces NaNs due to `log(0)`
   or lack of precision.
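One common guard against the `log(0)` NaNs mentioned in point 3 is to clamp the argument away from zero; this is a generic sketch, not the fix actually committed, and `safe_log` is a hypothetical name:

```julia
# Illustrative sketch: clamp log's argument to the smallest representable
# step for the element type T, so probabilities of exactly 0.0 yield a large
# negative value instead of -Inf (and downstream NaNs from Inf - Inf).
safe_log{T<:AbstractFloat}(x::T) = log(max(x, eps(T)))

safe_log(0.0)     # large negative finite number rather than -Inf
safe_log(0.5f0)   # unchanged for ordinary inputs, works for Float32 too
```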
dfdx added a commit that referenced this pull request Mar 7, 2016
@dfdx merged commit 2dc5d15 into master Mar 7, 2016
@dfdx deleted the refactoring branch June 17, 2017