Conditional RBMs #9

rofinn · 2015-08-06T15:40:23Z

CRBMs seem like a simple addition to allow the Boltzmann.jl package to work on temporal datasets. From the paper it looks like we'd just need to add support for using the visible vectors from the previous timestep(s) as additional biases. Would anyone else be interested?

http://www.cs.toronto.edu/~fritz/absps/fcrbm_icml.pdf
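
For readers following along, here is a minimal sketch of the "history as additional biases" idea, written in plain Python for illustration only; it is not Boltzmann.jl code. The names `A`, `B`, and `dynamic_biases` are hypothetical. It assumes the basic (unfactored) CRBM formulation, where the concatenated history vector is mapped through two extra weight matrices and added to the static visible and hidden biases.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def dynamic_biases(b_vis, c_hid, A, B, history):
    """Return history-dependent (visible, hidden) biases.

    b_vis, c_hid: static biases of a plain RBM.
    A: n_visible x n_history matrix (hypothetical name), history -> visible.
    B: n_hidden  x n_history matrix (hypothetical name), history -> hidden.
    history: the previous visible vectors concatenated into one flat list.
    """
    vis = [b + d for b, d in zip(b_vis, matvec(A, history))]
    hid = [c + d for c, d in zip(c_hid, matvec(B, history))]
    return vis, hid


history = [1.0, 2.0, 3.0]
A = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0]]
B = [[0.5, 0.5, 0.5]]
vis_bias, hid_bias = dynamic_biases([0.0, 1.0], [0.5], A, B, history)
```

With these biases in hand, the usual RBM sampling and contrastive-divergence steps could be reused unchanged for each timestep, which is presumably why the addition looks simple.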

dfdx · 2015-08-06T21:44:24Z

Sounds interesting. I'll take a look at it on the weekend.

dfdx · 2015-08-10T10:57:55Z

Seems like adding CRBM will require quite some effort - at least the fitting procedure will need to be updated to process items in a way where order matters. I started some refactoring which will make adding temporal / conditional models easier.

But the big question is about the proper interface. I imagine the fit function for CRBM having the same signature as for other RBMs, but processing the data matrix column by column, assuming that each next column is an observation of the same phenomenon at time t+1. Does that fit your use case? If not, is there a better way to design the fit function?

rofinn · 2015-11-25T22:43:56Z

Sorry for the very late response :( Yeah, that seems fine for now. I might still want to support batched fitting in the future, but for now I'm just interested in getting something simple working. I should have some time to work on this over the next week or two and I'll post back when I have something (hopefully not too buggy as I'm still kind of new to RBMs).

dfdx · 2015-11-26T07:47:02Z

Thanks for coming back. Please feel free to involve me - I don't know CRBMs very well, but I might be able to help with general RBM concepts.

rofinn · 2015-11-28T00:18:52Z

Alright, I've got some CRBM code working here if anyone wants to take a look. The current implementation has a lot of duplicate code with rbm.jl that could be refactored.

I opted for the same input data format as the base RBM and just have each column contain all visible patterns from v_t to v_t-n concatenated together. When necessary I just use sub to separate v_t from the history (v_t-1 to v_t-n).
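
As an illustration of this column layout, here is a small Python sketch (not the actual conditional.jl code; `make_column` and `split_column` are hypothetical names). It assumes v_t comes first in the column, followed by v_t-1 down to v_t-n. Note that Python list slices copy the data, whereas Julia's sub returns a view into the original array.

```python
def make_column(series, t, n_history):
    """Concatenate v_t, v_{t-1}, ..., v_{t-n} into one flat column.

    series: list of visible vectors ordered by time.
    """
    if t < n_history:
        raise ValueError("not enough history before time t")
    column = []
    for lag in range(n_history + 1):
        column.extend(series[t - lag])
    return column


def split_column(column, n_visible):
    """Separate the current pattern v_t from the concatenated history."""
    return column[:n_visible], column[n_visible:]


series = [[0, 1], [1, 0], [1, 1]]          # three timesteps, two visible units
column = make_column(series, t=2, n_history=2)
v_t, history = split_column(column, n_visible=2)
```

Keeping everything in a single column means the data matrix has the same shape conventions as for a plain RBM, at the cost of storing each pattern once per column it appears in.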

I was considering adding a predict method that takes some history and generates v_t. Thoughts?

dfdx · 2015-11-28T22:30:42Z

Looks great!

Yes, predict() would be a nice addition. Just make sure to extend the StatsBase.predict() function to avoid possible conflicts.

rofinn · 2015-11-28T23:18:25Z

Will do.

rofinn · 2015-11-30T20:10:59Z

Alright, I've added predict and I don't see any glaring typos. If you want, I can either make a pull request now and we can do some refactoring of rbm.jl and conditional.jl afterwards, or I can work some refactoring into the pull request. I've just been hesitant to do too much editing of rbm.jl to avoid breaking the existing functionality.

dfdx · 2015-11-30T21:05:21Z

I think it's better to make a pull request now and do the refactoring some time later. The reason I don't want to hurry with refactoring is that I'm working on a major and probably breaking change in the API. In short, I'm going to add GPU support to the default RBM, which will leave both RBMs for sparse data and Conditional RBMs somewhat out of context, so we will need to rethink the whole RBM hierarchy. Thus we can save some time and delay refactoring until the new design is clear.

rofinn · 2015-11-30T21:06:36Z

Alright, sounds good.

rofinn · 2016-01-09T22:54:37Z

Hi, I was just wondering if you've had any time to think about the refactoring? I'd be happy to help if you have a rough idea of what you'd like the API to look like. I have a few more features I'd like to add (like a sparsity constraint), but I imagine they should probably wait.

dfdx · 2016-01-09T23:44:16Z

Hi. I'm still experimenting with possible options for refactoring, but the good news is that most likely it won't be as global as I expected and it should be safe to merge code for normal and conditional RBMs now. So if you have time, please go ahead!

Also, it would be really nice if you added a sparsity constraint - this is one of the features I always wanted, but never had the time/need to implement myself.

rofinn · 2016-01-11T08:04:52Z

Alright, as far as I can tell I've got basic sparsity and weight decay penalties working here. I'll update conditional.jl and submit a PR, if you're okay with how I've implemented them.

eric-tramel · 2016-01-11T13:49:02Z

Hi @Rory-Finnegan, @dfdx, thanks for continuing to work on the Boltzmann.jl package. As you know, our fork (sphinxteam/Boltzmann.jl) has made a number of large changes, refactoring, etc. Please feel free to adopt any changes you want into this main repository.

We were working on a number of different research avenues, so we didn't really stop to take the time to propose any pull requests into dfdx/Boltzmann.jl. Specifically, take a look at the monitoring functions we've added; we really found these to be useful for diagnosis and tuning in practice. We still have some "new" things, specifically the EMF learning from our NIPS paper, but the P/CD implementations should be good, too.

rofinn · 2016-01-11T19:26:05Z

@eric-tramel Cool, I stopped by your poster at NIPS, but didn't realize you were using Julia for it. I think adding some of the monitoring code (maybe with Gadfly/Plots) and integrating the EMF learning method would help make things more research friendly and help drive some refactoring.

Looking through your code, I think I'd like to refactor Boltzmann to have a cleaner way of handling different common penalties and regularizations. For example, there are different approaches for doing weight decay, sparsity, etc., which should probably be decoupled from the RBM type and the weight_update! method. Ideally, I'd like these penalties to be closures (possibly user defined) that you can dispatch on, but since dispatching on custom function types probably won't be added to Julia till 0.6, maybe just a type with a single required method...?
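
To make the decoupling idea concrete, here is a rough Python sketch (for illustration only, not a proposal for the actual Boltzmann.jl API). Each penalty is a closure that maps a weight to its gradient contribution, and the update step just takes a list of them. The names `l2_weight_decay`, `l1_weight_decay`, and `update_weights` are hypothetical.

```python
def l2_weight_decay(rate):
    """Penalty closure for (rate / 2) * w**2, whose gradient is rate * w."""
    def penalty(w):
        return rate * w
    return penalty


def l1_weight_decay(rate):
    """Penalty closure for rate * |w|, whose (sub)gradient is rate * sign(w)."""
    def penalty(w):
        sign = 1 if w > 0 else -1 if w < 0 else 0
        return rate * sign
    return penalty


def update_weights(W, grad, lr, penalties=()):
    """One gradient-ascent step; every penalty gradient is subtracted.

    W, grad: matrices as lists of rows with matching shapes.
    penalties: any callables w -> penalty gradient (possibly user defined).
    """
    return [
        [w + lr * (g - sum(p(w) for p in penalties)) for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(W, grad)
    ]


W = [[1.0, -2.0]]
grad = [[0.5, 0.5]]
W_new = update_weights(W, grad, lr=0.1, penalties=[l2_weight_decay(0.5)])
```

The update function knows nothing about which penalties exist, so new ones can be added without touching the RBM type. A type with a single required method would play the same role as the closures here.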

dfdx · 2016-01-11T20:13:40Z

I think we can decouple these things without Julia's dispatching and carefully measure performance. Basically, I don't expect function invocation to be a bottleneck - in most cases we should worry about matrix multiplication instead.

I will play around with possible solutions today and see how to put it all together.

dfdx · 2016-01-11T22:13:46Z

@Rory-Finnegan All in all, your changes (sparsity and weight decay) look fine to me. Could you please make a pull request so we can close this issue and discuss possible refactoring separately?

rofinn · 2016-01-12T05:39:23Z

Sure. PR #20 opened.

dfdx mentioned this issue Jan 11, 2016

Refactoring #19

Closed

rofinn closed this as completed Jan 12, 2016
