Conditional RBMs #9

Closed
rofinn opened this issue Aug 6, 2015 · 18 comments
Comments

@rofinn
Contributor

rofinn commented Aug 6, 2015

CRBMs seem like a simple addition to allow the Boltzmann.jl package to work on temporal datasets. From the paper it looks like we'd just need to add support for using the visible vectors from the previous timestep(s) as additional biases. Would anyone else be interested?

http://www.cs.toronto.edu/~fritz/absps/fcrbm_icml.pdf
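
To make the "previous timesteps as additional biases" idea concrete, here is a minimal sketch in plain Python (not Julia, and not the Boltzmann.jl API). It assumes the usual CRBM formulation in which the history vector shifts the visible and hidden biases through two extra weight matrices; the names `A`, `B`, `dynamic_biases`, and `hidden_probs` are hypothetical and chosen only for illustration.

```python
import math


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def dynamic_biases(a, b, A, B, history):
    """Shift the static biases by a linear function of the history.

    a: static visible bias (n_vis), b: static hidden bias (n_hid)
    A: history-to-visible weights (n_vis x n_hist)
    B: history-to-hidden weights (n_hid x n_hist)
    history: concatenated previous visible vectors (n_hist)
    """
    a_dyn = [ai + s for ai, s in zip(a, matvec(A, history))]
    b_dyn = [bj + s for bj, s in zip(b, matvec(B, history))]
    return a_dyn, b_dyn


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(W, b_dyn, v):
    """P(h_j = 1 | v, history) for binary hidden units; W is n_hid x n_vis."""
    return [sigmoid(bj + s) for bj, s in zip(b_dyn, matvec(W, v))]
```

With the dynamic biases in place, the rest of the RBM machinery (sampling and contrastive divergence) can stay the same, which is why the addition looks small on paper.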

@dfdx
Owner

dfdx commented Aug 6, 2015

Sounds interesting. I'll take a look at it on the weekend.

@dfdx
Owner

dfdx commented Aug 10, 2015

Adding CRBMs looks like it will take some effort: at the very least, the fitting procedure will need to be updated to process items in a way where order matters. I've started some refactoring that will make adding temporal / conditional models easier.

But the big question is the proper interface. I imagine the fit function for a CRBM having the same signature as for other RBMs, but processing the data matrix column by column, assuming that each successive column is an observation of the same phenomenon at time t+1. Does that match your use case? If not, is there a better way to design the fit function?

@rofinn
Contributor Author

rofinn commented Nov 25, 2015

Sorry for the very late response :( Yeah, that seems fine for now. I might still want to support batched fitting in the future, but for now I'm just interested in getting something simple working. I should have some time to work on this over the next week or two, and I'll post back when I have something (hopefully not too buggy, as I'm still kind of new to RBMs).

@dfdx
Owner

dfdx commented Nov 26, 2015

Thanks for coming back. Please feel free to involve me. I don't know CRBMs very well, but I might be able to help with general RBM concepts.

@rofinn
Contributor Author

rofinn commented Nov 28, 2015

Alright, I've got some CRBM code working here if anyone wants to take a look. The current implementation has a lot of duplicate code with rbm.jl that could be refactored.

I opted for the same input data format as the base RBM: each column contains all visible patterns from v_t to v_{t-n} concatenated together. When necessary, I just use sub to separate v_t from the history (v_{t-1} to v_{t-n}).

I was considering adding a predict method that takes some history and generates v_t. Thoughts?
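
The column layout described above can be sketched as follows. This is a plain-Python illustration, not the code from the linked implementation; the helper names `stack_history` and `split_column` are hypothetical, and the ordering (current frame first, then the history from most recent to oldest) is an assumption based on the description.

```python
def stack_history(frames, n_history):
    """Build one training column per timestep that has a full history.

    frames: list of visible vectors in time order.
    Each column is v_t followed by v_{t-1}, ..., v_{t-n_history}.
    """
    columns = []
    for t in range(n_history, len(frames)):
        column = []
        for lag in range(n_history + 1):
            column.extend(frames[t - lag])
        columns.append(column)
    return columns


def split_column(column, n_visible):
    """Separate the current frame v_t from the concatenated history."""
    return column[:n_visible], column[n_visible:]


# Usage: four scalar frames with a history of two steps.
columns = stack_history([[0], [1], [2], [3]], n_history=2)
current, history = split_column(columns[-1], n_visible=1)
```

Because every column carries its own history, the columns can be shuffled or batched like ordinary RBM training data, at the cost of storing each frame several times.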

@dfdx
Owner

dfdx commented Nov 28, 2015

Looks great!

Yes, predict() would be a nice addition. Just make sure to extend the StatsBase.predict() function to avoid possible conflicts.

@rofinn
Contributor Author

rofinn commented Nov 28, 2015

Will do.

@rofinn
Contributor Author

rofinn commented Nov 30, 2015

Alright, I've added predict and I don't see any glaring typos. If you want, I can either make a pull request now and we can refactor rbm.jl and conditional.jl afterwards, or I can work some refactoring into the pull request. I've just been hesitant to do too much editing of rbm.jl, to avoid breaking the existing functionality.

@dfdx
Owner

dfdx commented Nov 30, 2015

I think it's better to make the pull request now and do the refactoring some time later. The reason I don't want to rush the refactoring is that I'm working on a major, and probably breaking, change to the API. In short, I'm going to add GPU support to the default RBM, which will make both RBMs for sparse data and Conditional RBMs somewhat out of context, so we will need to rethink the whole RBM hierarchy. Thus we can save some time by delaying the refactoring until the new design is clear.

@rofinn
Contributor Author

rofinn commented Nov 30, 2015

Alright, sounds good.

@rofinn
Contributor Author

rofinn commented Jan 9, 2016

Hi, I was just wondering if you've had any time to think about the refactoring? I'd be happy to help if you have a rough idea of what you'd like the API to look like. I have a few more features I'd like to add (like a sparsity constraint), but I imagine they should probably wait.

@dfdx
Owner

dfdx commented Jan 9, 2016

Hi. I'm still experimenting with possible options for refactoring, but the good news is that it most likely won't be as sweeping as I expected, and it should be safe to merge code for normal and conditional RBMs now. So if you have time, please go ahead!

Also, it would be really nice if you added a sparsity constraint. This is one of the features I've always wanted but never had the time or need to implement myself.

@rofinn
Contributor Author

rofinn commented Jan 11, 2016

Alright, as far as I can tell I've got basic sparsity and weight decay penalties working here. I'll update conditional.jl and submit a PR, if you're okay with how I've implemented them.
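
For readers unfamiliar with these two penalties, here is a minimal sketch of one common formulation, written in plain Python. It is not the implementation linked above: it assumes L2 weight decay applied to the weight gradient and a sparsity term that nudges the hidden biases so the mean hidden activation moves toward a target value. The function names and the `decay`, `target`, and `cost` parameters are hypothetical.

```python
def apply_weight_decay(grad_W, W, decay):
    """L2 weight decay: subtract decay * W from the weight gradient."""
    return [
        [g - decay * w for g, w in zip(g_row, w_row)]
        for g_row, w_row in zip(grad_W, W)
    ]


def sparsity_bias_gradient(hidden_probs_batch, target, cost):
    """Push each hidden unit's mean activation toward `target`.

    hidden_probs_batch: one list of hidden probabilities per sample.
    Returns a per-unit term to add to the hidden-bias gradient.
    """
    n_samples = len(hidden_probs_batch)
    n_hidden = len(hidden_probs_batch[0])
    mean_activation = [
        sum(sample[j] for sample in hidden_probs_batch) / n_samples
        for j in range(n_hidden)
    ]
    return [cost * (target - m) for m in mean_activation]
```

Both terms are simply added to the contrastive-divergence gradient before the learning-rate step, so they can be switched on or off independently.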

@eric-tramel
Collaborator

Hi @Rory-Finnegan, @dfdx, thanks for continuing to work on the Boltzmann.jl package. As you know, our fork (sphinxteam/Boltzmann.jl) has made a number of large changes, refactoring, etc. Please feel free to adopt any changes you want into this main repository.

We were working on a number of different research avenues, so we didn't really stop to take the time to propose any pull requests into dfdx/Boltzmann.jl. Specifically, take a look at the monitoring functions we've added; we really found these to be useful for diagnosis and tuning in practice. We still have some "new" things, specifically the EMF learning from our NIPS paper, but the P/CD implementations should be good, too.

@rofinn
Contributor Author

rofinn commented Jan 11, 2016

@eric-tramel Cool, I stopped by your poster at NIPS, but didn't realize you were using Julia for it. I think adding some of the monitoring code (maybe with Gadfly/Plots) and integrating the EMF learning method would help make things more research friendly and help drive some refactoring.

Looking through your code, I think I'd like to refactor Boltzmann to have a cleaner way of handling different common penalties and regularizations. For example, there are different approaches to weight decay, sparsity, etc., which should probably be decoupled from the RBM type and the weight_update! method. Ideally, I'd like these penalties to be closures (possibly user defined) that you can dispatch on, but since dispatching on custom function types probably won't be added to Julia till 0.6, maybe just a type with a single required method...?
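
A rough sketch of that decoupling, in plain Python rather than Julia: each penalty is a callable that maps the current weights to a gradient contribution, and the update step only knows that it receives a list of such callables. Both a small class with one required method and a closure satisfy that contract. `L2Decay`, `l1_decay`, and `weight_update` are hypothetical names, not part of Boltzmann.jl.

```python
class L2Decay:
    """Penalty as a small type with a single required method (__call__)."""

    def __init__(self, rate):
        self.rate = rate

    def __call__(self, W):
        return [[-self.rate * w for w in row] for row in W]


def l1_decay(rate):
    """Penalty as a closure over its rate; same contract as L2Decay."""

    def penalty(W):
        # (w > 0) - (w < 0) is the sign of w, with sign(0) == 0.
        return [[-rate * ((w > 0) - (w < 0)) for w in row] for row in W]

    return penalty


def weight_update(W, grad_W, lr, penalties=()):
    """Gradient step that is unaware of which penalties are in use."""
    total = [row[:] for row in grad_W]
    for penalty in penalties:
        term = penalty(W)
        total = [
            [g + p for g, p in zip(g_row, p_row)]
            for g_row, p_row in zip(total, term)
        ]
    return [
        [w + lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(W, total)
    ]


# Usage: combine a built-in penalty with a user-defined closure.
W_new = weight_update([[1.0, -2.0]], [[0.0, 0.0]], lr=1.0,
                      penalties=[L2Decay(0.5), l1_decay(0.5)])
```

The update function never inspects the penalty objects, so adding a new regularizer does not require touching the RBM type or the update code.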

@dfdx
Owner

dfdx commented Jan 11, 2016

I think we can decouple these things without Julia's dispatching and carefully measure performance. Basically, I don't expect function invocation to be a bottleneck - in most cases we should worry about matrix multiplication instead.

I will play around with possible solutions today and see how to put it all together.

@dfdx
Owner

dfdx commented Jan 11, 2016

@Rory-Finnegan All in all, your changes (sparsity and weight decay) look fine to me. Could you please make a pull request so we can close this issue and discuss possible refactoring separately?

dfdx mentioned this issue Jan 11, 2016

@rofinn
Contributor Author

rofinn commented Jan 12, 2016

Sure. PR #20 opened.

rofinn closed this as completed Jan 12, 2016