Conditional RBMs #9
Comments
Sounds interesting. I'll take a look at it on the weekend. |
Adding CRBMs looks like it will require quite some effort: at the very least, the fitting procedure will need to be updated to process items in a way that respects their order. I've started some refactoring that will make adding temporal / conditional models easier. But the big question is about the proper interface. I imagine |
Sorry for the very late response :( Yeah, that seems fine for now. I might still want to support batched fitting in the future, but for now I'm just interested in getting something simple working. I should have some time to work on this over the next week or two and I'll post back when I have something (hopefully not too buggy, as I'm still kind of new to RBMs). |
Thanks for coming back. Please feel free to involve me - I don't know CRBMs very well, but I might be able to help with general RBM concepts. |
Alright, I've got some CRBM code working here if anyone wants to take a look. The current implementation has a lot of duplicate code with rbm.jl that could be refactored. I opted for the same input data format as the base RBM and just have each column contain all visible patterns from vt to vt-n concatenated together. When necessary I just use I was considering adding a |
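The column layout described in the comment above can be illustrated with a small sketch. This is plain Python rather than the package's Julia code, and `stack_history` and `n_lags` are hypothetical names introduced only for illustration. It assumes each column holds the current frame first, followed by the older frames in order, which may not match the ordering used in the linked implementation.

```python
def stack_history(series, n_lags):
    """Build one column per timestep t >= n_lags.

    `series` is a time-ordered list of equal-length visible vectors.
    Each returned column is the concatenation
    [v_t, v_{t-1}, ..., v_{t-n_lags}] (newest frame first; this
    ordering is an assumption, not taken from the package).
    """
    if n_lags < 0:
        raise ValueError("n_lags must be non-negative")
    columns = []
    for t in range(n_lags, len(series)):
        column = []
        for lag in range(n_lags + 1):
            # Append the frame observed `lag` steps before t.
            column.extend(series[t - lag])
        columns.append(column)
    return columns


# Four 2-unit frames, conditioning on the two previous timesteps.
frames = [[0, 1], [1, 1], [1, 0], [0, 0]]
cols = stack_history(frames, n_lags=2)
```

With this layout the first `n_lags` frames never appear as the "current" frame, so a series of length T yields T - n_lags training columns.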
Looks great! Yes, |
Will do. |
Alright, I've added |
I think it's better to make a pull request now and refactor some time later. The reason I don't want to rush the refactoring is that I'm working on a major and probably breaking change in the API. In short, I'm going to add GPU support to the default RBM, which will leave both RBMs for sparse data and Conditional RBMs somewhat out of place, so we will need to rethink the whole RBM hierarchy. We can therefore save some time by delaying the refactoring until the new design is clear. |
Alright, sounds good. |
Hi, I was just wondering if you've had any time to think about the refactoring? I'd be happy to help if you have a rough idea of what you'd like the API to look like. I have a few more features I'd like to add (like a sparsity constraint), but I imagine they should probably wait. |
Hi. I'm still experimenting with possible options for refactoring, but the good news is that it most likely won't be as global as I expected, so it should be safe to merge code for normal and conditional RBMs now. So if you have time, please go ahead! It would also be really nice if you added a sparsity constraint - this is one of the features I always wanted, but never had the time/need to implement myself. |
Alright, as far as I can tell I've got basic sparsity and weight decay penalties working here. I'll update conditional.jl and submit a PR, if you're okay with how I've implemented them. |
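For readers unfamiliar with these two penalties, here is a minimal sketch of one common formulation, written in plain Python. It is not taken from the linked branch: the function names, the `decay`, `target` and `cost` parameters, and the choice of an L2 penalty and a mean-activation sparsity target are all assumptions made for illustration.

```python
def apply_weight_decay(grad_w, w, decay):
    """Return the weight gradient with an L2 penalty applied.

    Assumes gradient ascent on the log-likelihood, so the penalty
    subtracts decay * w from each gradient entry.
    """
    return [
        [g - decay * w_ij for g, w_ij in zip(grad_row, w_row)]
        for grad_row, w_row in zip(grad_w, w)
    ]


def sparsity_bias_update(hidden_probs, target, cost):
    """Return a per-hidden-unit bias adjustment.

    `hidden_probs` holds one list of hidden activation probabilities
    per training sample. Each unit is nudged toward the `target`
    mean activation, scaled by `cost`.
    """
    n_samples = len(hidden_probs)
    n_hidden = len(hidden_probs[0])
    means = [
        sum(sample[j] for sample in hidden_probs) / n_samples
        for j in range(n_hidden)
    ]
    return [cost * (target - m) for m in means]


decayed = apply_weight_decay([[1.0, 2.0]], [[10.0, -10.0]], decay=0.1)
nudge = sparsity_bias_update([[1.0, 0.5], [1.0, 0.5]], target=0.5, cost=2.0)
```

Because both functions only take and return plain arrays, they can be applied to the gradients of any RBM variant, which is one way to keep penalties separate from the model type.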
Hi @Rory-Finnegan, @dfdx , thanks for continuing to work on the Boltzmann.jl package. As you know, our fork ( sphinxteam/Boltzmann.jl ) has made a number of large changes, refactoring, etc. Please feel free to adopt any changes you want into this main repository. We were working on a number of different research avenues, so we didn't really stop to take the time to propose any pull requests into dfdx/Boltzmann.jl . Specifically, take a look at the monitoring functions we've added; we found these really useful for diagnosis and tuning in practice. We still have some "new" things, specifically the EMF learning from our NIPS paper, but the P/CD implementations should be good, too. |
@eric-tramel Cool, I stopped by your poster at NIPS, but didn't realize you were using Julia for it. I think adding some of the monitoring code (maybe with Gadfly/Plots) and integrating the EMF learning method would help make things more research friendly and help drive some refactoring. Looking through your code, I think I'd like to refactor Boltzmann.jl to have a cleaner way of handling different common penalties and regularizations. For example, there are different approaches to weight decay, sparsity, etc., which should probably be decoupled from the RBM type and the |
I think we can decouple these things without Julia's dispatching and carefully measure performance. Basically, I don't expect function invocation to be a bottleneck - in most cases we should worry about matrix multiplication instead. I will play around with possible solutions today and see how to put it all together. |
@Rory-Finnegan All in all, your changes (sparsity and weight decay) look fine to me. Could you please make a pull request so we can close this issue and discuss possible refactoring separately? |
Sure. PR #20 opened. |
CRBMs seem like a simple addition to allow the Boltzmann.jl package to work on temporal datasets. From the paper it looks like we'd just need to add support for using the visible vectors from the previous timestep(s) as additional biases. Would anyone else be interested?
http://www.cs.toronto.edu/~fritz/absps/fcrbm_icml.pdf
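As a rough illustration of "previous visible vectors as additional biases", the sketch below computes history-dependent biases and the resulting hidden activation probabilities. It is plain Python, not Boltzmann.jl code, and every name in it (`dynamic_biases`, `hidden_probs`, the matrices `A` and `B`) is hypothetical. It assumes the simple unfactored form in which the concatenated history contributes linearly to both bias vectors.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def dynamic_biases(a, b, A, B, history):
    """Return (visible_bias, hidden_bias) conditioned on `history`.

    a, b: static visible and hidden biases.
    A: n_visible x n_history weights from past frames to visible units.
    B: n_hidden x n_history weights from past frames to hidden units.
    history: concatenated visible vectors from previous timesteps.
    """
    a_dyn = [a_i + d for a_i, d in zip(a, matvec(A, history))]
    b_dyn = [b_j + d for b_j, d in zip(b, matvec(B, history))]
    return a_dyn, b_dyn


def hidden_probs(W, b_dyn, v):
    """P(h_j = 1 | v, history) = sigmoid((W v)_j + b_dyn_j)."""
    return [
        1.0 / (1.0 + math.exp(-(x + b_j)))
        for x, b_j in zip(matvec(W, v), b_dyn)
    ]


a_dyn, b_dyn = dynamic_biases(
    a=[0.0],
    b=[0.0, 1.0],
    A=[[1.0, 0.0]],
    B=[[0.0, 1.0], [1.0, 1.0]],
    history=[1.0, 2.0],
)
probs = hidden_probs(W=[[0.0], [0.0]], b_dyn=[0.0, 0.0], v=[1.0])
```

Once the dynamic biases are computed for a given timestep, the rest of the model behaves like an ordinary RBM with those biases substituted in.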