Refactoring #19
Yeah, that was very similar to what I was thinking. +1

Since I don't tend to consider fields as part of the public interface, I'd be inclined to have the interface only define required methods rather than fields, but that's just me.

This is probably a minor implementation detail, but if there were a way of making the update functions more composable while remaining relatively performant, that would be ideal. For example, there are several methods for computing sparsity and decay penalties with different parameters, and it would be nice if the update function could take these arbitrary calculations as functions. I suppose that would probably result in less efficient matrix operations, though...? I suppose another option could be to provide extra

As for the ConditionalRBM, this would cover my use cases fine. |
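The composable-update idea above could be sketched roughly as follows. This is a hypothetical illustration in Python (the actual library is Julia, and its snippets are elided here); the names `l2_decay` and `make_update` are not from the codebase, just an assumption about how penalty functions might be stacked onto a single update step.

```python
import numpy as np

def l2_decay(rate):
    """Return a penalty function applying weight decay at the given rate."""
    def penalty(W, grad):
        return grad - rate * W  # standard L2 shrinkage folded into the gradient
    return penalty

def make_update(penalties, lr=0.1):
    """Compose arbitrary penalty functions into a single update step."""
    def update(W, grad):
        for p in penalties:
            grad = p(W, grad)
        return W + lr * grad  # gradient ascent on the log-likelihood
    return update

# Usage: an update with two stacked decay penalties
update = make_update([l2_decay(0.01), l2_decay(0.001)])
W = np.ones((2, 2))
grad = np.zeros((2, 2))
W_new = update(W, grad)
```

Each penalty stays a plain function, so arbitrary sparsity or decay calculations can be mixed in without changing the trainer; the possible cost, as noted above, is losing some fused matrix-operation efficiency.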
I imagine it more like
I thought about something like this:
(There should be better syntax for setting the update function, but we can figure it out later.) |
Alright, that makes sense. Thanks. |
Hi, I just came across this paper and was wondering if the refactoring could include a way of setting the floating point precision used? |
I see no fundamental limitations for this, though the API is unclear to me now. |
What about just parameterizing the RBM and adding a precision keyword to the constructor? |
This is exactly what I'm experimenting with right now in branch
I believe internally, in non-BLAS operations on arrays of floats <

On the other hand, BLAS can operate on |
Yeah, it still might be useful to give the option of at least using Float32. I imagine the only times Float16 might be useful is for network transfers in distributed learning rules, which is probably outside of the scope here. |
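The precision-keyword idea from a few comments up might look something like the sketch below. This is a hedged Python/NumPy rendering (the real experiment is in a Julia branch whose code is elided); the constructor signature and field names are assumptions, not the library's API.

```python
import numpy as np

class RBM:
    """Minimal sketch: all parameters are stored at the requested precision."""
    def __init__(self, n_vis, n_hid, precision=np.float64, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.01, (n_hid, n_vis)).astype(precision)
        self.vbias = np.zeros(n_vis, dtype=precision)
        self.hbias = np.zeros(n_hid, dtype=precision)

rbm32 = RBM(6, 4, precision=np.float32)
rbm16 = RBM(6, 4, precision=np.float16)  # halves memory, but BLAS support is limited
```

This matches the point made above: Float32 is the practically useful option, while Float16 mostly pays off for storage or network transfer rather than compute.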
@Rory-Finnegan, @dfdx: The modifications you both describe seem like a good direction for the repository, especially the application of composable update functions; this can certainly allow for a lot of modularity in the kinds of regularizations that get applied to parameter updates.

Speaking of modularity, perhaps one should consider addressing the possibility of non-binary distributions for the visible units (and perhaps even the hidden units). @dfdx had an initial implementation of Gaussian-distributed visible units, for example. (We're working on some extensions on this point; we'll have something to say about it in the next months.) However, using general priors would require a change to the RBM structure to include fields for an arbitrary number of prior parameters at each variable on the hidden and visible side. E.g.
In the case of binary distributions, these Param fields resolve to the normal

Additionally, the gradient updates on variable distribution parameters should be handled differently from the updates of the couplings, |
I have already expanded it in

Also, I don't think we really need to change

Instead, for custom distributions we might add 2 more fields (I will follow your notation except for naming conventions) -
Note that I removed

Does it sound reasonable? |
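Since the concrete struct definitions in this exchange are elided, here is a hedged Python sketch of the shared idea: keep the usual weight and bias fields, and add two extra per-layer containers for distribution parameters (so Gaussian visible units, for instance, can carry variances). The field names `vis_params` / `hid_params` are hypothetical placeholders for whatever naming the refactoring settles on.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class GeneralRBM:
    W: np.ndarray        # couplings between visible and hidden units
    vbias: np.ndarray
    hbias: np.ndarray
    # Extra per-layer parameters for non-binary units, e.g. Gaussian variances.
    vis_params: dict = field(default_factory=dict)
    hid_params: dict = field(default_factory=dict)

rbm = GeneralRBM(
    W=np.zeros((4, 6)),
    vbias=np.zeros(6),
    hbias=np.zeros(4),
    vis_params={"sigma": np.ones(6)},  # Gaussian visible units
)
```

For plain binary units both dicts stay empty, so the structure degrades gracefully to the current RBM layout.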
@Rory-Finnegan I'm near the end of refactoring conditional RBMs now, but I'm not sure how to test if they still behave correctly. For ordinary RBMs I can take, say, MNIST and inspect fitting result visually, as well as check final pseudo-likelihood. Do you have any dataset / example that I can use for testing conditional RBMs? |
The dataset I'm currently testing on isn't publicly available, but I believe Graham Taylor used MNIST in one of his papers. The general approach was to train the CRBM to condition the clean image on a noisy version of it. For testing, you can look at the accuracy of predicting "cleaned" images given their noisy counterparts. If you want, I can write these tests. |
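The denoising test described above can be sketched as follows. This is an illustrative Python outline, not the actual test code from the eventual PR: the corruption scheme (`add_salt_pepper`) and the accuracy helper are assumptions, and the "model" here is a trivial identity stand-in just to show the scoring mechanics.

```python
import numpy as np

def add_salt_pepper(x, p, rng):
    """Flip a fraction p of binary pixels (a hypothetical corruption scheme)."""
    mask = rng.random(x.shape) < p
    return np.where(mask, 1 - x, x)

def denoise_accuracy(predict, clean, noisy, threshold=0.5):
    """Fraction of pixels the model restores correctly from the noisy input."""
    restored = (predict(noisy) > threshold).astype(clean.dtype)
    return (restored == clean).mean()

rng = np.random.default_rng(0)
clean = (rng.random((100, 784)) > 0.5).astype(np.int8)   # stand-in for MNIST
noisy = add_salt_pepper(clean, 0.1, rng)

# An identity "model" just echoes the noisy input, so its accuracy is ~0.9;
# a trained CRBM conditioning on the noisy image should score higher.
acc = denoise_accuracy(lambda v: v, clean, noisy)
```

The baseline accuracy of the identity model (one minus the corruption rate) gives a useful lower bound for the real test to beat.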
It would be perfect. Then we will be able to use such tests as a usage example as well. |
Sounds good. Is there a branch you'd like me to add that to? |
You can add it to the |
PR #21 opened with those tests. |
Great, thank you! I'm going to create a PR for the refactoring tomorrow so you can check whether the new design matches your thoughts / needs. |
I just realized that the conditional RBM could be generalized a bit better; should I make a PR to master or wait till the refactoring gets merged? The issue is that I'm currently just conditioning on previous observations, but there isn't any reason you couldn't condition on arbitrary things. |
@Rory-Finnegan Does it mean that we can use ConditionalRBM for classification then? If so, it opens quite a lot of new possibilities :) PR #22 is now open. Please check it out, and if it looks mostly good you can base your changes on it. Otherwise, feel free to make a separate PR to master. |
@dfdx In theory, yes, you could use a conditional RBM for classification; you'd just need to softmax the label predictions. I'm not sure how common that is, or whether there are papers comparing it to other common methods, but we could always test it on the MNIST dataset and see how it does. The PR looks good to me, so I'll just add my changes to it. |
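The "softmax the label predictions" step mentioned above is the only extra machinery classification needs: treat the label block of visible units as one-hot and pick the label whose unit the conditioned model activates most strongly. A small illustrative sketch (the activations here are made up; how they are produced by the CRBM is outside this snippet):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify(label_activations):
    """Pick the label whose one-hot unit gets the highest probability."""
    return softmax(label_activations).argmax(axis=-1)

# Hypothetical pre-softmax activations of a 3-label block for 2 samples
acts = np.array([[0.1, 2.0, -1.0],
                 [3.0, 0.0,  0.5]])
pred = classify(acts)
```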
PR #23 to refactoring open. |
So two things I've thought of while going through the refactoring branch are:
|
|
Randomization for batch order is here. I need one more day for flexible precision (should be trivial to implement, but I'm very short on time now). |
Awesome, thank you! I'm still getting some NaNs and Infs on the pseudo-likelihood calculation, so I'll probably have another PR once I figure out where they're coming from. |
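NaNs and Infs in a pseudo-likelihood usually trace back to computing `log(sigmoid(x))` naively: for large negative `x`, `exp(-x)` overflows and the log returns `-inf` or `nan`. A hedged sketch of the standard fix (whether this is where the values above actually originate is a guess), using the identity `log(sigmoid(x)) = -log(1 + exp(-x))`:

```python
import numpy as np

def log_sigmoid(x):
    """log(1 / (1 + exp(-x))) computed without overflow via logaddexp."""
    return -np.logaddexp(0.0, -x)

# A naive np.log(1.0 / (1.0 + np.exp(-x))) overflows for large |x|,
# yielding -inf / nan; the logaddexp form stays finite everywhere.
x = np.array([-1000.0, 0.0, 1000.0])
stable = log_sigmoid(x)
```

Summing these stable terms keeps the pseudo-likelihood finite even for saturated units.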
Regarding flexible precision, I just pushed a change that converts each batch to an array of the RBM's type in a

Meanwhile, I see that this branch is mostly OK, so I'm going to merge it into master in 24 hours and discuss new features separately. |
FYI, with the recent changes I'm getting a bunch of warnings like |
Basically no; they should be created during the first call to the corresponding method (recall that the context holds both options and buffers, and these 3 are exactly the latter). Do you need to pass them explicitly for some reason? |
I wasn't explicitly passing them in, but I figured out why I was getting the error and you weren't. I was running multiple RBMs in a row with the same context, which resulted in |
Ah, that makes sense. I think we can simply copy the input context (i.e. options only) instead of using and populating the original one. I will push the change. |
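The "copy options only" fix above might look roughly like this. It is a Python sketch of the idea rather than the actual change; the buffer key names are invented for illustration, since the real context's buffer entries aren't shown in this thread.

```python
import copy

BUFFER_KEYS = {"dW_buf", "persistent_chain", "activations"}  # hypothetical buffer names

def fresh_context(ctx):
    """Deep-copy user options; drop buffers so each model repopulates its own."""
    return {k: copy.deepcopy(v) for k, v in ctx.items() if k not in BUFFER_KEYS}

shared = {"lr": 0.1, "n_epochs": 10, "dW_buf": [0.0] * 4}
ctx1 = fresh_context(shared)
ctx2 = fresh_context(shared)
ctx1["dW_buf"] = [1.0]  # populated during the first model's fit
```

Because each fit works on its own copy, buffers sized for one RBM can no longer leak into a second RBM with different dimensions, which is exactly the failure mode described above.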
|
As recently discussed in #9, it would be nice to refactor the project to better reflect different kinds of RBMs and different ways to train them. Currently I can think of the following variations:
- RBM kinds
- gradient calculation
- weight update options
Given these requirements, I imagine a minimal RBM setup to look something like this (for all but `CLRBM` types):

where `gradient()` and `update!()` are closures initialized with all needed options. Then a single method `fit()` can use these closures to learn the RBM and provide monitoring.

Note that this doesn't restrict implementations to a single type, but instead provides a contract, i.e. it specifies the fields required to provide compatibility between different methods for learning and inference.
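Since the original Julia snippet is elided, here is a hedged Python rendering of the closure-based contract described above: option-capturing factories produce `gradient` and `update` functions, and a generic `fit` relies only on that contract. The factory names and the zero-gradient placeholder are illustrative assumptions, not the repository's API.

```python
import numpy as np

def make_gradient(cd_steps=1):
    """Closure capturing sampling options (e.g. CD-k steps)."""
    def gradient(rbm, batch):
        # Placeholder for the real CD/PCD computation: returns a
        # correctly-shaped zero gradient so the contract is runnable.
        return np.zeros_like(rbm["W"])
    return gradient

def make_update(lr=0.1):
    """Closure capturing learning-rate (and, in practice, penalty) options."""
    def update(rbm, grad):
        rbm["W"] += lr * grad
    return update

def fit(rbm, data, gradient, update, n_epochs=1):
    """Generic trainer: depends only on the gradient/update contract."""
    for _ in range(n_epochs):
        for batch in data:
            update(rbm, gradient(rbm, batch))
    return rbm

rbm = {"W": np.zeros((4, 6))}
fit(rbm, [np.zeros((10, 6))], make_gradient(), make_update())
```

Any RBM variant that supplies compatible `gradient`/`update` closures can reuse the same `fit`, which is the interoperability the contract is meant to guarantee.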
@Rory-Finnegan @eric-tramel Does this make sense to you? Do your models fit into this approach?