Hamiltonian Monte Carlo #144
Hi! I've been watching all this progress on monad-bayes with great interest! I use Stan a lot, using Haskell to manage the data and write the Stan code, hs-cmdstan and CmdStan to run the models, and Haskell again to process and analyze the results.
@adamConnerSax I guess you know about this? http://www.cs.nott.ac.uk/~psznhn/Publications/stancon2018.pdf
I hadn't! Nice. Is that work ongoing? I agree with the paper that keeping the post-processing bits and the model in sync can be tricky, but if you want to use the actual structure of the posterior you need to have Stan do its "generated quantities" thing (or do something similar yourself with the output of each draw). I just generate many Stan files with various generated quantities blocks and run them using the "generate quantities" mode of CmdStan. So I only run the model once but get all the generated stuff. There's some book-keeping but I don't find it so onerous. Which I guess is why I am trying to make it easy to write Stan from Haskell.

Edit: Ah. I see I misunderstood. That library keeps all the Stan samples to use for prediction or whatever. That makes more sense. I can see how that would be useful, and maybe easier and faster than using Stan's generated quantities block.
Hi! Sorry for the slow response. Yes, it sounds interesting to compile monad-bayes into Stan. I can imagine that taking the Free monad representation of a probabilistic program and writing it as Stan might actually be possible. Happy to chat more about that. The HMC in monad-bayes is going to be a reference implementation (i.e. much slower than Stan, most likely), at least until Haskell has a better autodiff story, which might be a while. If you have more thoughts about use cases, or things you'd like to be able to do in monad-bayes, do let me know :)

Edit: your code looks cool! How operational is it? What would need to be done to connect it with monad-bayes?
No worries about the time! This is all on my slow track anyway. The code I pointed you to is my newer version of an old thing which is much less strongly typed but is operational. I'm a couple of weeks away (slow track!) from the new version taking over from the old. Then I'd need to do a lot of cleanup in the part that handles running via cmdstan. The new part is just building the Stan code from a Haskell AST. But then I have a bunch of code for managing data, checking if code has changed and re-running only what's necessary, etc. Not sure how much of that would be worthwhile, or if it might be better to just have a translator as a first step?

What I've not thought about at all is how the interface with monad-bayes would work. Maybe when I come up for air after switching to the new expression/statement AST, I'll look at monad-bayes more and see if I can figure out what that might look like? I think monad-bayes is more general (part of why I'd love to see them connected, so I could write things that use all the tools monad-bayes makes available but have Stan for speed when the model allows).

And I've no idea how to think about the organizational aspects: the way Stan breaks things into generated quantities, etc. Either there will be something beautiful that just lets me know what's in the model and what's a generated quantity (this feels vaguely possible, since maybe the "model" is only the things which affect the "target" log-likelihood, plus anything they require), or that might all be hard to manage without some sort of annotation layer to guide the translation. But I don't know yet! I am looking forward to thinking about it.
Sounds great! Feel free to ask questions at any point (once you come up for air) if you want to understand monad-bayes better. Also check out https://monad-bayes.netlify.app/ , which goes through the API and implementation of monad-bayes fairly extensively. One thing I'm keen to try with monad-bayes is to actually write some non-trivial models with real-world data and see where the speed bottlenecks are, so this would be a good incentive.
This would be a really cool addition! Let me know if you need some help!
Tbh that sounds like a fork is needed. At least a fork of the parts that contain the distributions.
Just checking in. I still want to try to produce Stan code from monad-bayes somehow. My Stan AST work is fairly operational, though pretty sparse in terms of supporting all the math functions and distributions that Stan has. But that is easy enough to expand.

What is stopping me is that I'm not at all sure if the way monad-bayes represents the model is straightforward enough to "translate" to Stan. My best guess is that I would need to develop a free-monad sort of thing which has a … But a few things are immediately tricky. E.g., I don't think …

Anyway, I'm starting to think about it! If anyone has concrete ideas for how to get the Stan sort of items from a monad-bayes model, I'd love to hear them. Once I have the data sources, parameters (and transformations) and the model (for priors and data), making that into Stan code should be straightforward.
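To make the free-monad idea above concrete, here is a hedged, self-contained sketch (the names ModelF, Model, betaM, scoreM and countDraws are invented for illustration; this is not the representation monad-bayes actually uses). It reifies a couple of primitives as a functor, builds programs with the free package, and walks the structure without running it:

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Monad.Free (Free (..), liftF)

-- One constructor per primitive; each continuation receives the drawn value.
data ModelF k
  = BetaF   Double Double (Double -> k)  -- draw from Beta(a, b)
  | NormalF Double Double (Double -> k)  -- draw from Normal(mu, sigma)
  | ScoreF  Double k                     -- scale the unnormalised density by a factor
  deriving Functor

type Model = Free ModelF

betaM :: Double -> Double -> Model Double
betaM a b = liftF (BetaF a b id)

scoreM :: Double -> Model ()
scoreM w = liftF (ScoreF w ())

-- Inspecting the structure instead of running it, e.g. counting latent draws.
-- Probing continuations with dummy values only works when the model's structure
-- does not depend on the drawn values, which is exactly the tricky part noted above.
countDraws :: Model a -> Int
countDraws (Pure _)               = 0
countDraws (Free (BetaF _ _ k))   = 1 + countDraws (k 0.5)
countDraws (Free (NormalF _ _ k)) = 1 + countDraws (k 0)
countDraws (Free (ScoreF _ k))    = countDraws k

exampleModel :: Model Double
exampleModel = do
  x <- betaM 1 1
  scoreM 0.5
  pure x

main :: IO ()
main = print (countDraws exampleModel)  -- prints 1
```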
@turion Thanks! You can see the somewhat messy branch with the WIP attached to this PR: #171. You're probably right that a fork would be good. What I did was to use multiparam typeclasses for MonadSample and MonadCond, and then laboriously change the various instances. It wasn't entirely trivial, but I think it roughly works. Look in Control.Monad.Bayes.Class to see the new version of the MonadSample and MonadCond typeclasses, and Control.Monad.Bayes.Traced.Diff for the progress on HMC.

@adamConnerSax So there's currently a free monad instance of MonadSample defined in Control.Monad.Bayes.Free (soon to be Control.Monad.Bayes.Density.Free). Maybe one could work from that, or define a Stan instance for MonadSample directly? As for …

To make this more concrete, if you have an example of a simple Stan AST corresponding to e.g.

```haskell
program = do
  x <- beta 1 1
  factor (normalPdf 0 1 x)
  return x
```

that might be helpful to see, to understand how to do the translation.
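As a hedged answer to the request above, here is one possible shape of a tiny Stan AST value for that program (all type and constructor names are invented for illustration; these are not the haskell-stan or mc-stan types):

```haskell
-- Invented, minimal AST types; a real frontend would cover far more of Stan.
data StanType = StanReal | StanRealBounded Double Double
  deriving Show

data StanDecl = Param String StanType
  deriving Show

data StanStmt
  = Sample String String [Double]     -- e.g. "x ~ beta(1, 1);"
  | TargetPlus String String [Double] -- e.g. "target += normal_lpdf(x | 0, 1);"
  deriving Show

data StanProgram = StanProgram
  { parameters :: [StanDecl]
  , modelBlock :: [StanStmt]
  } deriving Show

-- The 'program' above (a Beta(1,1)-distributed x plus a factor of normalPdf 0 1 x)
-- rendered as such an AST value:
programAST :: StanProgram
programAST = StanProgram
  { parameters = [Param "x" (StanRealBounded 0 1)]
  , modelBlock =
      [ Sample "x" "beta" [1, 1]
      , TargetPlus "x" "normal_lpdf" [0, 1]
      ]
  }

main :: IO ()
main = print programAST
```

The interesting part of the translation is then deciding, from the monadic program, which names end up in the parameters block and which statements end up in the model block.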
Funny! I've just recently started to toy around with a Haskell frontend to Stan. Uploaded it to github now: https://github.com/turion/mc-stan. An internet search for anything like that didn't turn up much, because unfortunately there already is a tool called …

I think it's not possible to produce Stan code from much of …:

```haskell
f :: Double -> Double
f = undefined -- get creative here

do
  x <- random
  let y = f x
  uniform y $ y + 1
```

There is no chance to translate an arbitrary function f, I think.
Ah sorry, didn't see your comment while writing this novella 😆
Yes, great! I think I would have recommended an associated type instead, because the monad will usually determine the type of real numbers, but your version forces every monad to be polymorphic in it. If you do this:

```haskell
class (RealFloat (Real m)) => MonadSample m where
  type Real m :: *
  random :: m (Real m)
  -- ...

instance RealFloat n => MonadSample (Enumerator n) where
  type Real (Enumerator n) = n
  -- ...
```

you can be more expressive that way, because you could also define specific instances like:

```haskell
instance MonadSample IO where
  type Real IO = Double
  uniform = someUniformIOImplementation
```

Plus, you don't break all your users' polymorphic code of type …
I think the performance of Stan will only be unlocked if one translates the distributions symbolically, so there's no way around mirroring the Stan syntax to some extent.
@turion Yep, I also think it's hard to translate monad-bayes to Stan as is. But maybe there could be a more abstract way of specifying models which could then be implemented as current monad-bayes or Stan? Like a Bayesian-network sort of thing?

BTW, in your Haskell frontend, you mention prior work. One of those is mine and you are welcome to use it if it's useful. A more current version is here: https://github.com/adamConnerSax/haskell-stan. I like the AST-building parts and I am less convinced that the parts which manage data and the interface with cmdstan are ideal. But YMMV. Let me know if I can make parts of it more useful!
@turion Ah, this is an awesome suggestion, and it seems like the way to go. I'll try it out soon. There's an old branch of monad-bayes that does something similar, but not quite the same (see here: https://github.com/tweag/monad-bayes/tree/autodiff), and that branch had actually even implemented HMC. But I found it difficult to understand, and the types were complicated, so I'm inclined to just start anew and bring in pieces when needed.

By the way, I noticed from your website that you've done a lot of FRP (e.g. dunai). This is a bit of a tangent, so maybe I can switch it to a different issue, but do you have thoughts about implementing real-time inference (as in https://arxiv.org/abs/1908.07563) using Yampa/dunai? The idea of modelling a stochastic process or real-time inference using FRP seems fitting.

@adamConnerSax I agree that it would be nice to have a separate way of building Bayes nets and MRFs that reifies the structure, and which is then convertible into either monad-bayes or Stan. Something like just declaring a graph structure explicitly. I don't in principle see what the roadblock would be there. The only reason I hadn't pursued it before is that I wasn't entirely sure what was an appropriate graph library in Haskell.
@reubenharry I have wanted to get a better handle on algebraic-graphs. There are benchmarking results (from 2018) for a number of Haskell graph libraries.
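For the "just declare the graph structure explicitly" idea, here is a hedged sketch using algebraic-graphs (assuming Algebra.Graph's edges and edgeList; the Node type and the toy net are made up for illustration):

```haskell
import Algebra.Graph (Graph, edgeList, edges)

-- A tiny Bayes-net skeleton: observations y_1..y_3 depend on mu and sigma.
data Node = Mu | Sigma | Y Int
  deriving (Eq, Ord, Show)

bayesNet :: Graph Node
bayesNet = edges ([(Mu, Y i) | i <- [1 .. 3]] ++ [(Sigma, Y i) | i <- [1 .. 3]])

main :: IO ()
main = mapM_ print (edgeList bayesNet)
```

The reified structure could then, in principle, be interpreted either as a monad-bayes program or as Stan code.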
Hah, you've guessed my end game! 😆 Yes, FRP (with side effects!) and stochastic processes are a perfect match, and the ProbZelus people are exactly right in pursuing this path. (It seems that it is my fate to always reimplement in Haskell everything that Marc Pouzet works on.) I'll open a separate issue if you want to discuss this and share some ideas.
@turion The topic sounds interesting! I would love to follow along with a discussion on it.
@turion Oh, excellent! Yes, please do :) I am also very interested in this, and would love to help.
@turion I am sure you are aware that you need to be a bit careful with FRP. For example, I think the package I used was using RK4, so if you model e.g. planetary motion then energy is not conserved and the planets either spiral into the sun or fly off into the void. Maybe what you are working on allows the user more control over the numerical method?
@idontgetoutmuch To clarify, I believe this is FRP not for symplectic integration or anything to do with HMC, but for real-time inference, along the lines of https://arxiv.org/abs/1908.07563. But yes, let's discuss on a separate issue.
Is there a branch for any of the HMC work anywhere? @mknorps and I are going to take a look at this but we don't want to re-invent any wheels. |
Hi! I wanted to chime in here, not sure if it is the right place. I was trying to use … What I am trying to say: even if you implement HMC, it is important to give the user the ability to set the leapfrog parameters and, even better, to do this automatically (auto-tuning; although there are then parameters specifying the way auto-tuning happens). Even when using the NUTS sampler, (auto-)tuning is absolutely mandatory.

My question is: can tuning be implemented in …? I am specifically interested, because I am working on …

Thanks!

PS: Anything with …

EDIT: One more thing, because I read it in the main post of this thread: Riemannian HMC seems to be harder than we thought. I do not think it is impossible to implement, but so far nobody has done it convincingly (at least to my knowledge).
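To make the leapfrog parameters concrete, here is a minimal, self-contained sketch of the integrator whose step size and step count are being discussed (not monad-bayes code; gradU is the gradient of the potential, i.e. of the negative log density):

```haskell
leapfrog
  :: Double                   -- step size (epsilon)
  -> Int                      -- number of leapfrog steps (L)
  -> ([Double] -> [Double])   -- gradient of the potential U = negative log density
  -> ([Double], [Double])     -- initial (position, momentum)
  -> ([Double], [Double])     -- proposed (position, momentum)
leapfrog eps steps gradU (q0, p0) = go steps q0 (axpy (-0.5 * eps) (gradU q0) p0)
  where
    axpy a xs ys = zipWith (\x y -> y + a * x) xs ys   -- ys + a * xs, elementwise
    -- closing half step for the momentum
    go 0 q p = (q, axpy (-0.5 * eps) (gradU q) p)
    go n q p =
      let q' = axpy eps p q                            -- full step for the position
          p' | n == 1    = p                           -- last update folded into go 0
             | otherwise = axpy (negate eps) (gradU q') p
      in go (n - 1) q' p'

-- For a standard normal target, U(q) = q^2 / 2, so the gradient is just the identity:
main :: IO ()
main = print (leapfrog 0.1 20 id ([1], [0]))
```

Tuning amounts to choosing eps and the number of steps (or letting something like NUTS choose them), which is exactly the knob the comment above asks to expose.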
Hi @dschrempf, great to hear from you! What you're saying makes a lot of sense, and I agree. To my knowledge, Gen is the probabilistic programming language with the most support for customizable MCMC proposals for arbitrary probabilistic programs. I think the best approach for monad-bayes in the long run is to borrow from Gen, and if I had more time, that's what I'd work on (see e.g. https://github.com/psg-mit/probzelus-haskell). In theory, having customizable proposals is easy to do in monad-bayes, but automating the user experience to make writing ill-formed proposals impossible is hard. If you're interested in trying to incorporate some of the …

Re. HMC, there is some work on applying it to distributions with stochastic control flow, but in general it assumes a fixed model, so its applicability to monad-bayes rests on that assumption.
See here for the source: https://github.com/tweag/monad-bayes/tree/Generalising-the-sampler.

You can run the entire document by …

You can generate the document by …

You might need to do …
Thanks for writing this up! Extremely useful for me :)
I am not sure what you are referring to. If it's my notes, please be very careful; I now think they contain some misunderstandings.
It's your notes, yes. I haven't started reading yet, and will hold off / be careful, but what I meant is that I've been wanting to learn about this new generalized perspective on MCMC for a while, so I envision this might be useful in the future.
My current plan is to use the code here: https://github.com/fzaiser/nonparametric-hmc as a template.

My Attempts

https://github.com/idontgetoutmuch/nonparametric-mh

Some References

- Nonparametric Hamiltonian Monte Carlo
- Automating Involutive MCMC using Probabilistic and Differentiable Programming
- Involutive MCMC: a Unifying Framework
- LazyPPL: laziness and types in non-parametric probabilistic programs
- LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models

Probably Tangential

- Reactive Probabilistic Programming
- Affine Monads and Lazy Structures for Bayesian Programming
- Delayed Sampling and Automatic Rao–Blackwellization of Probabilistic Programs
- Compiling Universal Probabilistic Programming Languages with Efficient Parallel Sequential Monte Carlo Inference
Hamiltonian Monte Carlo (HMC) is a widely used, gradient-based MCMC algorithm that is the backbone of Stan's inference. I plan to implement it for monad-bayes.

Todos (checkboxes indicate things done on branch):

- … f :: Floating n => [n] -> n (Or: can I use Numeric.AD.Double?) … Double to any number, which is needed for automatic differentiation of f

Optional:

- …

I have the basic pieces working on branch, but the checkboxes indicate further steps. Current roadblocks: … Double. I need versions that are in terms of arbitrary Num n => n, for autodiff to work.
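On the "generalize to any Floating n for autodiff" point above, a minimal hedged sketch (assuming the ad package; the log density itself is made up for illustration):

```haskell
import Numeric.AD (grad)

-- An unnormalised log density written over any Floating n rather than Double:
-- a standard normal prior on x combined with a Normal(x, 1) observation at 1
-- (constants dropped).
logDensity :: Floating n => [n] -> n
logDensity [x] = negate (x * x) / 2 - (x - 1) ** 2 / 2
logDensity _   = error "expected exactly one parameter"

main :: IO ()
main = do
  print (logDensity [0.5 :: Double])                  -- evaluate at plain Double
  print (grad (\xs -> logDensity xs) [0.5 :: Double]) -- gradient via reverse-mode AD
```

Because the density only asks for Floating, the same code runs at plain Double and under Numeric.AD's reverse-mode number type, which is what the generalisation in the todo list is after.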