Description
This issue contains my thoughts and comments from working with `VarInfo` a lot more during the course of TuringLang/Turing.jl#793. My experience is that `VarInfo` is somewhat easy to use once you get over the very steep learning curve, but that learning curve can be a powerful deterrent to development from outside folks.

I make some strong statements here to encourage discussion. I'm not trying to bash anyone's superb contributions (particularly @mohamed82008's great work on `VarInfo`), I just want to see if I can provoke some high-level thinking about `VarInfo` without thinking too much about what it is right now. Try to keep the context of this discussion about what `VarInfo` could be and not what it is now.
## I don't want to see `VarInfo` anywhere
I think we see `VarInfo` too much. If I'm a non-Turing person and I'm building some kind of inference tool, I don't want to learn about our arcane system for managing variables. I just want to manipulate parameters, draw from priors, etc. Many of our functions should probably never have a `do_something(vi, spl)` signature -- we should find ways to handle everything on the back end without anyone worrying about how to use `VarInfo`. A better way would be to have the `VarInfo` stored somewhere in a shared state or tied to a specific model.
I can imagine a case where `VarInfo` is stored in some environment variable or state variable or something, and the sampler or model might just have a location to go look at where the `VarInfo` is. Then you could just call `logp(model)` and by default it would calculate the log probability using whatever the current state is. If you really wanted to, you could pass in a `VarInfo` and work with a specific one if you're doing a lot of numerical work and such, but I think for almost all cases `VarInfo` could sit far away and never be thought about.
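To make that concrete, here's a minimal sketch of the "default state" idea. Everything here is hypothetical (`Model`, `logp`, and the stored state are stand-ins, not Turing's actual API): the model carries its current values, `logp(model)` uses them by default, and an explicit override is still possible.

```julia
# Hypothetical sketch: the model owns its current variable state, so
# callers never have to construct or thread a VarInfo themselves.
struct Model
    logdensity::Function       # maps a NamedTuple of parameters to a log density
    state::Base.RefValue{Any}  # current parameter values, hidden from the user
end

# Default: compute the log probability at whatever the current state is.
logp(m::Model) = m.logdensity(m.state[])

# Opt-in: pass explicit values when doing a lot of numerical work.
logp(m::Model, vals) = m.logdensity(vals)

m = Model(nt -> -0.5 * nt.x^2, Ref{Any}((x = 1.0,)))
logp(m)              # uses the stored state
logp(m, (x = 0.0,))  # uses an explicit override
```

The point of the sketch is the dispatch pattern: the one-argument method is the everyday path, and the two-argument method is the escape hatch for power users.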
An alternative fix would be to just have a very small handful of functions that are dead-simple to use and understand. See TuringLang/Turing.jl#886 for a better discussion.

- `update!(vi, new_vals)` should update the parameters.
- `parameters(vi)` should get the current parameterization in a `NamedTuple` or `Dict` format.
- `logp(vi, model)` should give you a log probability, no questions asked and no hassle.
- `priors(vi)` should give you a `NamedTuple` or `Dict` of prior distributions to draw from.
- If I want to change my priors or something, we should have a way to do that too. `priors!(vi, new_priors)` should set my priors to whatever the new distributions are.
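As a sketch of what that small surface area could look like, here is a toy implementation. `SimpleVarInfo` and every function body below are hypothetical illustrations of the wishlist above, not the real `VarInfo`:

```julia
# Toy stand-in for VarInfo: just values and priors keyed by symbol.
mutable struct SimpleVarInfo
    vals::Dict{Symbol,Float64}
    priors::Dict{Symbol,Any}   # e.g. Distributions.jl distributions
end

# The handful of dead-simple entry points from the wishlist:
update!(vi::SimpleVarInfo, new_vals) = (merge!(vi.vals, Dict(pairs(new_vals))); vi)
parameters(vi::SimpleVarInfo) = NamedTuple(vi.vals)
priors(vi::SimpleVarInfo) = NamedTuple(vi.priors)
priors!(vi::SimpleVarInfo, new_priors) = (merge!(vi.priors, Dict(pairs(new_priors))); vi)

# Here `model` is anything callable on a NamedTuple of parameters.
logp(vi::SimpleVarInfo, model) = model(parameters(vi))

vi = SimpleVarInfo(Dict(:x => 1.0), Dict{Symbol,Any}())
update!(vi, (x = 2.0,))
parameters(vi)           # (x = 2.0,)
logp(vi, nt -> -nt.x^2)  # -4.0
```

An inference designer working against this surface never learns what sits behind it; whether the backing store is a `Dict`, a typed `VarInfo`, or something else entirely becomes an implementation detail.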
`VarInfo` is ultimately my biggest issue with Turing's internals. I understand why we need it and it is a masterful work of engineering, but from a usability side it is a disaster, particularly if our goal is to have a high degree of ease-of-use for inference designers. If you asked me a question on how to do something with a `VarInfo` right now, chances are very good it would take me more than an hour to think about what it is that `VarInfo` is, what it does, and where in the source code I might find an answer. Add another half hour because whatever it was I thought `VarInfo` was is not true.
## Where should `VarInfo` live?
I'm not sure where the `VarInfo` should go. I don't think it should be a free-floating entity like it has been in Turing's past, and I'm also not convinced that its attachment to the sampler state as in TuringLang/Turing.jl#793 is correct either.

Is `VarInfo` more a function of the model, or of the sampler? If it's more specific to the model, shouldn't we store it there? I don't really know. If it's in the model, then it's quite nice to use for non-MCMC methods, since nobody would have to add `VarInfo` to their method -- they can just call the model's version. Ultimately the `VarInfo` is constructed from the model, and the samplers just reference it. Right now I'm leaning towards moving `VarInfo` over to the model, but I'm open to discussion on that.
A downside to putting it on the model side is that it becomes harder to build new modeling tools on top of Turing, but easier to build inference methods. I think it's a trade-off that's worth considering.
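A hedged sketch of the model-side arrangement (all names hypothetical): the model constructs and owns the variable container, and a sampler holds no container of its own, only a reference through the model.

```julia
# Hypothetical: the model owns the variable container.
struct ToyModel
    varinfo::Dict{Symbol,Float64}  # stand-in for a model-owned VarInfo
end
ToyModel() = ToyModel(Dict{Symbol,Float64}())

# The sampler carries no VarInfo; it only works through the model.
struct ToySampler end

function step!(::ToySampler, m::ToyModel)
    # An inference method manipulates parameters via the model's
    # container instead of constructing a VarInfo itself.
    m.varinfo[:x] = get(m.varinfo, :x, 0.0) + 1.0
    return m
end

m = ToyModel()
step!(ToySampler(), m)
m.varinfo[:x]  # 1.0
```

This is where the trade-off shows up: a new inference method is a few lines because the container comes for free with the model, while a new modeling tool has to reproduce the container to be a valid model.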
## Removing the `Sampler.info` field
Build a `VarInfoFlags` struct that handles all the various switches and gizmos and whatever that `VarInfo` uses. Currently, all the `Sampler`s have a dictionary called `info` in them which will no longer be used on the inference side after TuringLang/Turing.jl#793. It'd be nice if we could remove the field entirely and separate the `VarInfo` flags from the `Sampler`, either by storing the flags in the `VarInfo` itself, or at least removing the dictionary by just storing the flags with the sampler.
This is really more mechanical than goal-oriented, and it's just something I or someone else might need to apply some elbow grease to.