Add Trace without Turing #17
Conversation
Pull Request Test Coverage Report for Build 397231237
Codecov Report
@@            Coverage Diff             @@
##           master      #17       +/-   ##
===========================================
+ Coverage   14.04%   25.88%   +11.83%
===========================================
  Files           4        5        +1
  Lines         299      425      +126
===========================================
+ Hits           42      110       +68
- Misses        257      315       +58
Continue to review full report at Codecov.
Many thanks, @devmotion. I carefully reviewed the PR - didn't notice anything major. See some minor comments below.
return new{S,typeof(vi),typeof(model)}(model, spl, vi)
end
struct Trace{F}
    f::F
Clever trick to unify Function and TracedModel!
Maybe consider making TracedModel and TracedFun subtypes of AbstractMCMC.AbstractModel? It slightly improves clarity and also prevents potential misuse.
https://github.com/TuringLang/AbstractMCMC.jl/blob/master/src/AbstractMCMC.jl#L48
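For illustration, a minimal sketch of the suggestion (the field names are assumptions, not the PR's actual definitions):

using AbstractMCMC

# Hypothetical sketch: declare the wrappers as AbstractMCMC models so that
# dispatch on AbstractMCMC.AbstractModel rules out arbitrary objects.
struct TracedModel{M,S,V} <: AbstractMCMC.AbstractModel
    model::M
    sampler::S
    varinfo::V
end

struct TracedFun{F} <: AbstractMCMC.AbstractModel
    f::F
end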
Maybe consider making TracedModel and TracedFun subtypes of AbstractMCMC.AbstractModel? It slightly improves clarity and also prevents potential misuse.

Yes, maybe that could be done. On the other hand, such type constraints can be quite restrictive, and it didn't seem necessary to introduce them at this stage. Also, maybe even FixedModel would be more appropriate since TracedModel wraps a Model together with a specific sampler and VarInfo object (I guess it should also include the RNG, but this is not part of the current implementation and hence I did not want to change it in these PRs).
res.ctask = copy(trace.ctask)
return res
end
const Particle = Trace
The purpose of differentiating Particle and Trace no longer holds since AdvancedPS is independent of Turing. We can probably clean up the terminology, e.g. by getting rid of Trace and using Particle everywhere. But this can be done in a separate PR as you suggested.
Sounds good 👍
res.ctask = ctask
return res
# add backward reference
newtrace = Trace(f, ctask)
It is slightly unclear how deepcopy(vi) is performed for f::TracedModel here.
EDIT: I found the answer in the companion PR for Turing.
Ref: https://github.com/TuringLang/Turing.jl/pull/1482/files#diff-f2a243e03d83fd80c6af74a78557a37eb7bcf265c509d29b9faa7ec19873b0c5R20
Yes, the deepcopy(vi) parts are a bit annoying and prone to bugs. They should all be handled in the Turing PR, and it should be exactly the same as in the current implementation. However, I noticed some places where some operations seem redundant (e.g., a deepcopy of a deepcopy, or multiple calls of reset_num_produce!), but it was a bit difficult to figure out during the split where things could be dropped, so I kept everything for now.
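As a purely hypothetical illustration of where such a copy could happen (building on the TracedModel layout sketched above; this is not the companion PR's actual code):

# Hypothetical sketch only: duplicating a traced model for a new particle
# deep-copies the wrapped VarInfo so particles do not share mutable state.
fork_model(f::TracedModel) = TracedModel(f.model, f.sampler, deepcopy(f.varinfo))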
return res
# add backward reference
newtrace = Trace(f, ctask)
addreference!(ctask.task, newtrace)
Good idea to wrap these into a new function - it makes the code more readable!
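For readers without the diff context, a hypothetical sketch of what such a helper might do (the actual implementation in the PR may differ, and the storage key is an assumption): it stores a reference to the new trace in the task's local storage so that code running inside the task can later look up the trace it belongs to.

# Hypothetical sketch only: register the trace in the task's local storage.
function addreference!(task::Task, trace::Trace)
    task.storage === nothing && (task.storage = IdDict{Any,Any}())
    task.storage[:__trace] = trace
    return task
end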
# step to the next observe statement and
# return the log probability of the transition (or nothing if done)
advance!(t::Trace) = Libtask.consume(t.ctask)
It's sensible to use advance! here. But if we are going to support the Iterator interface, shouldn't it be iterate? See https://docs.julialang.org/en/v1/manual/interfaces/
Yes, I agree. But I noticed that this would require additional changes since we have to propagate the state of the iterators in some way, if we do not assume that they are mutating. Hence I did not switch to the interface in this PR.
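For reference, a minimal sketch of what an iteration-based interface could look like under the current mutating semantics (an illustration only, not part of this PR; it assumes the Trace type and Libtask task defined here):

# Hypothetical sketch: consuming the task mutates the trace, so the iteration
# state carries no information and is simply ignored.
function Base.iterate(t::Trace, state=nothing)
    logp = Libtask.consume(t.ctask)
    return logp === nothing ? nothing : (logp, nothing)
end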
reset_logprob!(t::Trace) = nothing

reset_model(f) = nothing
delete_retained!(f) = nothing
As a note for the future, I think these types and abstract functions for Trace and VarInfo should go into a lightweight package AbstractPPL. Then we should gradually switch to AbstractPPL in DynamicPPL and talk with other PPL library developers (e.g. Birch, Soss, Gen) to create a minimal PPL base package in Julia (and beyond!).
That's a nice goal but I think it needs more time. I am not completely convinced that the methods and abstractions in this PR are the best ones for future development, so I guess it might be useful to stabilize some API in AdvancedPS first 🙂
Yeah, let's just keep an eye on what kinds of functions end up being useful to see where generalizations are possible.
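To make the hook pattern concrete, an illustrative sketch of how a non-Turing model wrapper could opt into the no-op defaults above (MyModel and its field are hypothetical; it is assumed the generic functions live in the AdvancedPS module):

using AdvancedPS

# Illustrative only: a hypothetical model wrapper with some mutable state.
struct MyModel
    cache::Vector{Float64}
end

# Overload the generic no-op hooks for the custom wrapper.
AdvancedPS.reset_model(f::MyModel) = (empty!(f.cache); f)
AdvancedPS.delete_retained!(f::MyModel) = nothing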
# of other samplers.
increase_logweight!(pc, i, score + getlogp(p.vi))
# Increase the unnormalized logarithmic weights.
increase_logweight!(pc, i, score)
IIRC, score + getlogp(p.vi) is necessary for Gibbs sampling to work since some variables might be updated outside particle Gibbs. I am slightly confused about how the new mechanism handles such cases.
EDIT: Found the answer in Turing's companion PR. Maybe consider adding a comment here to avoid future confusion.
Ref: https://github.com/TuringLang/Turing.jl/pull/1482/files#diff-f2a243e03d83fd80c6af74a78557a37eb7bcf265c509d29b9faa7ec19873b0c5R24
The main motivation for this change is that the setup felt too Turing-specific: it seemed like it should be possible to just implement a model where produce yields the correct log probabilities right away. Also, getlogp would introduce a dependency on DynamicPPL.
I don't see any major issues here, particularly since this is more of an incremental step.
It's a little tough to see exactly what AdvancedPS should be as a standalone tool since so much of Turing was built up around it. I'll have more comments as we move forward, I think.
In this PR I removed all Turing-specific parts from Trace. As in the last commits, the intention was to otherwise keep the same implementation, with all its flaws and annoyances, both to reduce the number of changes and to speed up the process of integrating AdvancedPS in Turing. Therefore I also prepared a corresponding PR to Turing that adds AdvancedPS and uses AdvancedPS.Trace and the implementation of resampling, propagation, and particle sweeps in AdvancedPS. Since I tried to avoid any significant changes, some of the API might still be too Turing- and Libtask-specific. However, it already allows using general models that yield the log probabilities of the observations with Libtask.produce.
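A hedged sketch of such a general model (purely illustrative, not from the PR): the function yields the log probability of each observation via Libtask.produce, so no Turing or DynamicPPL machinery is needed. The distributions and observation values below are made up for the example.

using Distributions
using Libtask

function toy_model()
    # latent state followed by two observations; each produce call yields the
    # log probability of the corresponding observation
    x = rand(Normal())
    Libtask.produce(logpdf(Normal(x, 1.0), 0.5))
    y = rand(Normal(x, 1.0))
    Libtask.produce(logpdf(Normal(y, 1.0), -0.3))
end

Such a function would then be wrapped in a Trace (particle) and advanced observation by observation during the particle sweep.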