Call new deserializer backend for constrained reads #856
Conversation
I think the MIR should retain the separation of constrain and read
operations - if we need to merge them for the Stan Math backend, we can do
that in that backend. Other backends don't combine those operations.
On Fri, Mar 12, 2021 at 1:49 PM Rok Češnovar wrote:

> > End-to-end compile tests
>
> Can you explain a bit more what you mean by this? Do you mean test that
> models compile? If so, we have that test running in our Jenkins pipeline.
> The top one in the image below compiles all of integration/good and all models
> in the stan-dev/example-models repo.
> [image] <https://user-images.githubusercontent.com/28476796/110984270-32aefc00-836b-11eb-8a1d-18ec14962a15.png>
Looking over the code now, one note: you need to change the line here.

```cpp
alpha_v = in__.vector(k);
assign(alpha_v, nil_index_list(),
    in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>, jacobian__>(lp__,
        k), "assigning variable alpha_v");
```
I can change this in the deserializer if it's a lot easier on the compiler, but this line

```cpp
in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>, jacobian__>(lp__, k);
```

should just be

```cpp
in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);
```

Since Jacobian calculations are only needed when calling constrains, I made the read() methods not take the jacobian template value or lp.
Also, we can shorten all of this to remove the fill etc. for parameters and just have

```cpp
Eigen::Matrix<local_scalar_t__, -1, 1> alpha_v =
    in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);
```
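As a rough sketch of what that split could look like on the deserializer side (hypothetical declarations for illustration only; the real stan::io::deserializer lives in the merged Stan PR):

```cpp
#include <Eigen/Dense>

// Hypothetical sketch of the read/constrained-read split described above,
// not the actual stan::io::deserializer declarations.
class deserializer_sketch {
 public:
  // Plain read: no jacobian__ flag and no lp__ accumulator, since
  // Jacobian terms only arise when a constraining transform is applied.
  template <typename Ret>
  Ret read(Eigen::Index k);

  // Constrained read (lb shown as an example): jacobian__ selects whether
  // the log-Jacobian adjustment is applied, and lp is the accumulator
  // it is added to.
  template <typename Ret, bool jacobian__, typename LP>
  Ret read_lb(double lb, LP& lp, Eigen::Index k);
};
```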
Ah, I missed that, thanks; shouldn't be hard.
Yeah, we should get rid of the fills, but I think that'll be a separate solution.
Yeah, that could be a different matter. I think we can check for Decl and do something with that.
Hi @seantalts - couple of points, IMO: The transformed MIR is already backend-specific, so it's okay if we tailor it to the Stan Math backend. Other backends can read the transformations from out_vars and generate constraints in their own way. Also, I think it's much easier to read a single MIR node and produce multiple codegen statements than it is to parse multiple MIR nodes back together into a single codegen statement (though again, this shouldn't be a problem, because they're inserted in transform_mir).
Thanks @rok-cesnovar, I was hoping that was the case.
Agreed :) I feel like there are a few major projects underway in that area, but once things settle down I'd be happy to take a crack at adding a Stan Math LIR form. I was (over?)reacting to:

> How will constraints be represented in the MIR pre-Stan_math_backend.Transform_mir?
```diff
@@ -6729,38 +6744,43 @@ class restricted_model final : public model_base_crtp<restricted_model> {
   p_real = std::numeric_limits<double>::quiet_NaN();
   current_statement__ = 1;
-  p_real = in__.scalar();
+  p_real = in__.read<local_scalar_t__>(lp__);
```
Close! Just gotta get rid of the lp__ for the read() methods.
woooooops
Yeah an LIR might be really nice - I wonder if we could get rid of the CompilerInternal calls altogether
I think it's enough to represent constraints as they are here:

```ocaml
| {Expr.Fixed.pattern= Lit (Str, constraint_string); meta= emeta}
  :: {Expr.Fixed.pattern= Lit (Int, n_constraint_args_str); _} :: args ->
    let n_constraint_args = int_of_string n_constraint_args_str in
    let constraint_args, dims = List.split_n args n_constraint_args in
    let lp_expr = Expr.Fixed.{pattern= Var "lp__"; meta= emeta} in
    let arg_exprs = constraint_args @ [lp_expr] @ dims in
    let maybe_constraint, maybe_jacobian =
      if String.is_empty constraint_string then ("", "")
      else ("_" ^ constraint_string, ", jacobian__")
    in
```
I think up here, so that plain reads (deserializer.read<>(dims...)) don't get lp__, you want to check whether constraint_string is blank (or whether constraint_args has size 0, either works) and, if it is, not include lp__ as an arg.
```ocaml
if String.is_empty constraint_string then ("", "")
else ("_" ^ constraint_string, ", jacobian__")
in
```
Also, for here: @bbbales2 and I are talking about changing the method names to:

- read(): just a plain read (this one stays the same)
- read_{constrain_name}() -> read_constrain_{constrain_name}(): reading in variables that need a constrain, to match the signatures of the read free functions
- read_free_{constrain_name}(): reading in variables that need a free transform

But that would just involve changing this line:

```diff
 if String.is_empty constraint_string then ("", "")
-else ("_" ^ constraint_string, ", jacobian__")
+else ("_constrain_" ^ constraint_string, ", jacobian__")
 in
```
We should have that sorted by today
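Under that naming, generated call sites would look something like this (a sketch: the lb constraint and the exact argument order are illustrative assumptions, not taken from this PR's output):

```cpp
// Illustrative call sites under the proposed read() / read_constrain_* /
// read_free_* naming; lb, k, and the argument order are assumptions.
using vec_t = Eigen::Matrix<local_scalar_t__, -1, 1>;

// Plain read: no lp__ and no jacobian__.
auto alpha_plain = in__.template read<vec_t>(k);

// Constrained read: carries the jacobian__ flag and the lp__ accumulator.
auto alpha_lb = in__.template read_constrain_lb<vec_t, jacobian__>(lb, lp__, k);

// Free (unconstraining) read: the inverse transform, no lp__ needed.
auto alpha_free = in__.template read_free_lb<vec_t>(lb, k);
```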
Sounds good, lemme know
In most backends we'd actually call a function here. In general, the MIR is supposed to be lowering us towards procedural code as much as possible while staying general across backends. It can retain higher-level information, but not rely on it out-of-band like that: you want everything you need to generate code to be there at your MIR node.
I see where you're coming from. In this case, I think we're being as low-level as possible while being backend-agnostic: it wouldn't be backend-agnostic to put in a separate Constrain call, because the Stan Math backend would have to undo that work. In a sense, it's the Decl nodes that represent the constraints in the MIR statements: just like we've been doing for reads, we're expanding parameter Decls with information from output_vars. To put more of the information inline into the MIR, what do you think about adding
@rybern fyi I pushed up the commit just changing that.
@SteveBronder great - I'll be free to help finish up this afternoon
Commit: …stexpr bool jacobian__ to write_array_impl, change name of read_cholesky_corr to read_cholesky_factor_corr
```cpp
assign(L_Omega,
    in__.template read_constrain_cholesky_factor_corr<std::vector<Eigen::Matrix<local_scalar_t__, -1, -1>>, jacobian__>(
        lp__, nt, 2, 2), "assigning variable L_Omega");
```
Alright, so fixing this bit here got param-constrain.stan to compile for me (after we add the little fix in this PR, since we forgot to include the deserializer header lol). The model is:

```stan
data {
  int nt;
  int NS;
}
parameters {
  cholesky_factor_corr[2] L_Omega[nt];
  vector<lower=L_Omega[1,1,2]>[NS] z1;
}
```

The fix is making the dimensions that are currently nt, 2, 2) just be nt, 2) where the deserializer signature wants them (here), since we then grab a lower triangle of data with a number of cells equal to (K * (K - 1)) / 2. I think this is caused by get_dims in the transform_mir stuff for let read = ...?
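In other words, a sketch of the mismatch (the first call shape is the generated code quoted above; the second is what the deserializer signature seems to want):

```cpp
// Currently emitted: both matrix dimensions are passed.
in__.template read_constrain_cholesky_factor_corr<
    std::vector<Eigen::Matrix<local_scalar_t__, -1, -1>>, jacobian__>(
    lp__, nt, 2, 2);

// Expected: a single K per square factor, since only the strict lower
// triangle, (K * (K - 1)) / 2 cells, is read from the deserializer.
in__.template read_constrain_cholesky_factor_corr<
    std::vector<Eigen::Matrix<local_scalar_t__, -1, -1>>, jacobian__>(
    lp__, nt, 2);
```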
Oooo, it's possible that we could have a problem somewhere: cholesky_factor_cov supports both cholesky_factor_cov[M] and cholesky_factor_cov[M, N] (according to here). I think it's the only matrix parameter like this (that has a non-square version), so watch out.
And I say it's possible there could be a problem not because I know of one, just because it's a little different and sounds related to what you're talking about.
We also do this pattern for read_constrain/free_cholesky_factor_corr, read_constrain/free_cov_matrix, and read_constrain/free_corr_matrix.
Aight I just pushed something up that I think will gen the thing we want
Oh ooff, Ben, just seeing your comment. I think that's fine? The signature for read_constrain_cholesky_factor_cov is

```cpp
read_constrain_cholesky_factor_cov(LP& lp, Eigen::Index M, Eigen::Index N)
```

and in the thing I wrote we handle that the normal way, so hopefully it should be fine.
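If I have the shapes right, both Stan declarations then map onto that signature, with the square case just passing M twice (a sketch; mat_t, M, and N are placeholders):

```cpp
// Placeholder sketch of how the two cholesky_factor_cov declarations
// could map onto the signature quoted above.
using mat_t = Eigen::Matrix<local_scalar_t__, -1, -1>;

// cholesky_factor_cov[M, N]: rectangular M x N factor.
auto L_rect = in__.template read_constrain_cholesky_factor_cov<mat_t, jacobian__>(lp__, M, N);

// cholesky_factor_cov[M]: square case, M passed for both dimensions.
auto L_sq = in__.template read_constrain_cholesky_factor_cov<mat_t, jacobian__>(lp__, M, M);
```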
@seantalts lil' bump!
@seantalts both @rok-cesnovar and I looked this over and we think it looks good. If you have time to look it over tonight one more time, that would be super rad, but this PR being merged is holding up some other PRs, so we'd like to merge it tomorrow if you don't have time to look at it.
Thanks @rybern, this is looking good; I like the refactors and think this is going in a good direction. At the minimum I think this should have the MIR test I asked for - I can add that this weekend if need be.
```diff
 | Some expr ->
-    let vars = Set.Poly.map ~f:fst (expr_var_set expr) in
+    let vvars = Set.Poly.map ~f:fst (expr_var_set expr) in
     node_vars_dependencies info_map vars label
```
Does it stand for something? I'm mostly asking because this used to be called var and now it's vvar, and I'm not sure what else changed / what the semantics of the change are here.
I'm trying to figure out how hard it would be to just do the unconstrain-param refactor at the same time - @SteveBronder, do you know if the Stan Math part of that is ready? Ideally we wouldn't leave stanc3 with constrains and unconstrains having totally separate code paths, MIR representations, and backend calls, with all of the duplicated stuff that implies.
I am guessing this adds the serializer: stan-dev/stan#3019
Add a transformed_mir_diff expect test for #856 to diff against
@seantalts Thank you for putting time into this! I totally agree that it's uncomfortable to merge it in its half state; this way just made sense given the plan for a followup PR. The unconstrain piece is next up, whether it's in this PR or a new one; I just didn't want to write it as a new PR until this one got the all-clear. I don't think it'll take long, especially if we have examples of what the output should look like. Also, good call on making Fun_kind non-parametric for now. I imagine the internal function type could be a parameter of the MIR later on so that backends can provide their own, but no need for now.
All the serializer and deserializer code is in Stan. The reason I wanted to break this up into two is because right now this changes transform_inits, which would need a dual merge. An alternative to the dual-merge thing is to hard-code the current transform_inits method into the model, like we do with:

```cpp
inline void transform_inits(const stan::io::var_context& context,
                            Eigen::Matrix<double, Eigen::Dynamic, 1>& params_r,
                            std::ostream* pstream = nullptr) const final {
  std::vector<std::string> constrained_param_names;
  context.names_r(constrained_param_names);
  // Flatten every constrained value in the var_context into one vector.
  std::vector<double> serialized_val_r;
  serialized_val_r.resize(context.total_vals_r_size(), 0);
  size_t sentinel_iter = 0;
  for (size_t i = 0; i < constrained_param_names.size(); ++i) {
    std::vector<double> init_val = context.vals_r(constrained_param_names[i]);
    for (size_t j = 0; j < init_val.size(); ++j) {
      serialized_val_r[sentinel_iter] = init_val[j];
      sentinel_iter++;
    }
  }
  // Hand the flattened values to the generated impl.
  transform_inits_impl(serialized_val_r, params_r, pstream);
}
```

This would let us have backwards compatibility. I'm fine with doing it either way; I have a branch of the Stan repo with the changes we would need to make for the dual merge here.
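If I'm reading that sketch right, the nice part is that flattening the var_context values into one serialized vector lets transform_inits keep its existing public signature while delegating the actual unconstraining work to the new transform_inits_impl.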
I realized that's what you were probably thinking after I pushed it up, haha. I think I agree it's slightly better to add the parameter when it's time to use it. Also I'm hoping I or someone else can take a crack at an LIR for Stan Math sometime, in which case it probably just has its own types.
Doesn't that change the API and require bumping Stan's major version number? Or do we have some backwards-compatible route planned?
Sorry for the double post - what does "dual merge" refer to here? Splitting
Oh I didn't know that would cause a version bump. I think we can do the backwards compatible version I posted above.
I didn't know what to call it so I just made that term up lol. It's like, you know when we have a PR in one of our repos that causes something to fail in the upstream repo, so we make a branch in each repo with the changes, test both of those against one another, then accept and merge both. Like a good example is the
I think that transform_inits is part of the API used by RStan, PyStan, et al. I think this is basically true of all model class methods in Stan. Perhaps some day we'll all agree to have the API be the one exposed by cmdstan, but I think for now that's still a pretty public-facing endpoint. That said, I haven't been to meetings in a year or so; maybe things have changed. But it's good you have a backwards-compatible version ready.

Re: "dual merge", I think we used to call them cross-repo features or PRs.

Is there anything going on in this space for Stan constraint checks? e.g. data that is declared with lower=0. Does anyone know if there are vectorized versions of that as well? Would be nice to delete that too from the MIR.
Yes, there is, see #552. Also see the test file: https://github.com/stan-dev/stanc3/blob/master/test/integration/good/code-gen/transform.stan
I'm fine with merging this now. Looking forward to the unconstrain version. Thanks all!
Replacing #837
This is meant to be a minimal PR to call the new backend for constrained reads (merged backend PR here: stan-dev/stan#3013)
For example, reads that previously went through per-type reader methods now go through a single templated deserializer read; a representative before/after is sketched below.
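Based on the snippets quoted earlier in this conversation (alpha_v and k come from the review comments above, so treat this as a reconstruction rather than the PR description's original example):

```cpp
// Before: a per-type method on the old reader.
alpha_v = in__.vector(k);

// After: one templated read on the new deserializer.
alpha_v = in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);
```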
The strategy is just to:
Work still to do: