
Call new deserializer backend for constrained reads #856

Merged
merged 37 commits into stan-dev:master
Apr 3, 2021

Conversation

rybern
Collaborator

@rybern rybern commented Mar 12, 2021

Replacing #837

This is meant to be a minimal PR to call the new backend for constrained reads (merged backend PR here: stan-dev/stan#3013)

For example, what was

      sigma = in__.scalar();
      current_statement__ = 4;
      if (jacobian__) {
        current_statement__ = 4;
        sigma = stan::math::lb_constrain(sigma, 0, lp__);
      } else {
        current_statement__ = 4;
        sigma = stan::math::lb_constrain(sigma, 0);
      }

is now

      sigma = in__.read_lb<local_scalar_t__, jacobian__>(0, lp__);

The strategy is just to:

  • Stop generating FnConstrain in Ast_to_Mir
  • Pack the information necessary for the new constrained reads into FnRead, which is built in Transform_Mir
  • Change the codegen for FnRead to match the new interface

Work still to do:

  • Remove FnConstrain from Internal_fun
  • Make sure stanc3 test diffs are okay (I'll note a couple odd ones in line comments)
  • End-to-end compile tests

@rok-cesnovar
Member

End-to-end compile tests

Can you explain a bit more what you mean by this? Do you mean test that models compile? If so we have that test running in our Jenkins pipeline. The top one in the image below compiles all integration/good and all models in the stan-dev/example-models repo.

[image: screenshot of the Jenkins pipeline stages]

@seantalts
Member

seantalts commented Mar 12, 2021 via email

@SteveBronder
Contributor

Looking over the code now, one note: you need to change the line here

to

      ; "stan::io::deserializer<local_scalar_t__> in__(params_r__, params_i__);"

Comment on lines 582 to 584
alpha_v = in__.vector(k);
assign(alpha_v, nil_index_list(),
  in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>, jacobian__>(lp__, k),
  "assigning variable alpha_v");
Contributor

@SteveBronder Mar 12, 2021

So I can change this in deserializer if it's a lot easier on the compiler, but this line

in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>, jacobian__>(lp__, k);

Should just be

in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);

As jacobian calcs are only needed when calling constrains I made the read() methods not take in the jacobian template value or lp.

Also we can shorten all of this up to remove the fill etc. for parameters and just have

      Eigen::Matrix<local_scalar_t__, -1, 1> alpha_v = in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);

Collaborator Author

Ah I missed that thanks, shouldn't be hard.

Yeah, we should get rid of the fills, but I think that'll be a separate solution

Contributor

Yeah, that could be a separate matter. I think we can check for Decl and do something with that.

@rybern
Collaborator Author

rybern commented Mar 12, 2021

Hi @seantalts - couple of points, IMO:

The transformed MIR is already backend-specific, so it's okay if we tailor it to the Stan Math backend. Other backends can read the transformations from out_vars and generate constraints in their own way.

Also, I think it's much easier to read a single MIR node and produce multiple codegen statements than it is to parse multiple MIR nodes back together into a single codegen statement (though again, this shouldn't be a problem because they're inserted in transform_mir).

@rybern
Collaborator Author

rybern commented Mar 12, 2021

Thanks @rok-cesnovar, I was hoping that was the case

@seantalts
Member

The transformed MIR is already backend-specific, so it's okay if we tailor it to the Stan Math backend. Other backends can read the transformations from out_vars and generate constraints in their own way.

Agreed :) I feel like there are a few major projects underway in that area but once things settle down I'd be happy to take a crack at adding a Stan Math LIR form.

I was (over?)reacting to

Stop generating FnConstrain in Ast_to_Mir

How will constraints be represented in the MIR pre-Stan_math_backend.Transform_mir?

@@ -6729,38 +6744,43 @@ class restricted_model final : public model_base_crtp<restricted_model> {
p_real = std::numeric_limits<double>::quiet_NaN();

current_statement__ = 1;
p_real = in__.scalar();
p_real = in__.read<local_scalar_t__>(lp__);
Contributor

Close! Just gotta get rid of the lp__ for the read() methods

Collaborator Author

woooooops

@rybern
Collaborator Author

rybern commented Mar 12, 2021

Yeah an LIR might be really nice - I wonder if we could get rid of the CompilerInternal calls altogether

How will constraints be represented in the MIR pre-Stan_math_backend.Transform_mir?

I think it's enough to represent constraints as the transformations in output_vars; I don't think they need to be statements in untransformed MIR

Comment on lines 421 to 430
| {Expr.Fixed.pattern= Lit (Str, constraint_string); meta= emeta}
  :: {Expr.Fixed.pattern= Lit (Int, n_constraint_args_str); _} :: args ->
    let n_constraint_args = int_of_string n_constraint_args_str in
    let constraint_args, dims = List.split_n args n_constraint_args in
    let lp_expr = Expr.Fixed.{pattern= Var "lp__"; meta= emeta} in
    let arg_exprs = constraint_args @ [lp_expr] @ dims in
    let maybe_constraint, maybe_jacobian =
      if String.is_empty constraint_string then ("", "")
      else ("_" ^ constraint_string, ", jacobian__")
    in
Contributor

I think up here, for plain reads (deserializer.read<>(dims...)) that don't take lp__, you want to check whether constraint_string is blank (or whether constraint_args is size 0, either works) and, if it is, not include lp__ as an arg

Comment on lines 428 to 430
if String.is_empty constraint_string then ("", "")
else ("_" ^ constraint_string, ", jacobian__")
in
Contributor

@SteveBronder Mar 12, 2021

Also for here @bbbales2 and I are talking about changing the methods names to

read() : just plain read (this one stays the same)
read_{constrain_name}() -> read_constrain_{constrain_name}() : reading in variables that need a constrain

to match the signature for the read free functions

read_free_{constrain_name}() : reading in variables that need a free transform

But that would just involve changing this line to

Suggested change
if String.is_empty constraint_string then ("", "")
else ("_" ^ constraint_string, ", jacobian__")
in
if String.is_empty constraint_string then ("", "")
else ("_constrain_" ^ constraint_string, ", jacobian__")
in

We should have that sorted by today

Collaborator Author

Sounds good, lemme know

@seantalts
Member

I think it's enough to represent constraints as the transformations in output_vars, I don't think they need to be statements in untransformed MIR

In most backends we'd actually call a function here. In general the MIR is supposed to be lowering us towards procedural code as much as possible while being general across backends. It can retain higher level information but not rely on it out-of-band like that - you want everything you need to generate code to be there at your MIR node.

@rybern
Collaborator Author

rybern commented Mar 13, 2021

I see where you're coming from. In this case, I think we're being as low level as possible while being backend agnostic - it wouldn't be backend agnostic to put in a separate Constrain call, because the Stan Math backend would have to undo that work.

In a sense, it's the Decl nodes that represent the constraints in the MIR statements: just as we've been doing for reads, we're expanding parameter Decls with information from output_vars.

To put more of the information inline in the MIR, what do you think about adding a transformation option to Decl? Or, going further, a (constrainaction * transformation) option, so that backends don't need to know where to put Check/Constrain/Unconstrain? Then backends would be free to generate whatever read or constrain calls they need without looking at output_vars. On one hand, I like that the information is inline and can't be separated from the declaration; on the other hand, I don't want to do away with output_vars, but I also don't like redundant representations that can get out of sync...

@SteveBronder
Contributor

@rybern fyi I pushed up the change from reader to deserializer in the generated code and promoted the tests

@rybern
Collaborator Author

rybern commented Mar 18, 2021

@SteveBronder great - I'll be free to help finish up this afternoon

…stexpr bool jacobian__ to write_array_impl, change name of read_cholesky_corr to read_cholesky_factor_corr
Comment on lines 19014 to 19016
assign(L_Omega,
  in__.template read_constrain_cholesky_factor_corr<
    std::vector<Eigen::Matrix<local_scalar_t__, -1, -1>>, jacobian__>(
      lp__, nt, 2, 2),
  "assigning variable L_Omega");
Contributor

@SteveBronder Mar 18, 2021

Alright so fixing this bit here got param-constrain.stan to compile for me (after we add the little fix in this PR here since we forgot to include the deserializer header lol)

The model is

data {
  int nt;
  int NS;
}

parameters {
  cholesky_factor_corr[2] L_Omega[nt];
  vector<lower=L_Omega[1,1,2]>[NS] z1;
}

The fix is making the nt, 2, 2) dimensions in the deserializer call just nt, 2) (here), since the deserializer then grabs a lower triangle of data with a number of cells equal to (K * (K - 1)) / 2. I think this is caused by get_dims in the transform MIR stuff for let read = ...?

https://github.com/stan-dev/stanc3/pull/856/files#diff-a1f5ff85d2b828c19a91e01d5c1e8caaf75c47e8af9bb205a3e0ed486a7e064fR233

Member

Oooo, it's possible that we could have a problem somewhere. cholesky_factor_cov supports both: cholesky_factor_cov[M] and cholesky_factor_cov[M, N] (according to here). I think it's the only matrix parameter like this (that has a non-square version), so watch out.

Member

And I say it's possible there could be a problem not because I know of one, just because it's a little different and sounds related to what you're talking about.

Contributor

We also do this pattern for read_constrain/free_cholesky_factor_corr, read_constrain/free_cov_matrix, and read_constrain/free_corr_matrix.

Contributor

Aight I just pushed something up that I think will gen the thing we want

3eaab9e

Contributor

Oh oof Ben, just seeing your comment. I think that's fine? The signature for read_constrain_cholesky_factor_cov is

read_constrain_cholesky_factor_cov(LP& lp, Eigen::Index M, Eigen::Index N) 

And in the thing I wrote we handle that the normal way, so hopefully it should be fine

@SteveBronder
Contributor

@seantalts lil' bump!

Contributor

@SteveBronder left a comment

@seantalts both @rok-cesnovar and I looked this over and we think it looks good. If you have time to look it over one more time tonight that would be super rad, but this PR being merged is holding up some other PRs, so we'd like to merge it tomorrow if you don't have time to look at it.

@seantalts
Member

Thanks @rybern, this is looking good, I like the refactors and think this is going in a good direction.
I understand how we got here and I see the value in the PR as it is today, but I don't get a good feeling from this PR having a couple of half-refactors in it. @rybern (or @SteveBronder) are you guys committed to cleaning up after? I think if this were in the old compiler or Stan Math someone would be asking for completeness and removal of old code. For the record, this form of compromise is how we ended up with the MIR using a radically different pattern from the AST, and I think we can all agree that the half-refactored state is the worst of both worlds.

At the minimum I think this should have the MIR test I asked for - I can add that this weekend if need be.

| Some expr ->
let vars = Set.Poly.map ~f:fst (expr_var_set expr) in
node_vars_dependencies info_map vars label
let vvars = Set.Poly.map ~f:fst (expr_var_set expr) in
Member

Does it stand for something? I'm mostly asking because this used to be called vars and now it's vvars, and I'm not sure what else changed / what the semantics of the change are here

src/middle/Fun_kind.ml (outdated)
src/analysis_and_optimization/Monotone_framework.ml (outdated)
@seantalts
Member

I'm trying to figure out how hard it would be to just do the unconstrain param refactor at the same time - @SteveBronder do you know if the Stan Math part of that is ready? Ideally we wouldn't leave stanc3 with constrains and unconstrains having totally separate code paths, MIR representations, and backend calls with all of the duplicated stuff that implies.

@rok-cesnovar
Member

I am guessing this adds the serializer: stan-dev/stan#3019

rok-cesnovar added a commit that referenced this pull request Apr 2, 2021
Add a transformed_mir_diff expect test for #856 to diff against
@rybern
Collaborator Author

rybern commented Apr 2, 2021

@seantalts Thank you for putting time into this!

I totally agree that it's uncomfortable to merge it in its half-finished state. This way just made sense given the plan for a followup PR.

The unconstrain piece is next up whether it's in this PR or a new PR, I just didn't want to write it as a new PR until this one got the all-clear. I don't think it'll take long especially if we have examples of what the output should look like.

Also, good call on making Fun_kind non-parametric for now. I imagine the internal function type could be a parameter of MIR later on so that backends can provide their own, but no need for now.

@SteveBronder
Contributor

I'm trying to figure out how hard it would be to just do the unconstrain param refactor at the same time - @SteveBronder do you know if the Stan Math part of that is ready? Ideally we wouldn't leave stanc3 with constrains and unconstrains having totally separate code paths, MIR representations, and backend calls with all of the duplicated stuff that implies.

All the serializer and deserializer code is in Stan. The reason I wanted to break this up into two is because right now transform_inits() takes in a context__ and queries it for the data it needs (which stanc then has to unfold into the matrices, vectors, etc.). So to use the serializer we need to change the transform_inits signature to accept just a vector of data, which is then passed to deserializer() (and then placed into serializer() using the free functions). But that also means changing things in the C++ to pass that correctly, so we wanted to do the dual merge scheme where we merge stanc and the stan repo at the same time, like we did with rvalue() and assign().

An alternative to the dual merge thing is to hard code the current transform inits method like we do with log_prob etc. In that method we would take all the data we need out of the context and put it into a vector, then pass that vector to the new transformed inits (we could just call it transform_inits_impl()). So putting

inline void transform_inits(const stan::io::var_context& context,
                            Eigen::Matrix<double, Eigen::Dynamic, 1>& params_r,
                            std::ostream* pstream = nullptr) const final {
  std::vector<std::string> constrained_param_names;
  context.names_r(constrained_param_names);
  // sized up front (not reserve) so the indexed assignment below is valid
  std::vector<double> serialized_val_r(context.total_vals_r_size(), 0);
  size_t sentinal_iter = 0;
  for (size_t i = 0; i < constrained_param_names.size(); ++i) {
    std::vector<double> init_val = context.vals_r(constrained_param_names[i]);
    for (size_t j = 0; j < init_val.size(); ++j) {
      serialized_val_r[sentinal_iter] = init_val[j];
      sentinal_iter++;
    }
  }
  transform_inits_impl(serialized_val_r, params_i, vars, pstream);
}

into the model. This would let us have backwards compatibility.

I'm fine with doing it either way, I have a branch of the Stan repo with the changes we would need to make for the dual merge here

@seantalts
Member

Also, good call on making Fun_kind non-parametric for now. I imagine the internal function type could be a parameter of MIR later on so that backends can provide their own, but no need for now.

I realized that's what you were probably thinking after I pushed it up, haha. I think I agree it's slightly better to add the parameter when it's time to use it. Also I'm hoping I or someone else can take a crack at a LIR for Stan Math sometime, in which case it probably just has its own types.

So to use the serializer we need to change the transform inits signature

Doesn't that change the API and require bumping Stan's major version number? Or do we have some backwards-compatible route planned?

@seantalts
Member

Sorry for the double post - what does "dual merge" refer to here? Splitting constrain and unconstrain across two PRs?

@SteveBronder
Contributor

SteveBronder commented Apr 3, 2021

Doesn't that change the API and require bumping Stan's major version number? Or do we have some backwards-compatible route planned?

Oh I didn't know that would cause a version bump. I think we can do the backwards compatible version I posted above.

Sorry for the double post - what does "dual merge" refer to here? Splitting constrain and unconstrain across two PRs?

I didn't know what to call it so I just made that term up lol. It's like, you know when we have a PR in one of our repos that causes something to fail in the upstream repo, so we make a branch in each repo with the changes, test both of those against one another, then accept and merge both. A good example is the rvalue() and assign() work that prettied those up by removing the cons_index_list() stuff. We made a branch in Stan with the change, then Rok made a branch of the compiler, both were tested against one another, and then both were merged in. That's what I mean by "dual merge".

@seantalts
Member

seantalts commented Apr 3, 2021

Oh I didn't know that would cause a version bump.

I think that transform_inits is part of the API used by RStan, PyStan, et al., and I think this is basically true of all model class methods in Stan. Perhaps some day we'll all agree to have the API be the one exposed by cmdstan, but for now that's still a pretty public-facing endpoint. That said, I haven't been to meetings in a year or so, so maybe things have changed. But it's good you have a backwards-compatible version ready.

I think re: "dual merge" we used to call them cross-repo features or PRs.

Is there anything going on in this space for Stan constraint checks? e.g. data that is declared as lower=0. Does anyone know if there are vectorized versions of that as well? Would be nice to delete that too from the MIR.

@seantalts
Member

PS re: sentinel_iter: an "iterator sentinel" is a different thing from a loop index. A sentinel is some special value in your type that signals something, in this case that the end of iteration has been reached. I think C++20 formalizes this for ranges; you can read about it here: https://foonathan.net/2020/03/iterator-sentinel/. In the integers, negative numbers or MIN_INT etc. are often used as sentinels. This was more common when you didn't have access to a type system rich enough to represent things like option types.

@rok-cesnovar
Member

Is there anything going on in this space for Stan constraint checks? e.g. data that is declared as lower=0. Does anyone know if there are vectorized versions of that as well? Would be nice to delete that too from the MIR.

Yes, there is, see #552

Also see the test file: https://github.com/stan-dev/stanc3/blob/master/test/integration/good/code-gen/transform.stan

@seantalts seantalts merged commit 282439e into stan-dev:master Apr 3, 2021
@seantalts
Member

I'm fine with merging this now. Looking forward to the unconstrain version. Thanks all!
