
Call new deserializer backend for constrained reads #856

Merged
merged 37 commits into stan-dev:master
Apr 3, 2021

Conversation

rybern
Collaborator

@rybern rybern commented Mar 12, 2021

Replacing #837

This is meant to be a minimal PR to call the new backend for constrained reads (merged backend PR here: stan-dev/stan#3013)

For example, what was

      sigma = in__.scalar();
      current_statement__ = 4;
      if (jacobian__) {
        current_statement__ = 4;
        sigma = stan::math::lb_constrain(sigma, 0, lp__);
      } else {
        current_statement__ = 4;
        sigma = stan::math::lb_constrain(sigma, 0);
      }

is now

      sigma = in__.read_lb<local_scalar_t__, jacobian__>(0, lp__);

The strategy is just to:

  • Stop generating FnConstrain in Ast_to_Mir
  • Pack the information necessary for the new constrained reads into FnRead, which is built in Transform_Mir
  • Change the codegen for FnRead to match the new interface

Work still to do:

  • Remove FnConstrain from Internal_fun
  • Make sure stanc3 test diffs are okay (I'll note a couple odd ones in line comments)
  • End-to-end compile tests

@rok-cesnovar
Member

End-to-end compile tests

Can you explain a bit more what you mean by this? Do you mean test that models compile? If so we have that test running in our Jenkins pipeline. The top one in the image below compiles all integration/good and all models in the stan-dev/example-models repo.

[image: screenshot of the Jenkins pipeline stages]

@seantalts
Member

seantalts commented Mar 12, 2021 via email

@SteveBronder
Contributor

Looking over the code now, one note: you need to change the line here

to

      ; "stan::io::deserializer<local_scalar_t__> in__(params_r__, params_i__);"

Comment on lines 582 to 584
alpha_v = in__.vector(k);
assign(alpha_v, nil_index_list(),
  in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>, jacobian__>(lp__, k),
  "assigning variable alpha_v");
Contributor

@SteveBronder Mar 12, 2021

So I can change this in deserializer if it's a lot easier on the compiler, but this line

in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>, jacobian__>(lp__, k);

Should just be

in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);

As jacobian calcs are only needed when calling constrains I made the read() methods not take in the jacobian template value or lp.

Also we can shorten all of this up to remove the fill etc. for parameters and just have

      Eigen::Matrix<local_scalar_t__, -1, 1> alpha_v = in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(k);

Collaborator Author

Ah I missed that thanks, shouldn't be hard.

Yeah, we should get rid of the fills, but I think that'll be a separate solution

Contributor

Yeah, that could be a separate matter. I think we can check for Decl and do something with that.

@rybern
Collaborator Author

rybern commented Mar 12, 2021

Hi @seantalts - couple of points, IMO:

The transformed MIR is already backend-specific, so it's okay if we tailor it to the Stan Math backend. Other backends can read the transformations from out_vars and generate constraints in their own way.

Also, I think it's much easier to read a single MIR node and produce multiple codegen statements than it is to parse multiple MIR nodes back together into a single codegen statement (though again, this shouldn't be a problem because they're inserted in transform_mir).

@rybern
Collaborator Author

rybern commented Mar 12, 2021

Thanks @rok-cesnovar, I was hoping that was the case

@seantalts
Member

The transformed MIR is already backend-specific, so it's okay if we tailor it to the Stan Math backend. Other backends can read the transformations from out_vars and generate constraints in their own way.

Agreed :) I feel like there are a few major projects underway in that area but once things settle down I'd be happy to take a crack at adding a Stan Math LIR form.

I was (over?)reacting to

Stop generating FnConstrain in Ast_to_Mir

How will constraints be represented in the MIR pre-Stan_math_backend.Transform_mir?

@@ -6729,38 +6744,43 @@ class restricted_model final : public model_base_crtp<restricted_model> {
p_real = std::numeric_limits<double>::quiet_NaN();

current_statement__ = 1;
p_real = in__.scalar();
p_real = in__.read<local_scalar_t__>(lp__);
Contributor

Close! Just gotta get rid of the lp__ for the read() methods

Collaborator Author

woooooops

@rybern
Collaborator Author

rybern commented Mar 12, 2021

Yeah an LIR might be really nice - I wonder if we could get rid of the CompilerInternal calls altogether

How will constraints be represented in the MIR pre-Stan_math_backend.Transform_mir?

I think it's enough to represent constraints as the transformations in output_vars; I don't think they need to be statements in untransformed MIR

Comment on lines 421 to 430
| {Expr.Fixed.pattern= Lit (Str, constraint_string); meta= emeta}
  :: {Expr.Fixed.pattern= Lit (Int, n_constraint_args_str); _} :: args ->
    let n_constraint_args = int_of_string n_constraint_args_str in
    let constraint_args, dims = List.split_n args n_constraint_args in
    let lp_expr = Expr.Fixed.{pattern= Var "lp__"; meta= emeta} in
    let arg_exprs = constraint_args @ [lp_expr] @ dims in
    let maybe_constraint, maybe_jacobian =
      if String.is_empty constraint_string then ("", "")
      else ("_" ^ constraint_string, ", jacobian__")
    in
Contributor

I think up here, for plain reads (deserializer.read<>(dims...)) that don't take lp__, you want to check whether constraint_string is blank (or whether constraint_args is size 0, either works) and, if it is, not include lp__ as an arg

Comment on lines 428 to 430
if String.is_empty constraint_string then ("", "")
else ("_" ^ constraint_string, ", jacobian__")
in
Contributor

@SteveBronder Mar 12, 2021

Also for here @bbbales2 and I are talking about changing the methods names to

read() : just plain read (this one stays the same)
read_{constrain_name}() -> read_constrain_{constrain_name}() : reading in variables that need a constrain

to match the signature for the read free functions

read_free_{constrain_name}() : reading in variables that need a free transform

But that would just involve changing this line to

Suggested change
if String.is_empty constraint_string then ("", "")
else ("_" ^ constraint_string, ", jacobian__")
in
if String.is_empty constraint_string then ("", "")
else ("_constrain_" ^ constraint_string, ", jacobian__")
in

We should have that sorted by today

Collaborator Author

Sounds good, lemme know

@seantalts
Member

I think it's enough to represent constraints as the transformations in output_vars, I don't think they need to be statements in untransformed MIR

In most backends we'd actually call a function here. In general the MIR is supposed to be lowering us towards procedural code as much as possible while being general across backends. It can retain higher level information but not rely on it out-of-band like that - you want everything you need to generate code to be there at your MIR node.

@rybern
Collaborator Author

rybern commented Mar 13, 2021

I see where you're coming from. In this case, I think we're being as low level as possible while being backend agnostic - it wouldn't be backend agnostic to put in a separate Constrain call, because the Stan Math backend would have to undo that work.

In a sense, it's the Decl nodes that represent the constraints in the MIR statements: just as we've been doing for reads, we're expanding parameter Decls with information from output_vars.

To put more of the information inline in the MIR, what do you think about adding a transformation option to Decl? Or, going further, a (constrainaction * transformation) option, so that backends don't need to know where to put Check/Constrain/Unconstrain? Then backends would be free to generate whatever read or constrain calls they need without looking at output_vars. On one hand, I like that the information is inline and can't be separated from the declaration; on the other hand, I don't want to do away with output_vars, but I also don't like redundant representations that can get out of sync...

@SteveBronder
Contributor

@rybern fyi I pushed up the change from reader to deserializer in the generated code and promoted the tests

@rybern
Collaborator Author

rybern commented Mar 18, 2021

@SteveBronder great - I'll be free to help finish up this afternoon

…stexpr bool jacobian__ to write_array_impl, change name of read_cholesky_corr to read_cholesky_factor_corr
Comment on lines 19014 to 19016
assign(L_Omega,
  in__.template read_constrain_cholesky_factor_corr<
    std::vector<Eigen::Matrix<local_scalar_t__, -1, -1>>, jacobian__>(
      lp__, nt, 2, 2),
  "assigning variable L_Omega");
Contributor

@SteveBronder Mar 18, 2021

Alright so fixing this bit here got param-constrain.stan to compile for me (after we add the little fix in this PR here since we forgot to include the deserializer header lol)

The model is

data {
  int nt;
  int NS;
}

parameters {
  cholesky_factor_corr[2] L_Omega[nt];
  vector<lower=L_Omega[1,1,2]>[NS] z1;
}

The fix is making the nt, 2, 2) dimensions in the deserializer call just nt, 2) (here), since the deserializer then grabs a lower triangle of data with a number of cells equal to (K * (K - 1)) / 2. I think this is caused by get_dims in the transform MIR stuff for let read = ...?

https://github.com/stan-dev/stanc3/pull/856/files#diff-a1f5ff85d2b828c19a91e01d5c1e8caaf75c47e8af9bb205a3e0ed486a7e064fR233

Member

Oooo, it's possible that we could have a problem somewhere. cholesky_factor_cov supports both: cholesky_factor_cov[M] and cholesky_factor_cov[M, N] (according to here). I think it's the only matrix parameter like this (that has a non-square version), so watch out.

Member

And I say it's possible there could be a problem not because I know of one, just because it's a little different and sounds related to what you're talking about.

Contributor

We also do this pattern for read_constrain/free_cholesky_factor_corr, read_constrain/free_cov_matrix, and read_constrain/free_corr_matrix.

Contributor

Aight I just pushed something up that I think will gen the thing we want

3eaab9e

Contributor

Oh oof Ben, just seeing your comment. I think that's fine? The signature for read_constrain_cholesky_factor_cov is

read_constrain_cholesky_factor_cov(LP& lp, Eigen::Index M, Eigen::Index N) 

And in the thing I wrote we handle that the normal way, so hopefully it should be fine

@SteveBronder
Contributor

@seantalts lil' bump!

Contributor

@SteveBronder left a comment

@seantalts both @rok-cesnovar and I looked this over and we think it looks good. If you have time to look it over one more time tonight that would be super rad, but this PR being merged is holding up some other PRs, so we'd like to merge it tomorrow if you don't have time to look at it.

@seantalts
Member

Thanks @rybern, this is looking good, I like the refactors and think this is going in a good direction.
I understand how we got here and I see the value in the PR as it is today, but I don't get a good feeling from this PR having a couple of half-refactors in it. @rybern (or @SteveBronder) are you guys committed to cleaning up after? I think if this were in the old compiler or Stan Math someone would be asking for completeness and removal of old code. For the record, this form of compromise is how we ended up with the MIR using a radically different pattern from the AST, and I think we can all agree that the half-refactored state is the worst of both worlds.

At the minimum I think this should have the MIR test I asked for - I can add that this weekend if need be.

| Some expr ->
let vars = Set.Poly.map ~f:fst (expr_var_set expr) in
node_vars_dependencies info_map vars label
let vvars = Set.Poly.map ~f:fst (expr_var_set expr) in
Member

Does it stand for something? I'm mostly asking because this used to be called vars and now it's vvars, and I'm not sure what else changed / what the semantics of the change are here

src/middle/Fun_kind.ml (outdated)
src/analysis_and_optimization/Monotone_framework.ml (outdated)
@seantalts
Member

I'm trying to figure out how hard it would be to just do the unconstrain param refactor at the same time - @SteveBronder do you know if the Stan Math part of that is ready? Ideally we wouldn't leave stanc3 with constrains and unconstrains having totally separate code paths, MIR representations, and backend calls with all of the duplicated stuff that implies.

@rok-cesnovar
Member

I am guessing this adds the serializer: stan-dev/stan#3019

rok-cesnovar added a commit that referenced this pull request Apr 2, 2021
Add a transformed_mir_diff expect test for #856 to diff against
@rybern
Collaborator Author

rybern commented Apr 2, 2021

@seantalts Thank you for putting time into this!

I totally agree that it's uncomfortable to merge it in its half-finished state. This way just made sense given the plan for a followup PR.

The unconstrain piece is next up whether it's in this PR or a new PR, I just didn't want to write it as a new PR until this one got the all-clear. I don't think it'll take long especially if we have examples of what the output should look like.

Also, good call on making Fun_kind non-parametric for now. I imagine the internal function type could be a parameter of MIR later on so that backends can provide their own, but no need for now.

@SteveBronder
Contributor

I'm trying to figure out how hard it would be to just do the unconstrain param refactor at the same time - @SteveBronder do you know if the Stan Math part of that is ready? Ideally we wouldn't leave stanc3 with constrains and unconstrains having totally separate code paths, MIR representations, and backend calls with all of the duplicated stuff that implies.

All the serializer and deserializer code is in Stan. The reason I wanted to break this up into two is because right now transform_inits() takes in a context__ and queries it for the data it needs (which stanc then has to unfold into the matrices, vectors, etc.). So to use the serializer we need to change the transform_inits signature to accept just a vector of data, which is then passed to deserializer() (and then placed into serializer() using the free functions). But that also means changing things in the C++ to pass that correctly, so we wanted to do the dual merge scheme where we merge stanc and the stan repo at the same time, like we did with rvalue() and assign().

An alternative to the dual merge thing is to hard code the current transform inits method like we do with log_prob etc. In that method we would take all the data we need out of the context and put it into a vector, then pass that vector to the new transformed inits (we could just call it transform_inits_impl()). So putting

inline void transform_inits(const stan::io::var_context& context,
                            Eigen::Matrix<double, Eigen::Dynamic, 1>& params_r,
                            std::ostream* pstream = nullptr) const final {
  std::vector<std::string> constrained_param_names;
  context.names_r(constrained_param_names);
  // sized up front (not reserve) so the indexed assignment below is valid
  std::vector<double> serialized_val_r(context.total_vals_r_size(), 0);
  size_t sentinal_iter = 0;
  for (size_t i = 0; i < constrained_param_names.size(); ++i) {
    std::vector<double> init_val = context.vals_r(constrained_param_names[i]);
    for (size_t j = 0; j < init_val.size(); ++j) {
      serialized_val_r[sentinal_iter] = init_val[j];
      sentinal_iter++;
    }
  }
  transform_inits_impl(serialized_val_r, params_i, vars, pstream);
}

into the model. This would let us have backwards compatibility.

I'm fine with doing it either way, I have a branch of the Stan repo with the changes we would need to make for the dual merge here

@seantalts
Member

Also, good call on making Fun_kind non-parametric for now. I imagine the internal function type could be a parameter of MIR later on so that backends can provide their own, but no need for now.

I realized that's what you were probably thinking after I pushed it up, haha. I think I agree it's slightly better to add the parameter when it's time to use it. Also I'm hoping I or someone else can take a crack at a LIR for Stan Math sometime, in which case it probably just has its own types.

So to use the serializer we need to change the transform inits signature

Doesn't that change the API and require bumping Stan's major version number? Or do we have some backwards-compatible route planned?

@seantalts
Member

Sorry for the double post - what does "dual merge" refer to here? Splitting constrain and unconstrain across two PRs?

@SteveBronder
Contributor

SteveBronder commented Apr 3, 2021

Doesn't that change the API and require bumping Stan's major version number? Or do we have some backwards-compatible route planned?

Oh I didn't know that would cause a version bump. I think we can do the backwards compatible version I posted above.

Sorry for the double post - what does "dual merge" refer to here? Splitting constrain and unconstrain across two PRs?

I didn't know what to call it so I just made that term up lol. It's like, you know when we have a PR in one of our repos that causes something to fail in the upstream repo, so we make a branch in each repo with the changes, test both of those against one another, then accept and merge both. A good example is the rvalue() and assign() work that prettied those up by removing the cons_index_list() stuff. We made a branch in Stan with the change, then Rok made a branch of the compiler, both were tested against one another, and then both were merged in. That's what I mean by "dual merge".

@seantalts
Member

seantalts commented Apr 3, 2021

Oh I didn't know that would cause a version bump.

I think that transform_inits is part of the API used by RStan, PyStan, et al., and I think this is basically true of all model class methods in Stan. Perhaps some day we'll all agree to have the API be the one exposed by cmdstan, but for now that's still a pretty public-facing endpoint. That said, I haven't been to meetings in a year or so, so maybe things have changed. But it's good you have a backwards-compatible version ready.

I think re: "dual merge" we used to call them cross-repo features or PRs.

Is there anything going on in this space for Stan constraint checks? e.g. data that is declared as lower=0. Does anyone know if there are vectorized versions of that as well? Would be nice to delete that too from the MIR.

@seantalts
Member

PS re: sentinel_iter: an "iterator sentinel" is a different thing from a loop index. A sentinel is some special value in your type that signals something, in this case that the end of iteration has been reached. I think C++20 formalizes this for ranges; you can read about it here: https://foonathan.net/2020/03/iterator-sentinel/. In the integers, negative numbers or MIN_INT etc. are often used as sentinels. This was more common when you didn't have access to a type system rich enough to represent things like option types.

@rok-cesnovar
Member

Is there anything going on in this space for Stan constraint checks? e.g. data that is declared as lower=0. Does anyone know if there are vectorized versions of that as well? Would be nice to delete that too from the MIR.

Yes, there is, see #552

Also see the test file: https://github.com/stan-dev/stanc3/blob/master/test/integration/good/code-gen/transform.stan

@seantalts seantalts merged commit 282439e into stan-dev:master Apr 3, 2021
@seantalts
Member

I'm fine with merging this now. Looking forward to the unconstrain version. Thanks all!
