New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Sampling is able to initialize when there is an exception in transformed parameters #2570

Merged

bob-carpenter merged 10 commits into develop from bugfix/issue-2562-transformed-parameters

Jul 11, 2018

Member

syclik commented Jul 7, 2018 •

edited

Loading

Submission Checklist

Run unit tests: ./runTests.py src/test/unit
Run cpplint: make cpplint
Declare copyright holder and open-source license: see below

Summary

This pull request allows initialization of a Stan program to continue when there's a violation of constraints in the transformed parameters block (or a reject() statement). This fixes #2562, reverting behavior to pre-v2.17.

Intended Effect

This is meant to make running Stan easier. When there's an exception thrown, the initialization procedure treats that as a bad starting point and retries.

How to Verify

There are updated tests in src/test/unit/services/util/initialize_test.cpp. In the new tests, there is a mock model that throws in the write_array() method of the model, which is where the transformed parameters are validated. The new tests make sure that the initialization is retried 100 times.

Side Effects

Some additional things in the pull request:

updated .gitignore for *.gch (precompiled headers)
added new test to random_var_context_test.cpp to verify that random_var_context will now be constructed in a bad state.
whitespace has been updated for these files; ignoring the whitespace makes it easier to check the pull request for content.

Documentation

None needed.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):

Generable

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

syclik added 5 commits

July 6, 2018 16:13


          Updating .gitignore

ae33787


          updating whitespace


          adding failing test for issue #2562

344f82c


          Adding additional test for random_var_context that shows the current …

0e26281

…behavior


          Fixes #2562. Updating src/stan/services/util/initialize.hpp to handle…

b456e0e

… exception behavior in new generated code.

syclik changed the title ~~Sampling is able to initialize when there is an exception in transformed parrameters~~ Sampling is able to initialize when there is an exception in transformed parameters

Member

seantalts commented Jul 9, 2018

Travis was broken by something downstream in Math - submitted #2573 to fix it, sorry about that.


          Merge branch 'develop' of github.com:stan-dev/stan into bugfix/issue-…

f961325

…2562-transformed-parameters

bob-carpenter requested changes

View reviewed changes

.gitignore

@@ @@ -56,3 +56,5 @@ stan.kdev4 @@
               # local make include
               /make/local
+              *.gch

Member

bob-carpenter Jul 9, 2018

[optional]

Looks like this file is missing a newline. I don't know if that's a big deal or not.

Member Author

syclik Jul 10, 2018

fixed

src/stan/services/util/initialize.hpp

+               *
+               * When at least some of the initialization is random, it will
+               * randomly initialize until it finds a set of unconstrained
+               * parameters that are valid or it hits <code>MAX_INIT_TRIES =

Member

bob-carpenter Jul 9, 2018

What is MAX_INIT_TRIES? Is it something the user can set? If not, then I don't think it should be given a name like this that makes it look like a property that a user could set.

Member Author

syclik Jul 10, 2018

It's a holdout from old code. No, it's not something the user can set. I think the original intention was that it could be specified.

Any other suggestions for names? I'm reluctant to replace that with 100 just because it'll be harder to track down later.

Member

bob-carpenter Jul 11, 2018

It's the fact that there's what looks like a variable name there that makes it look like someone should be able to control it. Anyway, this is just doc, so no big deal.

src/stan/services/util/initialize.hpp Outdated

+               * 100</code> (hard-coded).
+               *
+               * Valid initialization is defined as a finite, non-NaN value
+               * for the evaluation of the log probability and all its

Member

bob-carpenter Jul 9, 2018

"log probability" -> "log density function" (or "log probability desnity function")

Member Author

syclik Jul 10, 2018

fixed

src/stan/services/util/initialize.hpp Outdated

-                      for (; num_init_tries < MAX_INIT_TRIES; num_init_tries++) {
-                        stan::io::random_var_context
-                          random_context(model, rng, init_radius, init_zero);
+                bool init_zero = init_radius <= std::numeric_limits<double>::min();

Member

bob-carpenter Jul 9, 2018

[required]

This needs documentation. The name init_zero also needs to change as it's not testing for zero, it's testing for smaller than smallest normalized floating point number. That means there are values other than 0 that this passes for.

[optional]

This could just change to == 0 and the only thing you'd miss is unnormalized small numbers and it wouldn't need any doc or name changes.

Member Author

syclik Jul 10, 2018

I'll leave an inline comment. There's also a function-level comment:

When the var_context provides all variables or

the init_radius is 0, this function will only evaluate the

log probability of the model with the unconstrained

parameters once to see if it's valid.
...

@param[in] init_radius the radius for generating random values.

A value of 0 indicates that the unconstrained parameters (not

provided by init) should be initialized with 0.

I'm sure I could doc it better; what documentation do I need to leave behind so this is clear?

Member Author

syclik Jul 10, 2018

I changed the name to is_initialized_with_zero and I think there isn't much I can do with an inline comment now. Just let me know if there's anything else that I can add to the doc.

src/stan/services/util/initialize.hpp

+                                            unconstrained,
+                                            &msg);
+                    }
+                  } catch (std::domain_error& e) {

Member

bob-carpenter Jul 9, 2018

Did someone take a pass to make sure we only throw domain errors when we want to reject and keeping going? I thought we still had Boost errors of various types and other kinds of std::exception like range errors and things that shouldn't stop sampling.

Member Author

syclik Jul 10, 2018

I did a while ago. After this catch, we catch all std::exception, which include Boost errors.

I haven't seen any behavior stop sampling where it shouldn't. Have you? Or have you seen it reported anywhere? We can take a pass over every throw if you want.

Member

bob-carpenter Jul 11, 2018

But it looks like only std::domain_error gets caught.

That should probably be a const, too (missed that the first time around, but not an issue if it works like this).

It's not taking passes over the throws I'm worried about in our library so much as the ones coming out of Boost or Eigen or CVODES. Or did we take complete control over errors so that we only get error codes and control all of our own throwing? It's so long ago that I was hoping you could remind me.

src/stan/services/util/initialize.hpp

+                    logger.info("");
+                    std::stringstream msg1;
+                    msg1 << "Gradient evaluation took " << deltaT << " seconds";
+                    logger.info(msg1);

Member

bob-carpenter Jul 9, 2018

[future request]

Our loggers should just take these, as in:

logger.info() << "Gradient evaluation took " << ...;

Or we should have a polyadic info function that could look like:

logger.info("Gradient evaluation took ", deltaT, " seconds");

Member Author

syclik Jul 10, 2018

I just created a new issue for this.

The first implementation was a compromise and was only meant to be there as a stepping stone. Now that it's in, it's a lot easier to change safely.

src/stan/services/util/initialize.hpp Outdated

-                      init_writer(unconstrained);
-                      return unconstrained;
-                    }
+                    logger.info("Adjust your expectations accordingly!");

Member

bob-carpenter Jul 9, 2018

[optional]
I strongly prefer logger.info("Adjust ... accordingly!\n\n"); without two extra empty lines. I really think we should get rid of empty logger messages.

Member Author

syclik Jul 10, 2018

I don't mind getting rid of empty lines, but we need the empty logger message. This is the behavior that allows RStan to print what chain messages came from. We assume each message is a single line and we don't have the loggers scan for \n to split on that. We could, but currently, we don't.

src/stan/services/util/initialize.hpp Outdated

+                    logger.info("");
+                    logger.info("");
+                  }
+                  if (gradient_ok)

Member

bob-carpenter Jul 9, 2018

This should return from here---there's no more work to do, then you won't need a confusing test in the next block.

if (gradient_ok) {
  init_writer(unconstrained);
  return unconstrained;
}

Then the rest of the logic simplifies.

src/stan/services/util/initialize.hpp Outdated


		if (num_init_tries == MAX_INIT_TRIES) {

Member

bob-carpenter Jul 9, 2018

Now you don't need to test any more. We've returned if we found a decent solution with finite gradient, right?

I think there should be a separate print for the init zero condition and the same tips should be given. So this all reduces to something like:

if (init_zero) {
  logger.info("initialization at zero fialed after 100 attempts");
} else {
  std::stringstream msg;
  msg << "Initialization between (-" << init_radius ...
  logger.info(msg);
}
msg << " Try specifying initial values,"
                    " reducing ranges of constrained values,"
                    " or reparameterizing the model.";
logger.info(msg);
throw std::domain_error("Initialization failed");

Member Author

syclik Jul 10, 2018

Indeed. That is much simpler.

src/stan/services/util/initialize.hpp Outdated

		}

		init_writer(unconstrained);

Member

bob-carpenter Jul 9, 2018

Does init_writer modify unconstrained? I have no idea what this is doing here, so some doc may help.

Member Author

syclik Jul 10, 2018

No, it does not. It just outputs the initialization. This is used for ADVI and it really is necessary if we want to think about restarting.

syclik mentioned this pull request

Clean up code: .str().length() #2577

Open

syclik added 2 commits

July 10, 2018 10:14


          adding newline

813c89e


          updating initialization code

055f60f

syclik mentioned this pull request

Redesign loggers #2579

Open


          simplifying logic

cacf1ff

Member Author

syclik commented Jul 10, 2018

@bob-carpenter, thanks for leaving the detailed comments! I fixed most of it. Some of it I deferred by creating new issues. I think some of it was legitimately outside the scope here.


          cpplint

83b3f78

Member

seantalts commented Jul 10, 2018

Looks like the tests are passing too. Once we get this in, I think I can release 2.18.

bob-carpenter approved these changes

View reviewed changes

bob-carpenter merged commit 98d7f41 into develop

bob-carpenter deleted the bugfix/issue-2562-transformed-parameters branch

July 11, 2018 02:28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet