Test model logp before starting any MCMC chains #4211

StephenHogg · 2020-11-09T11:59:53Z

This PR addresses #4116 - making find_MAP and sample check their starting conditions before running any chains. I probably need to work out what the linting settings this repo uses are because it seems like a fair bit of formatting has changed.

michaelosthege · 2020-11-09T12:17:41Z

I'm sorry, but can you do the PR without applying black first? I appreciate the intent & it's fine to do in the end, but at this point it makes pinpointing the relevant changes close to impossible..

MarcoGorelli · 2020-11-09T12:30:35Z

I'm sorry, but can you do the PR without applying black first? I appreciate the intent & it's fine to do in the end, but at this point it makes pinpointing the relevant changes close to impossible..

I think the opposite has happened - black has already been applied to the entire codebase, it looks like here they've reverted some of its changes - e.g., black wouldn't do this:

+ from .util import (chains_and_samples, dataset_to_point_dict, get_default_varnames, get_untransformed_name,
+                    is_transformed_name, update_start_vals)

@StephenHogg please see the Python Style guide for this repo

StephenHogg · 2020-11-10T07:17:25Z

Sorry about this - had auto-linting on in my GUI and didn't realise. Have a look now, hopefully it's clearer.

michaelosthege · 2020-11-10T08:12:33Z

pymc3/sampling.py

+            for chain_start_vals in start:
+                update_start_vals(chain_start_vals, model.test_point, model)
+
+    start_points = [start] if isinstance(start, dict) else start


If I remember correctly, the downstream code treats a list of start as "start points for each chain", which could explain your index error.

StephenHogg · 2020-11-10T08:16:30Z

yeah - the start points object doesn't get used further down, it's just for checking the initial conditions in a way that makes handling both the case when it's a dictionary and the case when it's an array easy. `Start` is still what gets fed into functions later on, hence my confusion haha

…

On Tue, Nov 10, 2020, 19:12 Michael Osthege ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In pymc3/sampling.py <#4211 (comment)>: > @@ -419,6 +419,29 @@ def sample( """ model = modelcontext(model) + if start is None: + start = model.test_point + else: + if isinstance(start, dict): + update_start_vals(start, model.test_point, model) + else: + for chain_start_vals in start: + update_start_vals(chain_start_vals, model.test_point, model) + + start_points = [start] if isinstance(start, dict) else start If I remember correctly, the downstream code treats a list of start as "start points for each chain", which could explain your index error. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#4211 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADFQOPVG3DUPEL5A43RSBIDSPDYYBANCNFSM4TPH3KGA> .

michaelosthege · 2020-11-10T08:16:30Z

The error you posted in #4116 could also be a cause of invalid model test points. Could be that not all distributions have tests points.
Try to copy the model from the failing test case into a notebook to inspect it.

StephenHogg · 2020-11-10T22:39:15Z

Just for clarity - what's the path forward here? Sorry for the bother

ColCarroll

I think running this check is generally a good idea, but I think it needs to be put in at the right place. If you look at sampling.py, this change will get rid of a bunch of downstream code (there are some if start is None, but I don't think in a great way.

What if we had a _check_start_point function, and it got called at the end of init_nuts? I think it would contain model.check_test_point, and the nice error messages, but it would not do anything to the start argument passed to it.

pymc3/sampling.py

pymc3/tuning/starting.py

ColCarroll · 2020-11-10T23:10:06Z

Sorry, I just read the attached issue, and it seems like that was steering @StephenHogg to put the changes where they are. Interested to hear if what you and @michaelosthege think!

StephenHogg · 2020-11-10T23:30:24Z

As a first time contributor I defer to Michael! :)

michaelosthege · 2020-11-11T11:06:40Z

As a first time contributor I defer to Michael! :)

I think Colin is right: The block could easily become its own function. That also makes it easier to test, or improve.

StephenHogg · 2020-11-11T11:32:49Z

Ok - are you also saying the new function should be called at the end of init_nuts instead of where it is now?

michaelosthege · 2020-11-11T11:56:18Z

Ok - are you also saying the new function should be called at the end of init_nuts instead of where it is now?

No, the NUTS initialization often suffers from inf/NaN and I think it's more useful to check that before initialization.

ColCarroll · 2020-11-11T12:17:50Z

The "main path" logic right now is:

start goes into the function
init_nuts is called, and comes up with its own start_
- usually this is by taking the test point, and adding Uniform(-1, 1) noise
- start_ in init_nuts is used to initialize a mean for a running variance estimate
start_ is overwritten in the main loop whenever start is not None

I think Michael's right that start_ also needs to be checked, since it is secretly being used internally (which is bad, but shouldn't be fixed here). But looking at the code, I think it can be done right after returning from init_nuts, otherwise there would be a ton of checks.

Concretely, I'm suggesting

a function that checks a user-supplied start value very early (around line 460, where there's already a check on start)
again checking start_ after it returns from init_nuts (around line 490)
perhaps in a followup PR, having the function check other initialization schemes (there's an else branch for discrete models I've been ignoring...)

StephenHogg · 2020-11-11T21:37:23Z

@michaelosthege any more thoughts on the above? Would like to make sure I'm clear about what I'm coding up before starting again

michaelosthege · 2020-11-11T22:30:03Z

@StephenHogg listen to Colin on this one. He's much more literate in what the NUTS code is actually doing. With those checks in their own function, you can run them before & after NUTS initialization.
It costs a few model evaluations, but hey, this is MCMC sampling and it gives us more interpretable errors.

StephenHogg · 2020-11-14T04:30:30Z

I've shifted this into a function called pm.util.check_start_vals. In the process, I've spotted (and fixed) a bug in model.check_test_point. The function's test_point argument was being ignored at all times.

StephenHogg · 2020-11-14T05:00:38Z

Looking at the test output, it seems like a few other tests (e.g. test_nuts_error_reporting in test_hmc.py) are actually broken by the changes in this branch. Any guidance as to how to handle this appreciated.

StephenHogg · 2020-11-14T11:26:37Z

Here's the output I get from pytest at this point, if that helps. Some of these are a bit mystifying, as I'm not sure why I'd be getting a max recursion depth error on a test that I've not touched, for instance. Will push one more change to format the error string a bit more nicely, but after that I think I'm probably stuck for now.

ColCarroll

this looks nice! I took a look at most of the the test failures, and they're surprisingly helpful. Feel free to ping again if you need more help, but I think this is close:

Delete test_hmc.py:test_nuts_error_reporting. your check is a better one for the same behavior.
test_sampling.py:test_deterministic_of_observed looks like a flake. let's ignore that and hope it goes away. if it doesn't, make the rtol bigger.
test_examples.py::TestLatentOccupancy::test_run is interesting, and looks like a legit failure you found! In this case, the likelihood is passing parameters in the wrong order. It should be
pm.ZeroInflatedPoisson("y", psi, theta, observed=y) (note that psi and theta are switched). I imagine it was passing because the multipart sampling got everything to a reasonable place.
Two failures in pymc3/tests/test_step.py can also be either deleted, or ported to the new exception you throw -- it looks like we have a SamplingError defined, which may be a good, specific error to raise instead of a ValueError.

pymc3/util.py

StephenHogg · 2020-11-15T06:59:48Z

The only thing still failing at this point is one test in test_step.py, which is because it is expecting a ParallelSamplingError but the new check is just returning a SamplingError. I'm happy to shift this over, but wanted to check I'm not violating the spirit of this test?

Edit: the flaky test is also not passing, but that definitely passes locally

StephenHogg · 2020-11-17T02:11:37Z

Hi @ColCarroll - the only thing that still fails now is test_sampling.py:test_deterministic_of_observed when FLOATX is set to float32 (doesn't appear to be a problem with 64 bit floats?). I can increase the rtol here but it's already 1e-3, got any preference?

ColCarroll · 2020-11-17T18:22:09Z

This looks great! What if you loosen the tolerances on the test, but also open a bug and mention that it got worse when this PR was merged? That's very strange...

I think the last two things are:

Remove draft status of this pr
Add a line to RELEASE_NOTES.md

codecov · 2020-11-19T13:40:42Z

Codecov Report

Merging #4211 (ddf9fc3) into master (2723b6c) will decrease coverage by 0.18%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4211      +/-   ##
==========================================
- Coverage   88.14%   87.95%   -0.19%     
==========================================
  Files          87       87              
  Lines       14243    14248       +5     
==========================================
- Hits        12554    12532      -22     
- Misses       1689     1716      +27

Impacted Files	Coverage Δ
pymc3/model.py	`89.05% <ø> (-0.12%)`	⬇️
pymc3/sampling.py	`85.39% <100.00%> (-1.10%)`	⬇️
pymc3/tuning/starting.py	`83.72% <100.00%> (+1.90%)`	⬆️
pymc3/step_methods/hmc/base_hmc.py	`90.83% <0.00%> (-7.50%)`	⬇️
pymc3/backends/report.py	`90.90% <0.00%> (-2.10%)`	⬇️
pymc3/parallel_sampling.py	`86.79% <0.00%> (-1.58%)`	⬇️
pymc3/step_methods/hmc/quadpotential.py	`79.03% <0.00%> (-0.54%)`	⬇️

twiecki · 2020-11-25T11:11:00Z

Seems like there are still conflicts though:

StephenHogg · 2020-11-25T11:13:00Z

Seems like there are still conflicts though:

Yes, that's what I'm saying - I can either leave the conflict in, in which case I can't merge, or I can resolve the conflict in which case linting fails because there's an unneeded import. It's a Catch-22.

pymc3/tests/test_hmc.py

pymc3/util.py

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>

MarcoGorelli · 2020-11-25T13:29:29Z

Seems like there are still conflicts though:

Yes, that's what I'm saying - I can either leave the conflict in, in which case I can't merge, or I can resolve the conflict in which case linting fails because there's an unneeded import. It's a Catch-22.

Shouldn't be a catch-22 😄

Can you try

git fetch --all --prune
git merge upstream/master

Then, in pymc3/util.py, you'll see something like

<<<<<<< HEAD
=======
from numpy import ndarray
>>>>>>> upstream/master

Change it to

(i.e., choose the current changes, ignore the incoming ones)

Then,

git add -u
git commit
git push -u origin HEAD

for more on git, I heartily recommend the pro git book

StephenHogg · 2020-11-25T22:49:23Z

As before - there's a mysterious new test failure

ColCarroll · 2020-11-26T13:42:49Z

Wow, CI got changed under you!

test_examples.py::TestLatentOccupancy::test_run got reverted during your merge, I think: pm.ZeroInflatedPoisson("y", psi, theta, observed=y) should be pm.ZeroInflatedPoisson("y", psi, theta, observed=y)

StephenHogg · 2020-11-26T23:28:23Z

This new error doesn't seem to have much to do with the code I wrote? Not sure, though

MarcoGorelli · 2020-11-27T08:21:13Z

This new error doesn't seem to have much to do with the code I wrote? Not sure, though

can you check if that test passes when you run it locally?

pytest pymc3/tests/test_step.py::TestMLDA::test_acceptance_rate_against_coarseness

StephenHogg · 2020-11-27T09:51:36Z

@MarcoGorelli passes locally, had to update theano-pymc to version 1.0.11 to do it. Here's the output:

(pymc3) shogg@192:~/git/pymc3$ pytest pymc3/tests/test_step.py::TestMLDA::test_acceptance_rate_against_coarseness
================================================================================================================================== test session starts ==================================================================================================================================
platform darwin -- Python 3.7.5, pytest-5.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/shogg/git/pymc3, inifile: setup.cfg
collected 1 item                                                                                                                                                                                                                                                                        

pymc3/tests/test_step.py .                                                                                                                                                                                                                                                        [100%]

=================================================================================================================================== warnings summary ====================================================================================================================================
pymc3/tests/test_step.py::TestMLDA::test_acceptance_rate_against_coarseness
pymc3/tests/test_step.py::TestMLDA::test_acceptance_rate_against_coarseness
pymc3/tests/test_step.py::TestMLDA::test_acceptance_rate_against_coarseness
  /Users/shogg/git/pymc3/pymc3/step_methods/mlda.py:383: UserWarning: The MLDA implementation in PyMC3 is still immature. You should be particularly critical of its results.
    "The MLDA implementation in PyMC3 is still immature. You should be particularly critical of its results."

-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================================================================================================================= 1 passed, 3 warnings in 13.26 seconds =========================================================================================================================

StephenHogg · 2020-11-27T09:52:29Z

Wait, all checks have passed now? Maybe the test was flaky?

twiecki · 2020-11-27T09:54:44Z

Yes, I think so. Thanks @StephenHogg!

StephenHogg · 2020-11-27T09:56:10Z

Whew, thanks

ColCarroll · 2020-11-28T00:32:37Z

+1 Thanks for sticking with us, @StephenHogg -- this was trickier than expected, but I think it will really improve lots of people's experiences.

* - Fix regression caused by #4211 * - Add test to make sure jitter is being applied to chains starting points by default * - Import appropriate empty context for python < 3.7 * - Apply black formatting * - Change the second check_start_vals to explicitly run on the newly assigned start variable. * - Improve test documentation and add a new condition * Use monkeypatch for more robust test * - Black formatting, once again...

twiecki changed the title ~~PR to fix #4116~~ Test model logp before starting any MCMC chains Nov 9, 2020

michaelosthege reviewed Nov 10, 2020

View reviewed changes

michaelosthege mentioned this pull request Nov 10, 2020

Fixing scalar shape handling: treat (1,) as vectors #4214

Merged

6 tasks

ColCarroll reviewed Nov 10, 2020

View reviewed changes

pymc3/sampling.py Outdated Show resolved Hide resolved

pymc3/tuning/starting.py Outdated Show resolved Hide resolved

michaelosthege added the enhancements label Nov 12, 2020

michaelosthege added the help wanted label Nov 14, 2020

michaelosthege added this to the 3.10 milestone Nov 14, 2020

ColCarroll reviewed Nov 14, 2020

View reviewed changes

pymc3/util.py Outdated Show resolved Hide resolved

StephenHogg marked this pull request as ready for review November 19, 2020 22:19

twiecki reviewed Nov 25, 2020

View reviewed changes

pymc3/tests/test_hmc.py Show resolved Hide resolved

twiecki reviewed Nov 25, 2020

View reviewed changes

pymc3/util.py Show resolved Hide resolved

Update pymc3/util.py as per twiecki

9d3e0ce

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>

Stephen Hogg added 2 commits November 26, 2020 09:12

Merge remote-tracking branch 'pymcdevs/master' into 4116

a3cba36

Merge branch '4116' of https://github.com/StephenHogg/pymc3 into 4116

5d14410

ricardoV94 mentioned this pull request Nov 26, 2020

Improve model debugging #4205

Closed

Stephen Hogg added 3 commits November 27, 2020 08:53

fix test_examples.py

c9c40ff

fix test_step.py

2ec02a3

remove unnecessary import

ddf9fc3

twiecki merged commit 22c079c into pymc-devs:master Nov 27, 2020

StephenHogg mentioned this pull request Nov 27, 2020

Test model logp before starting any MCMC chains #4116

Closed

eigenfoo mentioned this pull request Nov 27, 2020

Remove unreachable lines in sampling code; test model logp before sampling eigenfoo/littlemcmc#101

Open

Spaak mentioned this pull request Nov 30, 2020

Silent integer overflow in test_values #4279

Closed

ricardoV94 added a commit to ricardoV94/pymc that referenced this pull request Dec 2, 2020

- Fix regression caused by pymc-devs#4211

8835104

ricardoV94 mentioned this pull request Dec 2, 2020

- Fix regression caused by #4211 #4285

Merged

eigenfoo mentioned this pull request Dec 6, 2020

Fix regression caused by pymc-devs/pymc3#4285 eigenfoo/littlemcmc#104

Open

michaelosthege mentioned this pull request Dec 7, 2020

bumping version to 3.10.0 #4309

Merged

Test model logp before starting any MCMC chains #4211

Test model logp before starting any MCMC chains #4211

Conversation

StephenHogg commented Nov 9, 2020

michaelosthege commented Nov 9, 2020

MarcoGorelli commented Nov 9, 2020 • edited Loading

StephenHogg commented Nov 10, 2020

michaelosthege Nov 10, 2020

Choose a reason for hiding this comment

StephenHogg commented Nov 10, 2020 via email

michaelosthege commented Nov 10, 2020

StephenHogg commented Nov 10, 2020

ColCarroll left a comment

Choose a reason for hiding this comment

ColCarroll commented Nov 10, 2020

StephenHogg commented Nov 10, 2020

michaelosthege commented Nov 11, 2020

StephenHogg commented Nov 11, 2020

michaelosthege commented Nov 11, 2020

ColCarroll commented Nov 11, 2020

StephenHogg commented Nov 11, 2020

michaelosthege commented Nov 11, 2020

StephenHogg commented Nov 14, 2020

StephenHogg commented Nov 14, 2020 • edited Loading

StephenHogg commented Nov 14, 2020

ColCarroll left a comment

Choose a reason for hiding this comment

StephenHogg commented Nov 15, 2020 • edited Loading

StephenHogg commented Nov 17, 2020 • edited Loading

ColCarroll commented Nov 17, 2020

codecov bot commented Nov 19, 2020 • edited Loading

Codecov Report

twiecki commented Nov 25, 2020

StephenHogg commented Nov 25, 2020

MarcoGorelli commented Nov 25, 2020 • edited Loading

StephenHogg commented Nov 25, 2020

ColCarroll commented Nov 26, 2020

StephenHogg commented Nov 26, 2020

MarcoGorelli commented Nov 27, 2020

StephenHogg commented Nov 27, 2020

StephenHogg commented Nov 27, 2020

twiecki commented Nov 27, 2020

StephenHogg commented Nov 27, 2020

ColCarroll commented Nov 28, 2020

MarcoGorelli commented Nov 9, 2020 •

edited

Loading

StephenHogg commented Nov 14, 2020 •

edited

Loading

StephenHogg commented Nov 15, 2020 •

edited

Loading

StephenHogg commented Nov 17, 2020 •

edited

Loading

codecov bot commented Nov 19, 2020 •

edited

Loading

MarcoGorelli commented Nov 25, 2020 •

edited

Loading