Add more info to divergence warnings #3990

aseyboldt · 2020-07-01T18:15:01Z

This is a patch to store some additional information in the internal warnings about divergences. I used this a couple of times for debugging, and I think it might be of some use for other people as well.
Before, we only stored where the step that diverged started. This patch also adds what the energy, logp, gradient and velocity were where it started, and where it ended up.
I also converted the report.SamplerWarning to a dataclass to make it easier to extend. This requires >=py3.7.

This makes it possible to make interesting plots of where divergences happend:

import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np

with pm.Model() as model:
    sd = pm.HalfNormal('sd')
    pm.Normal('x', sd=sd)
    
    trace = pm.sample(Emax=1000)

x = 'sd_log__'
y = 'x'
plt.scatter(trace[x], trace[y], marker='.')
for warn in trace.report._warnings:
    if warn.divergence_point_dest is not None:
        source = warn.divergence_point_source
        dest = warn.divergence_point_dest
        source = np.array([source[x], source[y]])
        dest = np.array([dest[x], dest[y]])
        plt.arrow(*source, *(dest - source) / 100)

(Maybe Emax=1000 is a bit high as a default by the way?)

AlexAndorra · 2020-07-01T19:25:34Z

Loving this @aseyboldt !! Please ping me when it's ready for review 😉

aseyboldt · 2020-07-02T18:06:15Z

@AlexAndorra @ColCarroll This is ready for review now.
I included an extra dependency for dataclasses so that tests pass on py36. We can get rid of that when we drop py36.
I didn't include it in the release notes yet, because I think we need to figure out if/how we want to use this for things except debugging.
I think it would be nice if arviz could make use of this to show more precise info about divergences at some point.

AlexAndorra

Thanks @aseyboldt, LGTM! I don't know much about dataclasses but that looks neat.
I think this should definitely be used by ArviZ down the road -- probably in plot_pair? (cc-ing @OriolAbril about that).
In the meantime, I think it makes sense to mention this in the release notes, so that people can use it for manual debugging, as you did in your example.
Waiting for @ColCarroll's approval to merge 😉

OriolAbril · 2020-07-03T10:26:35Z

Hey, are we planning on following NEP 29? If so there is no need to keep the changes compatible with 3.6 as next release should drop 3.6 support https://twitter.com/Mbussonn/status/1272667933332758528?s=20

Plot looks great, thanks @aseyboldt, is this covered on the divergences notebook? Maybe it would be useful to have a comparison and guidance on where this info can have an edge wrt "regular" divergences

AlexAndorra · 2020-07-03T10:57:34Z

Adrian did write a PR yesterday to drop py36, but we don't know yet when we'll merge it (just added that point to today's lab meeting 😉 ). Let's wait for that decision then before merging.

is this covered on the divergences notebook? Maybe it would be useful to have a comparison and guidance on where this info can have an edge wrt "regular" divergences

Are you talking about the "Diagnosing divergences" NB? @rpgoldman is updating it, so maybe Adrian should wait until this PR is merged to add the example? Very good idea though: would be super helpful to show a comparison and explain why / when this plot would be useful 👌

aseyboldt · 2020-07-03T17:11:25Z

I wouldn't know what to write in the notebook for now to tell you the truth. I just don't know if, why or when this plot would be useful.
The reason I would still like to have this info in the warnings isn't so much that it allows us to make that plot, but because it has turned out to be useful for debugging the bug where the step size was nan, and also a couple of issues in sunode. I think it might turn out to be interesting for diagnosing divergences in addition to that, but I don't know.

rpgoldman · 2020-07-03T17:31:21Z

@aseyboldt I'd be happy to discuss this with you and see about getting it into the revised notebook.

AlexAndorra · 2020-07-03T18:08:07Z

Makes sense @aseyboldt. We decided to drop py36 in the lab meeting, so we can do that:

I'll merge your PR on py36
You can rebase this branch and remove the dataclasses requirement from this PR
I'll merge once everything is green
Sounds good?

AlexAndorra

LGTM, thanks Adrian!

AlexAndorra · 2020-07-06T15:05:46Z

@aseyboldt, since we're finally not dropping py36 (#4003), does this need an extra dependency for dataclasses so that tests pass on py36?
If yes, I think I can revert the merge here, so that you can continue working from this branch (essentially reverting your last commit)? Sorry for all the back and forth...

junpenglao · 2020-07-06T15:19:20Z

@AlexAndorra I already added the new dependency in the PR.

AlexAndorra · 2020-07-06T15:21:09Z

Ah true, didn't see that, thanks @junpenglao !

* Update GP NBs to use standard notebook style (pymc-devs#3978) * update gp-latent nb to use arviz * rerun, run black * rerun after fixes from comments * rerun black * rewrite radon notebook using ArviZ and xarray (pymc-devs#3963) * rewrite radon notebook using ArviZ and xarray Roughly half notebook has been updated * add comments on xarray usage * rewrite 2n half of notebook * minor fix * rerun notebook and minor changes * rerun notebook on pymc3.9.2 and ArviZ 0.9.0 * remove unused import * add change to release notes * SMC: refactor, speed-up and run multiple chains in parallel for diagnostics (pymc-devs#3981) * first attempt to vectorize smc kernel * add ess, remove multiprocessing * run multiple chains * remove unused imports * add more info to report * minor fix * test log * fix type_num error * remove unused imports update BF notebook * update notebook with diagnostics * update notebooks * update notebook * update notebook * Honor discard_tuned_samples during KeyboardInterrupt (pymc-devs#3785) * Honor discard_tuned_samples during KeyboardInterrupt * Do not compute convergence checks without samples * Add time values as sampler stats for NUTS (pymc-devs#3986) * Add time values as sampler stats for NUTS * Use float time counters for nuts stats * Add timing sampler stats to release notes * Improve doc of time related sampler stats Co-authored-by: Alexandre ANDORRA <andorra.alexandre@gmail.com> Co-authored-by: Alexandre ANDORRA <andorra.alexandre@gmail.com> * Drop support for py3.6 (pymc-devs#3992) * Drop support for py3.6 * Update RELEASE-NOTES.md Co-authored-by: Colin <ColCarroll@users.noreply.github.com> Co-authored-by: Colin <ColCarroll@users.noreply.github.com> * Fix Mixture distribution mode computation and logp dimensions Closes pymc-devs#3994. * Add more info to divergence warnings (pymc-devs#3990) * Add more info to divergence warnings * Add dataclasses as requirement for py3.6 * Fix tests for extra divergence info * Remove py3.6 requirements * follow-up of py36 drop (pymc-devs#3998) * Revert "Drop support for py3.6 (pymc-devs#3992)" This reverts commit 1bf867e. * Update README.rst * Update setup.py * Update requirements.txt * Update requirements.txt Co-authored-by: Adrian Seyboldt <aseyboldt@users.noreply.github.com> * Show pickling issues in notebook on windows (pymc-devs#3991) * Merge close remote connection * Manually pickle step method in multiprocess sampling * Fix tests for extra divergence info * Add test for remote process crash * Better formatting in test_parallel_sampling Co-authored-by: Junpeng Lao <junpenglao@gmail.com> * Use mp_ctx forkserver on MacOS * Add test for pickle with dill Co-authored-by: Junpeng Lao <junpenglao@gmail.com> * Fix keep_size for arviz structures. (pymc-devs#4006) * Fix posterior pred. sampling keep_size w/ arviz input. Previously posterior predictive sampling functions did not properly handle the `keep_size` keyword argument when getting an xarray Dataset as parameter. Also extended these functions to accept InferenceData object as input. * Reformatting. * Check type errors. Make errors consistent across sample_posterior_predictive and fast_sample_posterior_predictive, and add 2 tests. * Add changelog entry. Co-authored-by: Robert P. Goldman <rpgoldman@sift.net> * SMC-ABC add distance, refactor and update notebook (pymc-devs#3996) * update notebook * move dist functions out of simulator class * fix docstring * add warning and test for automatic selection of sort sum_stat when using wassertein and energy distances * update release notes * fix typo * add sim_data test * update and add tests * update and add tests * add docs for interpretation of length scales in periodic kernel (pymc-devs#3989) * fix the expression of periodic kernel * revert change and add doc * FIXUP: add suggested doc string * FIXUP: revertchanges in .gitignore * Fix Matplotlib type error for tests (pymc-devs#4023) * Fix for issue 4022. Check for support for `warn` argument in `matplotlib.use()` call. Drop it if it causes an error. * Alternative fix. * Switch from pm.DensityDist to pm.Potential to describe the likelihood in MLDA notebooks and script examples. This is done because of the bug described in arviz-devs/arviz#1279. The commit also changes a few parameters in the MLDA .py example to match the ones in the equivalent notebook. * Remove Dirichlet distribution type restrictions (pymc-devs#4000) * Remove Dirichlet distribution type restrictions Closes pymc-devs#3999. * Add missing Dirichlet shape parameters to tests * Remove Dirichlet positive concentration parameter constructor tests This test can't be performed in the constructor if we're allowing Theano-type distribution parameters. * Add a hack to statically infer Dirichlet argument shapes Co-authored-by: Brandon T. Willard <brandonwillard@users.noreply.github.com> Co-authored-by: Bill Engels <w.j.engels@gmail.com> Co-authored-by: Oriol Abril-Pla <oriol.abril.pla@gmail.com> Co-authored-by: Osvaldo Martin <aloctavodia@gmail.com> Co-authored-by: Adrian Seyboldt <aseyboldt@users.noreply.github.com> Co-authored-by: Alexandre ANDORRA <andorra.alexandre@gmail.com> Co-authored-by: Colin <ColCarroll@users.noreply.github.com> Co-authored-by: Brandon T. Willard <brandonwillard@users.noreply.github.com> Co-authored-by: Junpeng Lao <junpenglao@gmail.com> Co-authored-by: rpgoldman <rpgoldman@goldman-tribe.org> Co-authored-by: Robert P. Goldman <rpgoldman@sift.net> Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com> Co-authored-by: Brandon T. Willard <971601+brandonwillard@users.noreply.github.com>

aseyboldt force-pushed the more-div-info branch from dbc2196 to 8fa8318 Compare July 1, 2020 18:17

aseyboldt added WIP enhancements labels Jul 1, 2020

aseyboldt added 2 commits July 2, 2020 17:20

Add more info to divergence warnings

dcee437

Add dataclasses as requirement for py3.6

92e0dd2

aseyboldt force-pushed the more-div-info branch from 8fa8318 to 92e0dd2 Compare July 2, 2020 15:22

Fix tests for extra divergence info

d3a5b92

aseyboldt marked this pull request as ready for review July 2, 2020 18:00

aseyboldt removed the WIP label Jul 3, 2020

AlexAndorra approved these changes Jul 3, 2020

View reviewed changes

AlexAndorra added the don't merge label Jul 3, 2020

Remove py3.6 requirements

17f4ec8

AlexAndorra approved these changes Jul 5, 2020

View reviewed changes

AlexAndorra merged commit a34f63a into pymc-devs:master Jul 5, 2020

aseyboldt deleted the more-div-info branch July 5, 2020 11:02

AlexAndorra removed the don't merge label Jul 6, 2020

eigenfoo mentioned this pull request Jul 6, 2020

Add more details to divergence warnings eigenfoo/littlemcmc#77

Open

kyleabeauchamp added this to the 3.9.3 milestone Jul 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add more info to divergence warnings #3990

Add more info to divergence warnings #3990

aseyboldt commented Jul 1, 2020

AlexAndorra commented Jul 1, 2020

aseyboldt commented Jul 2, 2020

AlexAndorra left a comment •

edited

Loading

OriolAbril commented Jul 3, 2020

AlexAndorra commented Jul 3, 2020

aseyboldt commented Jul 3, 2020

rpgoldman commented Jul 3, 2020

AlexAndorra commented Jul 3, 2020

AlexAndorra left a comment

AlexAndorra commented Jul 6, 2020

junpenglao commented Jul 6, 2020

AlexAndorra commented Jul 6, 2020

Add more info to divergence warnings #3990

Add more info to divergence warnings #3990

Conversation

aseyboldt commented Jul 1, 2020

AlexAndorra commented Jul 1, 2020

aseyboldt commented Jul 2, 2020

AlexAndorra left a comment • edited Loading

Choose a reason for hiding this comment

OriolAbril commented Jul 3, 2020

AlexAndorra commented Jul 3, 2020

aseyboldt commented Jul 3, 2020

rpgoldman commented Jul 3, 2020

AlexAndorra commented Jul 3, 2020

AlexAndorra left a comment

Choose a reason for hiding this comment

AlexAndorra commented Jul 6, 2020

junpenglao commented Jul 6, 2020

AlexAndorra commented Jul 6, 2020

AlexAndorra left a comment •

edited

Loading