scipy.stats.gaussian_kde contour plots #50

williamjameshandley · 2019-08-10T22:11:41Z

Description

This is a relatively large addition, offering an alternative kde to fastkde using scipy.stats.gaussian_kde
fastkde is therefore now an optional requirement, with CI updated accordingly).
It is a necessary step in order to implement boundary corrections Boundary corrections #35

The main innovation here is to extensively use matplotlib's triangulation functionality to first dynamically bin, with bins defined by a random subsample. This means that you put bins where they are needed, and typically only 1000 points are required. Triangulation is also used to plot the contours, so once again only 1000 samples are required. This means that despite the fact that gaussian_kde is slower than fastkde, it is still snappy enough to be used dynamically for large plots.

The final result is reasonable:

Although it clearly needs the boundary data correction to remove the edge effects.

One additional change that is mixed in here is to revert ax.scatter to ax.plot in scatter_plot_2d, since I have observed scatter to make axis limits misbehave in non-trivial ways when applied to practical examples where data have very different scales. This has been documented and will be fixed in matplotlib v3.2, but until that is merged, we should revert to ax.plot. Whilst ax.scatter is more consistent with the name of the function, moving away from that actually neatens up a lot of the colouring issues that we have seen before (#32 #21 #45 #19 #31)

Fixes #4
Replaces #25

Checklist:

I have performed a self-review of my own code
My code is PEP8 compliant (flake8 anesthetic tests)
My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
New and existing unit tests pass locally with my changes (python -m pytest)
I have added tests that prove my fix is effective or that my feature works

codecov · 2019-08-10T22:44:22Z

Codecov Report

Merging #50 into master will decrease coverage by 6.06%.
The diff coverage is 87.5%.

@@            Coverage Diff            @@
##           master     #50      +/-   ##
=========================================
- Coverage   92.67%   86.6%   -6.07%     
=========================================
  Files          14      14              
  Lines        1078    1187     +109     
=========================================
+ Hits          999    1028      +29     
- Misses         79     159      +80

Impacted Files	Coverage Δ
anesthetic/fastkde.py	`100% <100%> (ø)`
anesthetic/samples.py	`94.7% <77.77%> (-2.62%)`	⬇️
anesthetic/plot.py	`86.07% <86.15%> (-5.5%)`	⬇️
anesthetic/utils.py	`93.52% <90.38%> (-1.39%)`	⬇️
anesthetic/read/montepythonreader.py	`38.15% <0%> (-61.85%)`	⬇️
anesthetic/read/samplereader.py	`75% <0%> (-4.17%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd7bcec...ad70036. Read the comment docs.

codecov · 2019-08-10T23:14:47Z

Codecov Report

Merging #50 into master will increase coverage by 0.05%.
The diff coverage is 94.36%.

@@            Coverage Diff            @@
##           master     #50      +/-   ##
=========================================
+ Coverage   92.65%   92.7%   +0.05%     
=========================================
  Files          14      14              
  Lines        1075    1192     +117     
=========================================
+ Hits          996    1105     +109     
- Misses         79      87       +8

Impacted Files	Coverage Δ
anesthetic/samples.py	`97.39% <100%> (+0.06%)`	⬆️
anesthetic/kde.py	`100% <100%> (ø)`	⬆️
anesthetic/utils.py	`93.71% <91.22%> (-1.21%)`	⬇️
anesthetic/plot.py	`92.01% <95.38%> (+0.53%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 64a4bd8...9493acb. Read the comment docs.

lukashergt

Renaming kde.py to fastkde.py causes an import error even when fastkde is installed because of overloading the term fastkde.

williamjameshandley · 2019-08-11T15:47:27Z

Renaming kde.py to fastkde.py causes an import error even when fastkde is installed because of overloading the term fastkde.

Well spotted. I think I should have fixed this now. There weren't actually that many places that anesthetic.fastkde was being imported, so it's a relatively easy change.

anesthetic/plot.py

… into scipy_kde

lukashergt · 2019-08-11T16:30:30Z

anesthetic/utils.py

+    if sum(w != 0) < n:
+        i = numpy.arange(len(w))
+    else:
+        i = numpy.random.choice(len(w), size=n, replace=False, p=w/w.sum())


Should replace really be False here?
How does that affect the weighting?

Excellent question. It all depends on what you do with the weights after. For concreteness, consider this code:

x = numpy.array([1,2,3]) w = numpy.array([1e10,1,1]) print(numpy.random.choice(x, size=2, replace=False, p = w/w.sum()))

You are right that replace=False would definitely be wrong if you were using it to 'de-weight' a set of samples (as in the MCMCSamples.compress) function). However, if you keep the weights, then this acts as an alternative form of compression . You can actually think of the strategy performed in MCMCSamples.compress as doing this, but with the optimal size chosen by the channel capacity.

In actual fact, since we are only using this to bin the weights into ~O(1000) samples, it doesn't even matter that i is chosen as a compression strategy. For example, if you wanted to ensure better coverage in the tails, the p in the above line need not be proportional to the weights. Points chosen in the tails would aggregate to have very low weight, but that may be what you want in practice if you wanted to go beyond two-sigma contours.

anesthetic/utils.py

anesthetic/samples.py

anesthetic/plot.py

anesthetic/utils.py

williamjameshandley added 3 commits August 10, 2019 22:49

Implemented gaussian_kde as a tricontour plot

dbf55f6

Made fastkde optional

bbf5230

Typo correction

8ee6085

williamjameshandley mentioned this pull request Aug 10, 2019

Preliminary tricontour functionality #25

Closed

5 tasks

williamjameshandley requested a review from lukashergt August 10, 2019 22:17

williamjameshandley self-assigned this Aug 10, 2019

williamjameshandley added this to the 1.2 milestone Aug 10, 2019

williamjameshandley added the enhancement New feature or request label Aug 10, 2019

williamjameshandley added 3 commits August 10, 2019 23:21

Added fastkde to optional dependencies in setup.py

e90068e

Updated tests for minimal requirements

25c30c0

Merge branch 'master' into scipy_kde

ad70036

Removed incompatible dot product for python 2

308bb62

handley-lab deleted a comment from codecov bot Aug 10, 2019

Further python2 corrections

c82b03e

lukashergt requested changes Aug 11, 2019

View reviewed changes

Renamed faskde.py to kde.py and adjusted imports

29ada85

update error message to include fastkde option

e3e4d93

lukashergt previously approved these changes Aug 11, 2019

View reviewed changes

Merge branch 'master' into scipy_kde

a0c8daf

lukashergt reviewed Aug 11, 2019

View reviewed changes

anesthetic/plot.py Outdated Show resolved Hide resolved

williamjameshandley added 2 commits August 11, 2019 17:14

Changed n to ncompress

2510471

Merge branch 'scipy_kde' of github.com:williamjameshandley/anesthetic…

96582a5

… into scipy_kde

williamjameshandley dismissed lukashergt’s stale review via 96582a5 August 11, 2019 16:15

lukashergt reviewed Aug 11, 2019

View reviewed changes

anesthetic/utils.py Outdated Show resolved Hide resolved

lukashergt reviewed Aug 11, 2019

View reviewed changes

anesthetic/samples.py Outdated Show resolved Hide resolved

Added ncompress as a kwarg argument to plots

186427a

williamjameshandley added 2 commits August 12, 2019 20:15

Converted to zscoring

2999ef1

Changed to log s

63ef149

lukashergt reviewed Aug 12, 2019

View reviewed changes

anesthetic/plot.py Outdated Show resolved Hide resolved

Changed ncompress default behaviour of gaussian kde functions

9493acb

lukashergt reviewed Aug 13, 2019

View reviewed changes

anesthetic/utils.py Show resolved Hide resolved

lukashergt approved these changes Aug 13, 2019

View reviewed changes

williamjameshandley merged commit a385fcc into master Aug 13, 2019

williamjameshandley deleted the scipy_kde branch August 13, 2019 13:45

williamjameshandley mentioned this pull request Dec 14, 2019

Removed fastkde from requirements.txt #66

Merged

5 tasks

lukashergt mentioned this pull request Jan 17, 2023

ax.scatter or ax.plot in scatter_plot_2d? #252

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

scipy.stats.gaussian_kde contour plots #50

scipy.stats.gaussian_kde contour plots #50

williamjameshandley commented Aug 10, 2019 •

edited

Loading

codecov bot commented Aug 10, 2019

codecov bot commented Aug 10, 2019 •

edited

Loading

lukashergt left a comment

williamjameshandley commented Aug 11, 2019

lukashergt Aug 11, 2019

williamjameshandley Aug 12, 2019

scipy.stats.gaussian_kde contour plots #50

scipy.stats.gaussian_kde contour plots #50

Conversation

williamjameshandley commented Aug 10, 2019 • edited Loading

Description

Checklist:

codecov bot commented Aug 10, 2019

Codecov Report

codecov bot commented Aug 10, 2019 • edited Loading

Codecov Report

lukashergt left a comment

Choose a reason for hiding this comment

williamjameshandley commented Aug 11, 2019

lukashergt Aug 11, 2019

Choose a reason for hiding this comment

williamjameshandley Aug 12, 2019

Choose a reason for hiding this comment

williamjameshandley commented Aug 10, 2019 •

edited

Loading

codecov bot commented Aug 10, 2019 •

edited

Loading