Fixed Posterior plot errors with boolean array. #1707

utkarsh-maheshwari · 2021-05-27T10:07:51Z

Description

Plotted bar instead of hist for a boolean value.
Also, should HDI also be visible? I have assumed no.

Checklist

Follows official PR format
Includes a sample plot to visually illustrate the changes (only for plot-related functions)
New features are properly documented (with an example if appropriate)?
Includes new or updated tests to cover the new feature
Code style correct (follows pylint and black guidelines)
Changes are listed in changelog

OriolAbril

I think we could use if dtype kind == "f" -> elif dtype kind == "i" else (dtype == "b") instead of converting to int and storing is_bool. We may also want to update a bit more the apperance of the plot? @sethaxen? i.e. showing the y axis in the boolean case, or hiding the mean and annotating both bars with count (xx%) on top of them, things like that

utkarsh-maheshwari · 2021-05-27T16:51:36Z

Yes, it looks incomplete without the y axis. Don't we need it in the case of integers?

Achieved this.

through this:

import numpy as np
import arviz as az

data = np.random.choice(a=[False, True], size=(4, 100))
az.plot_posterior(data)

utkarsh-maheshwari · 2021-05-27T18:02:32Z

@OriolAbril Can the dtype of values be other than f, i or b?

sethaxen · 2021-05-27T20:48:30Z

@OriolAbril Can the dtype of values be other than f, i or b?

That's a good question. In principle a PPL could produces samples of strings or objects, though all I am aware of currently would require first assigning integers to the strings/objects and then produce integer samplers. Do we anywhere in arviz assert that the dtype is amount these types?

We may also want to update a bit more the apperance of the plot? @sethaxen? i.e. showing the y axis in the boolean case, or hiding the mean and annotating both bars with count (xx%) on top of them, things like that

Yes, I think it makes sense to label the bars with count and percent instead mean, and if this is done, an axis is not necessary.
I like the gap between the bins.

ahartikainen · 2021-05-28T05:39:21Z

True/False is an edge case from category dtype with 2 categories.

If we have a flag for category, then we could do this kind of (bar) plot for multiple categories too.

utkarsh-maheshwari · 2021-05-28T10:26:22Z

If we have a flag for category, then we could do this kind of (bar) plot for multiple categories too.

So, does that mean
if dtype == " f " -> kde
elif dtype == " i " -> hist
else -> bar

OriolAbril · 2021-06-01T18:13:57Z

So, does that mean
if dtype == " f " -> kde
elif dtype == " i " -> hist
else -> bar

Sorry about the delay, I think this is a good starting point, we can then improve/refactor the else case to handle the category case too, but before that we have to decide on an api, have this info in inferencedata or as an argument or both which will probably take some time.

codecov · 2021-06-02T11:20:04Z

Codecov Report

Merging #1707 (f467530) into main (479f2a7) will decrease coverage by 11.25%.
The diff coverage is 100.00%.

❗ Current head f467530 differs from pull request most recent head b799812. Consider uploading reports for the commit b799812 to get more accurate results

@@             Coverage Diff             @@
##             main    #1707       +/-   ##
===========================================
- Coverage   90.81%   79.55%   -11.26%     
===========================================
  Files         114      114               
  Lines       12332    12348       +16     
===========================================
- Hits        11199     9824     -1375     
- Misses       1133     2524     +1391

Impacted Files	Coverage Δ
arviz/plots/backends/bokeh/posteriorplot.py	`98.34% <100.00%> (+0.14%)`	⬆️
arviz/plots/backends/matplotlib/posteriorplot.py	`98.36% <100.00%> (+0.08%)`	⬆️
arviz/data/io_emcee.py	`12.39% <0.00%> (-85.96%)`	⬇️
arviz/data/io_pystan.py	`14.12% <0.00%> (-81.98%)`	⬇️
arviz/data/io_cmdstanpy.py	`15.62% <0.00%> (-78.13%)`	⬇️
arviz/data/io_pymc3_3x.py	`18.18% <0.00%> (-72.06%)`	⬇️
arviz/data/io_pyro.py	`27.81% <0.00%> (-68.43%)`	⬇️
arviz/data/io_numpyro.py	`26.71% <0.00%> (-67.18%)`	⬇️
arviz/data/io_cmdstan.py	`48.48% <0.00%> (-41.13%)`	⬇️
arviz/data/io_pymc3.py	`66.66% <0.00%> (-6.67%)`	⬇️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 479f2a7...b799812. Read the comment docs.

utkarsh-maheshwari · 2021-06-02T16:45:25Z

Added lines #L332 - L336 were not covered by tests
How to resolve this error?

ahartikainen · 2021-06-03T05:36:10Z

We need to add test that has boolean data. (check current tests and see if you can add boolean data as input for the test for this plot)

utkarsh-maheshwari · 2021-06-04T11:14:09Z

I think I need to create a function test_plot_posterior_boolean below

arviz/arviz/tests/base_tests/test_plots_matplotlib.py

Line 946 in 4d9caca

def test_plot_posterior(models, kwargs):

.
to test with a boolean data. Right?

ahartikainen · 2021-06-04T12:58:04Z

yes

utkarsh-maheshwari · 2021-06-04T18:18:44Z

I think now there are some errors not related to my code changes.

arviz/plots/backends/matplotlib/posteriorplot.py

arviz/tests/base_tests/test_plots_matplotlib.py

OriolAbril · 2021-06-12T15:02:25Z

arviz/plots/backends/matplotlib/posteriorplot.py

@@ -319,18 +319,23 @@ def format_axes():
            rug=False,
            show=False,
        )
-    else:
+    elif values.dtype.kind == "i":


the initial condition is if kind=kde and dtype=f so this condition is too restrictive, I think it should be if dtype=i or (dtype=f and kind=hist)

Yea right. I should have done it that way. Thanks for pointing.

test exists because we all do this which is also why writing tests is important everywhere

Done.
One more observation. Now, that there is an added kind of plot bar for boolean values. So maybe we should set
kind = 'bar' somewhere and should accept kind = 'bar'.

utkarsh-maheshwari · 2021-06-12T17:43:37Z

I have no idea why tests are failing. It works fine when I write the same test in notebook.
Is there a better way to debug tests?

OriolAbril · 2021-06-12T21:25:05Z

Is there a better way to debug tests?

Were tests passing locally? Whenever I get stuck on one or a couple tests I run them locally, and only those so I can iterate fast. pytest has a lot of features to select which subset of tests to run: https://docs.pytest.org/en/6.2.x/reference.html#command-line-flags. To begin with you have to call pytest dir_or_file where dir_or_file can be a path to a folder, a path to a file or a glob pattern, you can also use special syntax after that with colons to select which class/test function to run but I find that very complicated and always use the -k flag to filter tests that fulfill a condition, in your case it could be something like -k "posterior and boolean" to run only the specific test you added

ahartikainen · 2021-06-14T07:55:16Z

arviz/tests/base_tests/test_plots_matplotlib.py

+def test_plot_posterior_boolean():
+    data = np.random.choice(a=[False, True], size=(4, 100))
+    axes = plot_posterior(data)
+    plt.draw()


Suggested change

plt.draw()

I think this is not needed for tests (or is it?)

Answer to this question may look weird at first but
On notebook, assert axes.get_xticklabels() runs fine when run in next cell but returns empty list when run on the same cell containing plt.plot().
Found a solution here

(My earlier comment got duplicated. When I deleted one, both got deleted 🤦🏼 )

Ok, then this can be removed (we don't run tests in notebooks)

Tests were failing when I removed this because then get_xticklabels() returns empty list.

Interactive users when want to run axes.get_xticklabels() can do this by running it in different cell

But non interactive users have to call plt.draw() before using axes.get_xticklabels()

More clear explanation :

in interactive mode in ipython, a draw() is automatically triggered after each function that has a plotting action. In non-interactive environments, the draw action is executed only at the end--this saves a lot of run time.

But maybe, we can call plt.draw() inside our posterior plot function and remove it from here.

Yes, tests are really run from command line

python -m pytest path/to/tests

So keeping them clean is ok.

utkarsh-maheshwari · 2021-06-14T16:37:49Z

@OriolAbril This is much better approach. Thanks!

arviz/plots/backends/matplotlib/posteriorplot.py

utkarsh-maheshwari · 2021-07-27T18:48:30Z

Unable to understand why a codecov shows a lot of code becomes uncovered through these changes. It shows whole matplotlib posterior plot function is becomes uncovered!

OriolAbril · 2021-07-27T22:47:29Z

Unable to understand why a codecov shows a lot of code becomes uncovered through these changes. It shows whole matplotlib posterior plot function is becomes uncovered!

If the azure job fails, nothing is uploaded to codecov, thus if all jobs fail, the coverage is 0

utkarsh-maheshwari · 2021-07-29T19:53:40Z

arviz/tests/base_tests/test_plots_matplotlib.py

+    data = np.random.choice(a=[False, True], size=(4, 100))
+    axes = plot_posterior(data)
+    assert axes
+    assert axes.get_xticklabels()[0].get_text() != 0


I know this is not the best way to see if xticklabels are correct. It should be like assert axes.get_xticklabels()[0].get_text() == 'False'.
When I tried it locally, tests work well only if I provide show=True to the plot function call but shows an assertion error when show=False (default).
AssertionError: assert ' ' == 'False'
Azure jobs fails with same error even if show=True is pushed.

maybe it could be solved using some non-interactive/directly writing to file backend? I don't reallly know what is happening, but it would not be surprising that Azure had some extra checks preventing guis, pop ups or other interactive objects

Already tried using plt.draw() in the matplotlib backend file, didn't work! I guess we need to call plt.draw() in the test function before assert or maybe leave it like that if its an "okayish" way to test if not good.

ahartikainen · 2022-03-23T10:55:37Z

Thanks for the fix

OriolAbril reviewed May 27, 2021

View reviewed changes

utkarsh-maheshwari changed the title ~~Fixed Posterior plot errors with boolean array. Addresses issue #1694~~ Fixed Posterior plot errors with boolean array. May 29, 2021

utkarsh-maheshwari force-pushed the posterior branch 5 times, most recently from fda2fae to 99aa489 Compare June 2, 2021 10:52

utkarsh-maheshwari force-pushed the posterior branch 2 times, most recently from 94dcae8 to 03bbf34 Compare June 4, 2021 16:31

utkarsh-maheshwari force-pushed the posterior branch from 03bbf34 to ac335d2 Compare June 11, 2021 18:41

OriolAbril reviewed Jun 11, 2021

View reviewed changes

arviz/plots/backends/matplotlib/posteriorplot.py Show resolved Hide resolved

arviz/tests/base_tests/test_plots_matplotlib.py Show resolved Hide resolved

utkarsh-maheshwari force-pushed the posterior branch from c465126 to ab02a78 Compare June 11, 2021 21:05

OriolAbril reviewed Jun 12, 2021

View reviewed changes

utkarsh-maheshwari force-pushed the posterior branch from ab02a78 to ed6f1c1 Compare June 12, 2021 16:50

ahartikainen approved these changes Jun 14, 2021

View reviewed changes

ahartikainen requested changes Jun 14, 2021

View reviewed changes

utkarsh-maheshwari commented Jun 14, 2021

View reviewed changes

arviz/plots/backends/matplotlib/posteriorplot.py Show resolved Hide resolved

utkarsh-maheshwari force-pushed the posterior branch from c12fefa to 5db0960 Compare July 3, 2021 16:49

Fixed Posterior plot errors with boolean array for matplotlib

1db9b3c

utkarsh-maheshwari force-pushed the posterior branch from 5db0960 to 1db9b3c Compare July 27, 2021 17:45

remove plt.draw()

bf9141c

utkarsh-maheshwari force-pushed the posterior branch from 5767d59 to bf9141c Compare July 28, 2021 17:31

utkarsh-maheshwari commented Jul 29, 2021

View reviewed changes

ahartikainen added 6 commits March 23, 2022 10:00

Update test

4396f78

Add a bad type test

6fe402e

Fix whitespace

6791df7

Update bokeh code

e8d8e49

Update test_plots_bokeh.py

e9ab685

Update CHANGELOG.md

b75bb3f

ahartikainen approved these changes Mar 23, 2022

View reviewed changes

ahartikainen added 3 commits March 23, 2022 10:35

Merge branch 'main' into posterior

65521ec

Update CHANGELOG.md

7afcde1

Fix lint

b799812

ahartikainen merged commit aeb25cb into arviz-devs:main Mar 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixed Posterior plot errors with boolean array. #1707

Fixed Posterior plot errors with boolean array. #1707

utkarsh-maheshwari commented May 27, 2021 •

edited

Loading

OriolAbril left a comment

utkarsh-maheshwari commented May 27, 2021

utkarsh-maheshwari commented May 27, 2021

sethaxen commented May 27, 2021

ahartikainen commented May 28, 2021

utkarsh-maheshwari commented May 28, 2021

OriolAbril commented Jun 1, 2021

codecov bot commented Jun 2, 2021 •

edited

Loading

utkarsh-maheshwari commented Jun 2, 2021

ahartikainen commented Jun 3, 2021

utkarsh-maheshwari commented Jun 4, 2021

ahartikainen commented Jun 4, 2021

utkarsh-maheshwari commented Jun 4, 2021

OriolAbril Jun 12, 2021

utkarsh-maheshwari Jun 12, 2021

OriolAbril Jun 12, 2021

utkarsh-maheshwari Jun 12, 2021

utkarsh-maheshwari commented Jun 12, 2021 •

edited

Loading

OriolAbril commented Jun 12, 2021

ahartikainen Jun 14, 2021

ahartikainen Jun 14, 2021

utkarsh-maheshwari Jun 14, 2021

ahartikainen Jun 14, 2021

utkarsh-maheshwari Jun 14, 2021

ahartikainen Jun 14, 2021

utkarsh-maheshwari commented Jun 14, 2021

utkarsh-maheshwari commented Jul 27, 2021

OriolAbril commented Jul 27, 2021

utkarsh-maheshwari Jul 29, 2021 •

edited

Loading

OriolAbril Jul 29, 2021

utkarsh-maheshwari Aug 10, 2021

ahartikainen commented Mar 23, 2022

Fixed Posterior plot errors with boolean array. #1707

Fixed Posterior plot errors with boolean array. #1707

Conversation

utkarsh-maheshwari commented May 27, 2021 • edited Loading

Description

Checklist

OriolAbril left a comment

Choose a reason for hiding this comment

utkarsh-maheshwari commented May 27, 2021

utkarsh-maheshwari commented May 27, 2021

sethaxen commented May 27, 2021

ahartikainen commented May 28, 2021

utkarsh-maheshwari commented May 28, 2021

OriolAbril commented Jun 1, 2021

codecov bot commented Jun 2, 2021 • edited Loading

Codecov Report

utkarsh-maheshwari commented Jun 2, 2021

ahartikainen commented Jun 3, 2021

utkarsh-maheshwari commented Jun 4, 2021

ahartikainen commented Jun 4, 2021

utkarsh-maheshwari commented Jun 4, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

utkarsh-maheshwari commented Jun 12, 2021 • edited Loading

OriolAbril commented Jun 12, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

utkarsh-maheshwari commented Jun 14, 2021

utkarsh-maheshwari commented Jul 27, 2021

OriolAbril commented Jul 27, 2021

utkarsh-maheshwari Jul 29, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ahartikainen commented Mar 23, 2022

utkarsh-maheshwari commented May 27, 2021 •

edited

Loading

codecov bot commented Jun 2, 2021 •

edited

Loading

utkarsh-maheshwari commented Jun 12, 2021 •

edited

Loading

utkarsh-maheshwari Jul 29, 2021 •

edited

Loading