Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filter feature for var_names #1154

Merged
merged 21 commits into from
Apr 27, 2020

Conversation

AlexAndorra
Copy link
Contributor

Description

  • Currently, users can select variables entering az.summary and other plotting/diagnostics functions with the var_names kwarg.
    This selection must however be exact, meaning that if you have variables with similar names in your trace, you have to specify them all: az.summary(idata, var_names=["alpha", "beta1", "beta2", "beta3"]), which can be cumbersome for big models.

  • This PR introduces the filter_like kwarg to allow users to select all variables containing the given word. For instance, az.summary(idata, var_names=["alpha", "beta"], filter_like=True) would give the same result as above. This also works to exclude variables with similar names: az.summary(idata, var_names=["~beta"], filter_like=True)

  • filter_like defaults to False to ensure backwards-compatibility.

Before adding these changes to the changelog, I had pending questions:

  1. Should we add filter_like to the other functionalities that use var_names, and not only to summary?
  2. Should we include new tests for this feature?
  • Bonus points: the new code may be too convoluted at times, so please tell me if you see a simpler, more pythonic way to do things.

I'm of course available for any changes, and thanks a lot in advances for your reviews 🖖

@codecov
Copy link

codecov bot commented Apr 20, 2020

Codecov Report

Merging #1154 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1154   +/-   ##
=======================================
  Coverage   93.11%   93.11%           
=======================================
  Files          94       94           
  Lines        9289     9302   +13     
=======================================
+ Hits         8649     8662   +13     
  Misses        640      640           
Impacted Files Coverage Δ
arviz/data/base.py 100.00% <ø> (ø)
arviz/plots/autocorrplot.py 100.00% <100.00%> (ø)
arviz/plots/essplot.py 100.00% <100.00%> (ø)
arviz/plots/forestplot.py 94.11% <100.00%> (ø)
arviz/plots/jointplot.py 97.50% <100.00%> (ø)
arviz/plots/mcseplot.py 100.00% <100.00%> (ø)
arviz/plots/pairplot.py 92.94% <100.00%> (ø)
arviz/plots/parallelplot.py 98.00% <100.00%> (ø)
arviz/plots/posteriorplot.py 89.74% <100.00%> (ø)
arviz/plots/ppcplot.py 95.29% <100.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7f46107...7cf81ae. Read the comment docs.

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! If possible, I think it would be even better to match using regular expressions (which should also work the same for most variable names).

Should we add filter_like to the other functionalities that use var_names, and not only to summary?

I would extend it to other functions, given that everything is handled by var_names, it should not be too much work.

Should we include new tests for this feature?

Definitely. There already are some tests on _var_names in test_plot_utils, they should be extended to cover new cases, and as it is already tested here, there would be no need to test the behaviour in each of the plots (maybe if some does special logic to handle var_names like plot_ppc)

arviz/stats/stats.py Outdated Show resolved Hide resolved
arviz/utils.py Outdated Show resolved Hide resolved
@AlexAndorra
Copy link
Contributor Author

Thanks for the review @OriolAbril !

  • I implemented the regex feature, extended the filter kwarg to all functions taking var_names as arg, and added tests for this new feature in test_utils.py.

  • I also updated the examples in the docstrings when useful and deleted the white space between args' names and the colon in docstrings (as discussed in another PR).

  • Here is how this works now: imagine that all_vars=["alpha", "beta1", "beta2", "p1", "p2", "phi"]. Users will now be able to:

    • Filter by partial naming: az.summary(idata, var_names=["alpha", "beta"], filter="like") --> ["alpha", "beta1", "beta2"]
    • az.summary(idata, var_names=["~beta"], filter="like") --> ["alpha", "p1", "p2", "phi"]
    • Filter by regular expressions: az.summary(idata, var_names=["^bet"], filter="regex") --> ["beta1", "beta2"]
    • az.summary(idata, var_names=["~p[0-9]+"], filter="regex") --> ["alpha", "beta1", "beta2", "phi"]

Tell me if you see other necessary changes, otherwise I hope this is ready to merge 🖖

Copy link
Contributor

@ahartikainen ahartikainen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

arviz/plots/autocorrplot.py Outdated Show resolved Hide resolved
arviz/utils.py Outdated Show resolved Hide resolved
arviz/utils.py Outdated Show resolved Hide resolved
arviz/plots/autocorrplot.py Outdated Show resolved Hide resolved
arviz/plots/essplot.py Outdated Show resolved Hide resolved
Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I hope I have not repeated myself too much with previous comments

arviz/plots/autocorrplot.py Outdated Show resolved Hide resolved
arviz/utils.py Outdated Show resolved Hide resolved
arviz/utils.py Outdated Show resolved Hide resolved
arviz/tests/base_tests/test_utils.py Outdated Show resolved Hide resolved
arviz/tests/base_tests/test_utils.py Outdated Show resolved Hide resolved
@AlexAndorra
Copy link
Contributor Author

@AlexAndorra
Copy link
Contributor Author

Renamed the filter arg to filter_vars, both to keep pylint happy and to have a more explicit API. Should be ready to merge now 🎉

@OriolAbril OriolAbril merged commit ebd5231 into arviz-devs:master Apr 27, 2020
@AlexAndorra AlexAndorra deleted the add-filter-feature branch May 13, 2020 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants