Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend unittests for some distributions. #42

Merged
merged 3 commits into from
Jul 7, 2022

Conversation

zoj613
Copy link
Member

@zoj613 zoj613 commented Jun 30, 2022

This commit fixes/completes/extends previously minimal unittests for
some of the available probability distribution samplers, which include:

  • multivariate_normal_cong2017
  • multivariate_normal_bhattacharya2016.

This is a followup of #32 and closes #33

@zoj613 zoj613 added the CI Continuous Integration label Jun 30, 2022
@zoj613 zoj613 requested review from brandonwillard and rlouf June 30, 2022 15:53
@codecov
Copy link

codecov bot commented Jun 30, 2022

Codecov Report

Merging #42 (97f907c) into main (7e888a0) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main       #42   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            4         4           
  Lines          295       288    -7     
  Branches        20        20           
=========================================
- Hits           295       288    -7     
Impacted Files Coverage Δ
aemcmc/dists.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e888a0...97f907c. Read the comment docs.

@zoj613
Copy link
Member Author

zoj613 commented Jun 30, 2022

For what its worth: the multivariate_normal_bhattacharya2016 function is a special case of multivariate_normal_cong2017 for when the omega parameter is the identity matrix, so it can be written as:

def multivariate_normal_bhattacharya2016(rng, D, phi, alpha):
    return multivariate_normal_cong2017(rng, 1 / D, at.ones(phi.shape[0]), phi, alpha)

So not sure if this is worth a refactor and that the overhead of creating the at.ones and the extra steps inside the multivariate_normal_cong2017 function wont affect performance.

Cc @rlouf @brandonwillard

@zoj613
Copy link
Member Author

zoj613 commented Jun 30, 2022

BTW @rlouf while working on this PR today I realized that the tests in #32 previously failed for two reasons:

  1. you were using passing phi as a transpose of the expected matrix in the cong2017 function. Also the phi need for the other one should be computed from the cholesky since omega is the identity (so the eigen decomposition was not going to work since it produces a diagonal matrix that isnt the identity).
  2. the true variance should have been the diagonal entries of the covariance matrix of the input data so that the samples from the scan function can be correctly compared to those entries.

Which leads me to think that the dist.py module should get another PR (or extra commits on this one) that elaborate a bit more on the documentation about the expected input and how it could be generated, as well as some typehints.

@rlouf
Copy link
Member

rlouf commented Jul 1, 2022

Thank you! I have not seen multivariate_normal_bhattacharya2016 used anywhere. If so we should remove it altogether.

For the documentation it was not always easy to find the corresponding parametrization in the references. A short explanation of the meaning of each parameter and how they relate to mean/covariance would be very useful.

Besides these two point this looks good to me.

@zoj613
Copy link
Member Author

zoj613 commented Jul 1, 2022

Thank you! I have not seen multivariate_normal_bhattacharya2016 used anywhere. If so we should remove it altogether.

For the documentation it was not always easy to find the corresponding parametrization in the references. A short explanation of the meaning of each parameter and how they relate to mean/covariance would be very useful.

Besides these two point this looks good to me.

Do you feel these two points should be a separate PR or part of this one?

@rlouf
Copy link
Member

rlouf commented Jul 1, 2022

They can be part of this one; it will take you more time to open a separate PR than make these changes.

@zoj613
Copy link
Member Author

zoj613 commented Jul 2, 2022

@rlouf i've addressed your last 2 points in the latest commit. Please take a look at this again when you can,.

@zoj613 zoj613 self-assigned this Jul 2, 2022
@rlouf
Copy link
Member

rlouf commented Jul 2, 2022

The changes look good to me, however you will need to make some changes to your commits' structure:

  1. No need for TST/DOC, etc in the commit title
  2. Can you break down you last commit in two: removing one function + adding some documentation? These are two separate changes.

zoj613 added 3 commits July 3, 2022 00:59
This commit fixes/completes/extends previously minimal unittests for
some of the available probability distribution samplers, which include:
- `multivariate_normal_cong2017`
- `multivariate_normal_bhattacharya2016`.
The algorithm in Bhattacharya (2016) can be seen as a special case of the
`multivariate_normal_cong2017` function where `omega` is the identity
matrix. This commit removes the redundant implementation of the
Bhattacharya (2016) algorithm and then adds a unittest for this special
case.
@zoj613 zoj613 requested a review from rlouf July 2, 2022 23:03
Copy link
Member

@rlouf rlouf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

phi: TensorVariable,
t: TensorVariable,
) -> TensorVariable:
r"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: docstring should start after the """

Copy link
Member Author

@zoj613 zoj613 Jul 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is mainly a preference thing. The current way is still correct according the numpydoc convention and is used all over numpy and scipy source code. The pydocstyle linter doesn't complain either on my editor.

Copy link
Member

@rlouf rlouf Jul 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you start a repo every reasonable choice is certainly a question of preference, in an existing codebase it is a matter of consistency. This format is used in other aesara-related projects and I don't see a reason to do something different here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your concern but dists.py is the first module commited into the codebase and if you look at the other functions in it, they use the same docstring convention. That is about as consistent as can be in this case. Using the suggested preference would introduce an inconsistent docstring style in the module. So the point made here is a bit contradictory.

Copy link
Member

@rlouf rlouf Jul 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only two consistent choices here: change the docstrings in every other files in the repository, or press DEL twice in this file. Whatever you think is best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the aesara-devs projects follow the code and documentation standards of Aesara. In the case of docstrings, we use NumPy's format—more or less. (The link in our documentation is broken; it should be this.)

We have some legacy docstrings in the Aesara repo that go against this standard, but those need to be fixed.

@brandonwillard
Copy link
Member

brandonwillard commented Jul 4, 2022

FYI: the aesara-devs repos don't use the "<tag>: <description>." PR title format, only the "<description>" part. Descriptive tags are handled by the labels. (This is only worth mentioning because we generate our release notes automatically from the PR titles.)

@zoj613 zoj613 changed the title TST: Extend unittests for some distributions. Extend unittests for some distributions. Jul 4, 2022
@brandonwillard brandonwillard merged commit 2a50b36 into aesara-devs:main Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add test to check correctness of samples from multivariate normal distributions
3 participants