Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored Wishart and MatrixNormal distribution #4777

Merged
merged 1 commit into from
Jul 2, 2021

Conversation

kc611
Copy link
Contributor

@kc611 kc611 commented Jun 17, 2021

This PR refactors the Wishart and MatrixNormal multivariate distributions.

pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved
pymc3/tests/test_distributions.py Outdated Show resolved Hide resolved
@kc611 kc611 marked this pull request as ready for review June 19, 2021 07:20
Comment on lines 1318 to 1379
sizes_to_check = [None, (), 1, (1,), 5, (4, 5), (2, 4, 2)]
sizes_expected = [
(3, 3),
(3, 3),
(1, 3, 3),
(1, 3, 3),
(5, 3, 3),
(4, 5, 3, 3),
(2, 4, 2, 3, 3),
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What extra logic (e.g. implemented in WishartRV) are all these sizes testing?

Remember, we don't want the test time to grow larger without proportionally increasing the actual test coverage (and that coverage doesn't include NumPy/SciPy's code). This test module is easily one of the most egregious violators of that principal, so we should always be a little more critical of additions to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that be checking only for edge cases like None and () ? I just took the sizes_to_check list from the original function in BaseTestDistribution where the sizes are being checked by default. (And constructed the corresponding expected list.)

Copy link
Member

@ricardoV94 ricardoV94 Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To echo @kc611, this is not actually testing more than what the default test does, just setting the specific parameters of the size test, which usually have to be adapted for multivariate distributions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to both: This implementations is very much in line with the defaults.
The the BaseTestDistribution for sure takes a sledgehammer to break a nut, but that problem is out of scope of this PR.

Copy link
Contributor

@brandonwillard brandonwillard Jul 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing admittedly poor defaults forward is a great way to get into a bad situation, and an even better way to get deeper into existing ones.

Also, the answer is not one big PR that removes all the testing redundancies; given the knowledge and effort required for that, it's quite unrealistic. Plus, that won't prevent the problem from reappearing again and again after such a heroic PR.

At the very least, we need to stop adding to the testing redundancies on a case-by-case basis, and we can do that by looking for it during reviews; otherwise, we're just hoping that contributors will do that for us, and that's not how things have turned out.

Regarding these exact tests, if they're only targeting WishartRV.rng_fn, it looks like we only need a test for an empty and non-empty size (and perhaps one other), because that's the only relevant condition in the code added by this PR. Every other value besides the first two appears to be testing the code in RandomVariable and scipy.stats.wishart.rvs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no harm in not testing the [ (), (1,), 5, (2, 4, 2)] cases if we are testing for [None, 1, (4, 5)] cases since they're pretty much the same thing with different values. (Updated accordingly)

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. Is anything else missing?

pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved
@kc611 kc611 force-pushed the add_wish branch 2 times, most recently from cd4c3c6 to fe07893 Compare July 1, 2021 18:11
Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kc611 take a look at my comments and tell us when you see this as ready to merge :)

pymc3/distributions/multivariate.py Show resolved Hide resolved
pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved
pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved
pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved
Comment on lines 1318 to 1379
sizes_to_check = [None, (), 1, (1,), 5, (4, 5), (2, 4, 2)]
sizes_expected = [
(3, 3),
(3, 3),
(1, 3, 3),
(1, 3, 3),
(5, 3, 3),
(4, 5, 3, 3),
(2, 4, 2, 3, 3),
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to both: This implementations is very much in line with the defaults.
The the BaseTestDistribution for sure takes a sledgehammer to break a nut, but that problem is out of scope of this PR.

@kc611 kc611 force-pushed the add_wish branch 2 times, most recently from 65b49fc to 80879e7 Compare July 2, 2021 05:55
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's wait for the tests. Great work, these two were a tough group to crack

@michaelosthege michaelosthege merged commit cda2371 into pymc-devs:main Jul 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants