Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V4 update test framework for distributions random method 2nd attempt #4608

Conversation

matteo-pallini
Copy link
Contributor

@matteo-pallini matteo-pallini commented Apr 3, 2021

The recent distributions refactoring (#4508 and #4548) moves the logic to
get a random variable from a distribution to aesara.
This relies on numpy and scipy random variables implementation.
Given this change the main thing worth testing on the PyMC side is that
the PyMC parametrization is sensible given the one on the Aesara side
(effectively the numpy/scipy one)

More details can be found on issue #4554

  • what are the (breaking) changes that this PR makes? No breaking changes, only fixing currently broken tests
  • are the changes—especially new features—covered by tests and docstrings? This change is only affecting tests
  • Is there any other test not relevant anymore given the refactoring?

@matteo-pallini matteo-pallini changed the title WIP: V4 update test framework for distributions random method 2nd attempt V4 update test framework for distributions random method 2nd attempt Apr 3, 2021
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start. We should also check that .eval() actually works even if just for a single sample.

I think @brandonwillard had some code in the original issue asserting that the numbers generated by our distribution and the scipy one were the same when given the same random seeds. That feels like a stronger test.

PS: Checking that a dozen or so samples match should be enough as we already asserted the right parameters are being used.

What do you think?

pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved
pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved
@ricardoV94
Copy link
Member

The exponential and gamma should now pass after #4576

@matteo-pallini
Copy link
Contributor Author

The exponential and gamma should now pass after #4576

Lovely, thanks for flagging

@matteo-pallini
Copy link
Contributor Author

matteo-pallini commented Apr 4, 2021

Looks like a good start. We should also check that .eval() actually works even if just for a single sample.

I think @brandonwillard had some code in the original issue asserting that the numbers generated by our distribution and the scipy one were the same when given the same random seeds. That feels like a stronger test.

PS: Checking that a dozen or so samples match should be enough as we already asserted the right parameters are being used.

What do you think?

Sounds good. It's definitely a more robust test. Let me know if what added is in line with what you had in mind

@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch from 53dfb75 to ef7584e Compare April 4, 2021 11:44
@ricardoV94
Copy link
Member

I left some small comments. Overall I think it looks fine and should definitely cover the pymc <- -> aesara parametrizations. After this PR we might still need something extra for the random ops that are implemented on our side, but that might be better done separately.

@ricardoV94
Copy link
Member

ricardoV94 commented Apr 4, 2021

Also, we can remove these lines from the previous PR: https://github.com/pymc-devs/pymc3/blob/7a7179276f8f60470dacc496d3c2add4b636e4a2/pymc3/tests/test_distributions_random.py#L1773-L1803

@ricardoV94
Copy link
Member

I think we are pretty much ready after you complete those final refactorings. Thanks for your work and patience.

Is there anything else you think should be changed?

@matteo-pallini
Copy link
Contributor Author

matteo-pallini commented Apr 5, 2021

I think we are pretty much ready after you complete those final refactorings. Thanks for your work and patience.

Is there anything else you think should be changed?

Happy to cover If this test framework would also cover well these future cases it could be worthwhile as well. Let's agree on the right way of doing it, if you are happy to. I left a comment with a potential approach on the thread above. I should probably thank you for your patience :-)

I should have addressed all the refactoring comments you left

@matteo-pallini matteo-pallini reopened this Apr 5, 2021
@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch from 1ca93bc to 598d0db Compare April 5, 2021 13:29
@ricardoV94 ricardoV94 added the v4 label Apr 7, 2021
matteo-pallini added a commit to matteo-pallini/pymc3 that referenced this pull request Apr 8, 2021
The refactoring should make it possible testing both the distribution
parametrization and sampled values according to need, as well as any
other future test. More details on PR pymc-devs#4608
@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch from 598d0db to 5e67189 Compare April 8, 2021 23:39
matteo-pallini added a commit to matteo-pallini/pymc3 that referenced this pull request Apr 10, 2021
The refactoring should make it possible testing both the distribution
parametrization and sampled values according to need, as well as any
other future test. More details on PR pymc-devs#4608
@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch from 5e67189 to e4901ee Compare April 10, 2021 12:03
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. I left some general comments, let me know what you think :)

pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved
pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved
"expected_rv_op_params": {"mu": 1.5, "beta": 3.0},
"expected_dist": self._get_scipy_distribution("gumbel_r"),
"expected_dist_params": {"loc": 1.5, "scale": 3.0},
"size": 15,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why this needed size explicitly?

Copy link
Contributor Author

@matteo-pallini matteo-pallini Apr 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I can remove it, given that the default value is 15 anyway. it's not needed, but it shows very explicitly how to define the size

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to leave it out. Specially since the BaseTests parametrize size separately.

pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved
pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved
@ricardoV94
Copy link
Member

Sounds like you are on the right track.

I would focus on integrating the BaseTest with your framework and give it a more explicit name to make it obvious it is testing shapes of draws. If you find the inheriting class logic nicer, that would certainly be a plus.

After this it's perhaps time to get a second review, merge, and start using it with the newly refactored distributions. It will be more obvious whether more types of tests are needed or if these suffice.

matteo-pallini added a commit to matteo-pallini/pymc3 that referenced this pull request Apr 11, 2021
The refactoring should make it possible testing both the distribution
parametrization and sampled values according to need, as well as any
other future test. More details on PR pymc-devs#4608
@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch from e4901ee to 69592c3 Compare April 11, 2021 10:23
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some minor comments. The way the discrete_weibul_rng_fn is implemented looks a bit complicated. Do you think it might read better if we unpack that lambda to an explicit function?

Otherwise things LGTM!

matteo-pallini and others added 18 commits April 24, 2021 00:00
The distributions refactoring moves the random variable sampling to
aesara. This relies on numpy and scipy random variables implementation.
So, now the only thing we care about testing is that the parametrization
on the PyMC side is sendible given the one on the Aesara side
(effectively the numpy/scipy one)

More details can be found on issue pymc-devs#4554
pymc-devs#4554
Most of the random variable logic has been moved to aesara, as well as
most of the relative tests. More details can be found on issue pymc-devs#4554
Also mark test_categorical as expected to fail due to bug on aesara
side. The bug is going to be fixed with 2.0.5 release, so we need to
bump the version for categorical and the test to pass.
- replace list of tuples with dict
- rename 1 method
- move pymc_dist as first argument in function call
- replace list(params) with params.copy()
The refactoring should make it possible testing both the distribution
parametrization and sampled values according to need, as well as any
other future test. More details on PR pymc-devs#4608
@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch 3 times, most recently from 56253dc to c5661ba Compare April 24, 2021 09:26
@matteo-pallini matteo-pallini force-pushed the v4-update-test-framework-for-distributions-random-method-2nd-attempt branch from c5661ba to 9b52e1b Compare April 24, 2021 10:22
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, much more readable and obvious how to apply to new distributions going into the future!

@ricardoV94
Copy link
Member

Thanks a lot @DRabbit17. Looking forward to your next PR!

@matteo-pallini
Copy link
Contributor Author

Thanks for reviewing and helping in making the final version much better

twiecki pushed a commit that referenced this pull request Jun 5, 2021
* Update tests following distributions refactoring

The distributions refactoring moves the random variable sampling to
aesara. This relies on numpy and scipy random variables implementation.
So, now the only thing we care about testing is that the parametrization
on the PyMC side is sendible given the one on the Aesara side
(effectively the numpy/scipy one)

More details can be found on issue #4554
#4554

* Change tests for more refactored distributions.

More details can be found on issue #4554
#4554

* Change tests for refactored distributions

More details can be found on issue #4554
#4554

* Remove tests for random variable samples shape and size

Most of the random variable logic has been moved to aesara, as well as
most of the relative tests. More details can be found on issue #4554

* Fix test for half cauchy, renmae mv normal tests and add test for
Bernoulli

* Add test checking PyMC samples match the aesara ones

Also mark test_categorical as expected to fail due to bug on aesara
side. The bug is going to be fixed with 2.0.5 release, so we need to
bump the version for categorical and the test to pass.

* Move Aesara to 2.0.5 to include Gumbel distribution

* Enamble exponential and gamma tests following bug-fix

* Enable categorical test following aesara version bump to 2.0.5 and relative bug-fix

* Few small cosmetic changes:
- replace list of tuples with dict
- rename 1 method
- move pymc_dist as first argument in function call
- replace list(params) with params.copy()

* Remove redundant tests

* Further refactoring

The refactoring should make it possible testing both the distribution
parametrization and sampled values according to need, as well as any
other future test. More details on PR #4608

* Add size tests to new rv testing framework

* Add tests for multivariate and for univariate multi-parameters

* remove test already covered in aesara

* fix few names

* Remove "distribution" from test class names

* Add discrete Weibull, improve Beta and some minor refactoring

* Fix typos in checks naming and add sanity check

Co-authored-by: Ricardo <ricardo.vieira1994@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants