-
-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flaky tests #32
Fix flaky tests #32
Conversation
Codecov Report
@@ Coverage Diff @@
## main #32 +/- ##
===========================================
- Coverage 100.00% 94.59% -5.41%
===========================================
Files 2 2
Lines 185 185
Branches 11 11
===========================================
- Hits 185 175 -10
- Misses 0 9 +9
- Partials 0 1 +1
Continue to review full report at Codecov.
|
If there are no random stream preservation guarantees by the aesara package, then it means we should not rely on the seed for any tests. We should think about how we can verify the output of those distributions without relying on the seed. |
Which properties of the multivariate normal do you think we could test for in the unittest @rlouf ? EDIT: I think we could test if the sample mean is close to the expected value of the distribution as described in the implementation? I think that would mean we increase the sample size in the unittest. Or even include the standard normality tests like the Shapiro test? scipy has an implementation at https://docs.scipy.org/doc/scipy-1.8.0/html-scipyorg/reference/generated/scipy.stats.shapiro.html. We could generate a large sample and test if the pvalue returned is less than a particular threshold (usually 0.05). EDIT2: This might not work since the above test against a univariate normal distribution. |
The first thing we want to make sure is that the output samples are of the correct shape. It is always hard to test for correctness of the output of random functions, but we can probably rely on some theoretical result to design a simple test, or use an existing test. |
I've added a test to check that the samples' shape is correct for each sampler. Should we merge this so we can rebase the other PRs and open a separate issue? |
I think we can merge these tests as is, but i'd keep the pytest decorators since they are still incomplete and to be revisited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely need some simple moment-based tests (e.g. mean and (co)variance). It's fine if there's some dependence on an RNG state (e.g. a fixed error window/range of acceptance/etc. that may not pass for all RNG states), since that's basically unavoidable anyway, we just don't want to depend on the exact values of a specific RNG state (e.g. tests that only pass for a single/few RNG states).
f5fab04
to
9c217ca
Compare
@zoj613 Could you have a look at the PR? I added some simple moment-based tests but I am not sure about the formulas I used to compute the mean and variance from the parameters in the last two tests. |
Tests for the distribution rely on a specific random number generator implementation and seed, and they recently broke after a change upstream in aesara 2.6.5. In this commit we simplify the tests by only checking the shape of the samples.
I simplified the tests that were not passing; we can merge this so we can move forward on the other PRs and open an issue to add more tests on the distributions. |
I can't review your previous tests because of the force-push that got rid of the commits so i'll post here: # line in function testing cong2017
var = 1.0 / (A + np.dot(phi.T, np.dot(omega, phi))) From the reference, page 17, the precision matrix is not diagonal. Only A and Omega are (but not phi). I think we can merge these tests as is, but i'd keep the pytest decorators since they are still incomplete and to be revisited. I will take a look at them and see if I can come up with better tests. |
I opened #33 to keep track of this. Can someone approve and merge unless there is something else ? |
@brandonwillard I think we can continue this in a separate PR. i've assigned that to myself. Merging is blocked due to your previosly requested change. |
The tests for the distribution rely on a specific random number generator implementation and seed. These tests broke as the result of a change upstream in aesara 2.6.5 (I suspect the change happened here aesara-devs/aesara#939). This should not happen and we thus need better tests for the distributions.