
Fix flaky tests #32

Merged
rlouf merged 1 commit into aesara-devs:main from the skip-flaky-tests branch on May 25, 2022
Conversation

rlouf (Member) commented May 10, 2022

The tests for the distributions rely on a specific random number generator implementation and seed. They broke as the result of an upstream change in aesara 2.6.5 (I suspect the change happened in aesara-devs/aesara#939). This should not happen, so we need better tests for the distributions.

rlouf requested a review from zoj613 on May 10, 2022 09:38
codecov bot commented May 10, 2022

Codecov Report

Merging #32 (288e5cd) into main (0a8cc00) will decrease coverage by 5.40%.
The diff coverage is n/a.

❗ Current head 288e5cd differs from pull request most recent head e1af18e. Consider uploading reports for the commit e1af18e to get more accurate results.

@@             Coverage Diff             @@
##              main      #32      +/-   ##
===========================================
- Coverage   100.00%   94.59%   -5.41%     
===========================================
  Files            2        2              
  Lines          185      185              
  Branches        11       11              
===========================================
- Hits           185      175      -10     
- Misses           0        9       +9     
- Partials         0        1       +1     
Impacted Files     Coverage Δ
aemcmc/dists.py    79.59% <0.00%> (-20.41%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

zoj613 (Member) commented May 10, 2022

If the aesara package makes no guarantees about preserving random streams across versions, then we should not rely on the seed in any tests. We should think about how to verify the output of those distributions without relying on the seed.

zoj613 (Member) commented May 10, 2022

Which properties of the multivariate normal do you think we could test for in the unit test, @rlouf?

EDIT: I think we could test whether the sample mean is close to the expected value of the distribution as described in the implementation? That would mean increasing the sample size in the unit test. Or we could even include a standard normality test like the Shapiro-Wilk test; scipy has an implementation at https://docs.scipy.org/doc/scipy-1.8.0/html-scipyorg/reference/generated/scipy.stats.shapiro.html. We could generate a large sample and test that the returned p-value is not below a particular threshold (usually 0.05), i.e. that normality is not rejected.

EDIT2: This might not work, since the above tests against a univariate normal distribution.

Cc @brandonwillard
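
For concreteness, here is a minimal sketch of the Shapiro-Wilk idea on a univariate sample, using a plain NumPy normal draw as a stand-in for the sampler under test (as EDIT2 notes, this does not carry over directly to the multivariate case):

import numpy as np
from scipy import stats

rng = np.random.default_rng(123)
# Stand-in for a (hypothetical) univariate sampler under test.
sample = rng.normal(loc=0.0, scale=1.0, size=500)
pvalue = stats.shapiro(sample).pvalue
# In a unit test we would check that normality is not rejected, e.g.
# assert pvalue > 0.05, keeping in mind that such a check fails for
# roughly 5% of RNG states even when the sampler is correct.
print(f"Shapiro-Wilk p-value: {pvalue:.3f}")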

Two review comments on tests/test_dists.py (outdated, resolved).
rlouf (Member, Author) commented May 10, 2022

The first thing we want to make sure of is that the output samples have the correct shape. It is always hard to test the correctness of the output of random functions, but we can probably rely on some theoretical result to design a simple test, or use an existing one.
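
A minimal sketch of such a shape-only check, assuming a hypothetical draw_samples helper in place of the actual aemcmc sampler (the name and signature are placeholders):

import numpy as np

def draw_samples(rng, size):
    # Placeholder: the real test would call the aemcmc sampler here.
    return rng.multivariate_normal(np.zeros(3), np.eye(3), size=size)

def test_sample_shape():
    rng = np.random.default_rng(0)
    samples = draw_samples(rng, size=500)
    # Shape checks do not depend on the RNG implementation or seed.
    assert samples.shape == (500, 3)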

rlouf (Member, Author) commented May 10, 2022

I've added a test to check that the samples' shape is correct for each sampler. Should we merge this so we can rebase the other PRs and open a separate issue?

zoj613 (Member) commented May 10, 2022

> I've added a test to check that the samples' shape is correct for each sampler. Should we merge this so we can rebase the other PRs and open a separate issue?

I think we can merge these tests as is, but I'd keep the pytest decorators since they are still incomplete and to be revisited.

brandonwillard (Member) left a comment


We definitely need some simple moment-based tests (e.g. mean and (co)variance). It's fine if there's some dependence on an RNG state (e.g. a fixed error window/range of acceptance that may not pass for all RNG states), since that's basically unavoidable anyway. We just don't want to depend on the exact values of a specific RNG state (e.g. tests that only pass for a single or a few RNG states).
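
A minimal sketch of that kind of moment-based check, again with a hypothetical draw_samples helper in place of an aemcmc sampler, and with error windows loose enough that the assertions should hold for essentially any RNG state:

import numpy as np

def draw_samples(rng, size):
    # Placeholder: the real test would call the aemcmc sampler here.
    mean = np.array([1.0, -2.0])
    cov = np.diag([1.0, 4.0])
    return rng.multivariate_normal(mean, cov, size=size)

def test_moments():
    rng = np.random.default_rng(2022)
    samples = draw_samples(rng, size=50_000)
    # Loose tolerances so the test does not hinge on one particular RNG state.
    np.testing.assert_allclose(samples.mean(axis=0), [1.0, -2.0], atol=0.05)
    np.testing.assert_allclose(np.cov(samples, rowvar=False), np.diag([1.0, 4.0]), atol=0.15)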

rlouf force-pushed the skip-flaky-tests branch 2 times, most recently from f5fab04 to 9c217ca on May 12, 2022 14:54
rlouf (Member, Author) commented May 12, 2022

@zoj613 Could you have a look at the PR? I added some simple moment-based tests, but I am not sure about the formulas I used to compute the mean and variance from the parameters in the last two tests.

rlouf requested a review from zoj613 on May 12, 2022 14:59
Tests for the distributions rely on a specific random number generator
implementation and seed, and they recently broke after an upstream change
in aesara 2.6.5.

In this commit we simplify the tests by only checking the shape of the samples.
rlouf (Member, Author) commented May 24, 2022

I simplified the tests that were not passing; we can merge this so we can move forward on the other PRs and open an issue to add more tests on the distributions.

zoj613 (Member) commented May 24, 2022

> I simplified the tests that were not passing; we can merge this so we can move forward on the other PRs and open an issue to add more tests on the distributions.

I can't review your previous tests because the force-push got rid of the commits, so I'll post here:

# line in function testing cong2017
var = 1.0 / (A + np.dot(phi.T, np.dot(omega, phi)))

From the reference, page 17, the precision matrix is not diagonal. Only A and Omega are (but not phi). var in your case is supposed to be an actual matrix inverse, via at.inv(A + np.dot(phi.T, np.dot(omega, phi))). Also, with mean = var * np.dot(phi.T, omega) * t, we have to make sure we use a matrix-vector multiplication, but it appears we used an elementwise product here (I'm not too sure).
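
A NumPy sketch of the suggested fix, reusing the names from the snippet above (an illustration only, not the actual test code; the aesara version would use the corresponding aesara ops):

import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 5
A = np.diag(rng.uniform(1.0, 2.0, size=p))      # diagonal
omega = np.diag(rng.uniform(1.0, 2.0, size=n))  # diagonal
phi = rng.standard_normal((n, p))               # not diagonal
t = rng.standard_normal(n)

var = np.linalg.inv(A + phi.T @ omega @ phi)    # full matrix inverse, not 1.0 / (...)
mean = var @ (phi.T @ (omega @ t))              # matrix-vector products, not elementwise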

I think we can merge these tests as is, but I'd keep the pytest decorators since they are still incomplete and to be revisited. I will take a look at them and see if I can come up with better tests.

rlouf (Member, Author) commented May 24, 2022

I opened #33 to keep track of this. Can someone approve and merge, unless there is something else?

zoj613 (Member) commented May 24, 2022

@brandonwillard I think we can continue this in a separate PR; I've assigned that to myself. Merging is blocked due to your previously requested change.

rlouf merged commit 9510be2 into aesara-devs:main on May 25, 2022
rlouf deleted the skip-flaky-tests branch on May 25, 2022 14:31