
Fix MvNormal.random #4207

Merged: 13 commits into pymc-devs:master on Nov 29, 2020

Conversation

@Sayam753 (Member) commented Nov 2, 2020

Thank you for opening a PR!

Depending on what your PR does, here are a few things you might want to address in the description:

@twiecki twiecki requested a review from lucianopaz November 2, 2020 18:19
@Sayam753 (Member Author) commented Nov 3, 2020

The shape issues discussed in #4206 directly affect the failure of the test pymc3/tests/test_mixture.py::TestMixture::test_sample_prior_and_posterior in this PR, because I assumed that random draws from PyMC3 RVs always have size (or sample_shape) as their first dimensions. This is not the case for a few distributions when size=1.

So, I imagine #4206 getting solved first.
There are two more failing tests whose cause I have yet to figure out.
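As a point of reference for the convention described above, NumPy's own generator keeps the size dimension even when it is 1. This is a NumPy-only sketch of the expected shape contract, not PyMC3 code:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, cov = np.zeros(3), np.eye(3)

# Draws have shape sample_shape + event_shape ...
assert rng.multivariate_normal(mu, cov, size=(5,)).shape == (5, 3)

# ... including for size=1, where the leading dimension is kept.
# The assumption above: a few PyMC3 distributions dropped it in this case.
assert rng.multivariate_normal(mu, cov, size=(1,)).shape == (1, 3)
```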

@codecov bot commented Nov 7, 2020

Codecov Report

Merging #4207 (3d2ebd7) into master (edbafaa) will increase coverage by 0.12%.
The diff coverage is 92.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4207      +/-   ##
==========================================
+ Coverage   87.44%   87.57%   +0.12%     
==========================================
  Files          88       88              
  Lines       14342    14321      -21     
==========================================
- Hits        12542    12541       -1     
+ Misses       1800     1780      -20     
Impacted Files Coverage Δ
pymc3/distributions/multivariate.py 83.33% <92.00%> (+2.16%) ⬆️
pymc3/distributions/mixture.py 88.31% <0.00%> (-1.04%) ⬇️
pymc3/distributions/distribution.py 94.57% <0.00%> (+0.25%) ⬆️
pymc3/parallel_sampling.py 86.79% <0.00%> (+0.94%) ⬆️

@AlexAndorra (Contributor) commented

This looks like a great improvement 🤩 What's left on this, @Sayam753? What do you need to push it through the finish line?

@lucianopaz (Contributor) left a comment
Thanks @Sayam753, and sorry for the delay in reviewing. I left a bunch of comments on things that we should change/address before merging. I think the PR looks promising, but we can improve it.

(6 review comments on pymc3/distributions/multivariate.py, resolved)
@Sayam753 Sayam753 marked this pull request as draft November 19, 2020 14:39
@Sayam753 (Member Author) commented

Hi @AlexAndorra
Sorry for the late reply. I have figured out more test cases that the current approach cannot handle, especially with the tau parametrization. The three things left are:

  1. Inserting batch dimensions into every parametrization.
  2. Handling sample and batch dimensions for the tau parametrization, because the cholesky and solve_triangular functions from scipy.linalg only operate on 2d matrices, while the samples drawn can have more than 2 dimensions.
  3. Writing test cases.

So, I have converted this PR to a draft to work on these tasks.
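For item 2, this is a sketch of the kind of workaround the constraint forces: scipy.linalg.cholesky accepts only a single 2d matrix, so batched inputs have to be flattened, factored one at a time, and reshaped back. The batched_cholesky helper here is hypothetical, for illustration only, not the PR's actual code:

```python
import numpy as np
from scipy.linalg import cholesky

def batched_cholesky(tau):
    # scipy.linalg.cholesky only accepts a 2d matrix, so flatten the batch
    # dimensions to (-1, d, d) and factor each matrix individually.
    d = tau.shape[-1]
    flat = tau.reshape(-1, d, d)
    chols = np.stack([cholesky(t, lower=True) for t in flat])
    return chols.reshape(tau.shape)

# A (2, 2) batch of 3x3 precision matrices: k * I for k = 1..4.
taus = np.stack([np.eye(3) * (i + 1) for i in range(4)]).reshape(2, 2, 3, 3)
chols = batched_cholesky(taus)
assert chols.shape == (2, 2, 3, 3)
```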

@AlexAndorra (Contributor) commented

Thanks for the update @Sayam753, and for your work on this, which will clearly be a great improvement 🤩

@Spaak (Member) commented Nov 23, 2020

@Sayam753 it would be great to get these changes in release 3.10. If I read the above correctly, we're nearly there, right? Anything you need to get this ready for review? (I'm guessing merging #4243; anything else?)

@Sayam753 (Member Author) commented

Hi @Spaak,
I am also getting excited for the 3.10 release. I only need to write test cases. Let's ping @lucianopaz for another round of review.

The underlying reason behind the earlier test failures was a wrong condition, which I have corrected. I came across the str representation issue in the pdb debugger, so #4243 is not a blocker.

@Sayam753 (Member Author) commented

I also invite @junpenglao to review.

@twiecki twiecki requested a review from lucianopaz November 27, 2020 09:56
@twiecki (Member) commented Nov 27, 2020

@Sayam753 if this is ready for review from your end, can you take it out of draft?

@michaelosthege michaelosthege marked this pull request as ready for review November 27, 2020 10:31
@michaelosthege (Member) commented

These changes will be covered by the tests from #4206.

@Sayam753 (Member Author) commented

As per discussions with @lucianopaz, I have changed the broadcasting logic to use the pm.distributions.shape_utils.broadcast_dist_samples_to function, which simplified a lot of code.

But there are still weird things going on with the broadcast_dist_samples_to function. Here is an example:

>>> import pymc3 as pm
>>> import numpy as np
>>> from pymc3.distributions.shape_utils import broadcast_dist_samples_to
>>>
>>> cov = np.eye(3)
>>> broad_samples = broadcast_dist_samples_to(to_shape=(10, 3, 3), samples=[cov], size=(3, ))[0]
>>> broad_samples.shape
(3, 10, 3, 3)
>>> broad_samples[0, 0]
array([[1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.]])

I expected to see identity matrix here. Is this a bug or am I missing something?

@Sayam753 (Member Author) commented

The tests passed, and the PR is ready for review.
But I am not fully convinced by the use of the broadcast_dist_samples_to function, so I would like to spend some time playing around with the different functions in shape_utils.py.

@Sayam753 (Member Author) commented

The pm.distributions.shape_utils.broadcast_dist_samples_to function seems to be a blocker in the above case.
@lucianopaz, can you help interpret this behaviour?

@lucianopaz (Contributor) commented

> [quoting @Sayam753's broadcast_dist_samples_to example and question above]

The problem is that the first dimension of the cov array is being interpreted as the sample_shape. You will have to add some checks based on the symbolic theano tensors' ndim to ensure that the first dimension of cov and mu isn't treated as sample shape just by chance.
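A minimal NumPy illustration of the ambiguity being described (variable names here are for illustration only):

```python
import numpy as np

cov = np.eye(3)   # event shape (3, 3), no sample dimensions
size = (3,)       # requested sample shape

# Shape-based logic alone cannot tell whether the leading 3 of cov is a
# sample dimension or the first event dimension: both interpretations match.
assert cov.shape[: len(size)] == size
assert cov.shape[-2:] == (3, 3)
```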

@Sayam753 (Member Author) commented

You will have to add some checks based on the symbolic theano tensors' ndim to ensure that the first dimension of cov and mu isn't treated as sample shape just by chance.

This is confusing. We have theano tensors defined at the distribution level, so they can only represent batch and event shapes. Why should there be a check for a leading sample_shape dimension when working with theano tensors? Please share an example/reference to help me understand this.

@Sayam753 (Member Author) commented Nov 28, 2020

So, does this mean we need checks to prevent batch/event shape from being misinterpreted as sample shape?

@lucianopaz (Contributor) commented

So, does this mean we need checks to prevent batch/event shape from being misinterpreted as sample shape?

Yes, exactly. In the example that you posted before, the covariance matrix's event shape was (3, 3), but its first dimension was interpreted as the sample shape because it happened to match the requested size.

What I mean is that we could do something like

core_dims = self.cov.ndim
sampled_dims = cov.ndim
if core_dims == sampled_dims:
    # We are sure that the drawn cov only has batch and event dimensions.
    # Prevent the first dimensions from being misinterpreted by adding
    # singleton dimensions to the left of cov.
    cov = np.reshape(cov, (1,) * len(sample_shape) + cov.shape)
else:
    # We don't do anything because we assume the extra dimensions come
    # from sample_shape.
    pass
# At this point we use the shape_utils broadcasting

We would also have to do something like this for mu.
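A runnable NumPy sketch of the check described here; the function name and arguments are hypothetical stand-ins, not the PR's actual code:

```python
import numpy as np

def pad_sample_dims(drawn, param_ndim, sample_shape):
    # If the drawn array has exactly as many dimensions as the symbolic
    # parameter, it contains only batch and event dimensions; prepend
    # singleton axes so broadcasting cannot mistake them for sample_shape.
    if drawn.ndim == param_ndim:
        return drawn.reshape((1,) * len(sample_shape) + drawn.shape)
    return drawn

cov = np.eye(3)  # ndim 2, same as the symbolic cov
padded = pad_sample_dims(cov, param_ndim=2, sample_shape=(3,))
assert padded.shape == (1, 3, 3)
```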

@Sayam753 (Member Author) left a comment

Finally, the PR is ready to be merged from my side. I have written test cases for the related issues and for different combinations of shapes as well. Pinging @lucianopaz for a final review.

Just a small comment below.

(1 review comment on pymc3/tests/test_distributions_random.py, resolved)
@twiecki twiecki merged commit 1c290ef into pymc-devs:master Nov 29, 2020
@twiecki (Member) commented Nov 29, 2020

Thanks @Sayam753 this was a big one!

@Sayam753 (Member Author) commented Nov 29, 2020

Awesome. @twiecki this closes #4185 and #3706.
Also, the first half of #3758 (resolving NaNs) is fixed. I am still thinking about the other half.

@MarcoGorelli (Contributor) commented

Hi @Sayam753

Is it now always necessary to specify shape when using MvNormal? e.g.

pm.MvNormal.dist(mu=np.zeros(3), cov=np.eye(3)).random()

now gives

Traceback (most recent call last):
  File "t.py", line 4, in <module>
    pm.MvNormal.dist(mu=np.zeros(3), cov=np.eye(3)).random()
  File "/home/marco/pymc3-dev/pymc3/distributions/multivariate.py", line 273, in random
    param = np.broadcast_to(param, shape=output_shape + dist_shape[-1:])
  File "<__array_function__ internals>", line 5, in broadcast_to
  File "/home/marco/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/numpy/lib/stride_tricks.py", line 180, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File "/home/marco/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/numpy/lib/stride_tricks.py", line 118, in _broadcast_to
    raise ValueError('cannot broadcast a non-scalar to a scalar array')
ValueError: cannot broadcast a non-scalar to a scalar array

but passing shape=3 makes it work again.
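The ValueError at the bottom of that traceback can be reproduced with NumPy alone; it is what broadcast_to raises when the target shape is scalar, which is presumably what the unspecified distribution shape reduces to here:

```python
import numpy as np

# Broadcasting a length-3 vector to a scalar (shape ()) target fails,
# matching the last line of the traceback above.
try:
    np.broadcast_to(np.zeros(3), shape=())
except ValueError as err:
    print(err)  # cannot broadcast a non-scalar to a scalar array
```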

@Sayam753 (Member Author) commented

Yes, it is necessary. In fact, it is important for each distribution, as per discussions with @lucianopaz.
That reminded me to open an issue for the GP notebooks which use this distribution and do not specify a shape parameter.

@MarcoGorelli (Contributor) commented

GP notebooks which are using this distribution and do not specify a shape parameter

That's where I ran into this :) Sure, I'll open an issue for this in pymc-examples.
