
CAR random variables #4596

Merged: 31 commits merged into pymc-devs:main on Jul 27, 2021
Conversation

@aerubanov (Contributor) commented Mar 31, 2021

Closes #4518.
I changed the CAR distribution class to be compatible with the v4 RandomVariable and implemented a CAR random variable class. But now I need some advice on further improvements.

@brandonwillard, I have a few questions about the random variable for CAR:

  • If a parameter of the random variable may be either a scalar or a vector (like mu, tau, and alpha), what should I specify in ndims_params? If I pass a scalar instead of a vector, an error occurs.
  • Should I somehow handle the size param in the rng_fn method?
  • Where should I store intermediate variables (like D and lam) that need to be calculated only once? Initially they were calculated in the __init__ method of the distribution class and stored as a distribution class attribute.

@ckrapu, Do you have any ideas for testing the implementation of the algorithm?

@brandonwillard (Contributor):

  • If a parameter of the random variable may be either a scalar or a vector (like mu, tau, and alpha), what should I specify in ndims_params? If I pass a scalar instead of a vector, an error occurs.

It sounds like you need to find a "minimally sufficient" space into which you can embed all the parameters. The real question is "What's the correspondence between the dimensions of these parameters?" For instance, if mu is a scalar, what should tau be, etc.?

  • Should I somehow handle the size param in the rng_fn method?

Yes, that's a mandatory parameter. You'll need to understand those dimensions above in order to implement this, though; otherwise, take a look at the implementation of MvNormalRV for a brute-force approach to implementing size.

  • Where should I store intermediate variables (like D and lam) that need to be calculated only once? Initially they were calculated in the __init__ method of the distribution class and stored as a distribution class attribute.

You can add those as parameters to the Op and use Op.__init__ to compute the derived quantities and hide them from the caller.
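Putting those pieces together, a minimal sketch of what the Op could look like, assuming the parameter order (mu, W, alpha, tau) and the Aesara RandomVariable API of that time (the class name and ndims here are illustrative, not the final implementation):

from aesara.tensor.random.op import RandomVariable


class CARRV(RandomVariable):
    name = "car"
    ndim_supp = 1                # each draw is a vector over the graph nodes
    ndims_params = [1, 2, 0, 0]  # core ndims of mu, W, alpha, tau
    dtype = "floatX"
    _print_name = ("CAR", "\\operatorname{CAR}")

    @classmethod
    def rng_fn(cls, rng, mu, W, alpha, tau, size):
        # `size` is mandatory: it prescribes how many independent draws to
        # stack in front of the support dimension, e.g. size=(3,) should
        # yield an array of shape (3, len(mu)).
        # One-time derived quantities such as D and lam could instead be
        # computed in Op.__init__ and stored on the Op, as suggested above.
        raise NotImplementedError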

@aerubanov (Contributor Author):

It sounds like you need to find a "minimally sufficient" space into which you can embed all the parameters. The real question is "What's the correspondence between the dimensions of these parameters?" For instance, if mu is a scalar, what should tau be, etc.?

@brandonwillard No, my question was about something else. The CAR class docs specify that a float or an array can optionally be passed for these parameters. How should I handle this? Should I transform the float to an array in the dist method, or some other way?

@brandonwillard (Contributor) commented Mar 31, 2021

Should I transform the float to an array in the dist method, or some other way?

If you have the parameter dimensions worked out already, then there's no question about whether or not you should—for instance—convert a scalar to a vector, matrix, etc. If it's a question of where to do the conversion, often that kind of thing is done in Op.make_node.

Assuming that's the case, and you need to convert a scalar to a vector, just add a broadcastable dimension to the scalar value. There are aesara.tensor.shape_pad* functions that will do this, as well as aesara.tensor.reshape.

(Aesara could probably use implementations of the numpy.atleast_*, too.)

N.B.: There are also some broadcasting helper functions used throughout the RandomVariable implementations in Aesara. For instance, check out broadcast_params.
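For instance, a sketch of padding a scalar up to a vector (the values are illustrative; broadcast_params is assumed to live in aesara.tensor.random.utils):

import aesara.tensor as at

mu = at.as_tensor_variable(0.5)  # a scalar
mu_vec = at.shape_padleft(mu)    # shape (1,), broadcasts against vectors

# Inside rng_fn (NumPy land), broadcast_params can align the batch
# dimensions of all parameters given their expected core ndims:
# from aesara.tensor.random.utils import broadcast_params
# mu, W, alpha, tau = broadcast_params([mu, W, alpha, tau], [1, 2, 0, 0])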

@aerubanov (Contributor Author):

@brandonwillard, thanks for the tips and the quick response! I will make the necessary changes.

@ckrapu (Contributor) commented Apr 1, 2021

@ckrapu, Do you have any ideas for testing the implementation of the algorithm?

The testing strategy I used for the original CAR implementation was to make sure that the CAR logp and the equivalent multivariate normal with CAR-structured covariance matrix were equivalent, up to an additive constant. You can see how that is implemented here.
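A sketch of that check, building the reference multivariate normal from the standard CAR joint precision tau * (D - alpha * W), with D = diag(row sums of W) (the matrix and parameter values are illustrative):

import numpy as np
import scipy.stats as st

W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=0))
tau, alpha = 2.0, 0.5
cov = np.linalg.inv(tau * (D - alpha * W))  # CAR-structured covariance

mvn = st.multivariate_normal(mean=np.zeros(3), cov=cov)
xs = mvn.rvs(size=5, random_state=42)
mvn_logp = mvn.logpdf(xs)
# car_logp = ...  # evaluate the CAR logp at the same points
# the difference should be the same constant at every point:
# np.testing.assert_allclose(np.diff(car_logp - mvn_logp), 0, atol=1e-8)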

Review thread on the proposed rng_fn (pymc3/distributions/multivariate.py); Qb is, from context, the CAR precision matrix in banded storage:

u += 1
# Cholesky factor of the banded precision matrix Qb
L = scipy.linalg.cholesky_banded(Qb, lower=False)
z = rng.normal(size=W.shape[0], loc=mu)
# solve against the banded Cholesky factorization to obtain a draw
samples = scipy.linalg.cho_solve_banded((L, False), z)
Contributor:

I'm not familiar with the desired signature of rng_fn, but it's not clear to me that this allows for sampling multiple CAR draws simultaneously. If that's not the case, then the one-time solution of the permutation optimization problem in reverse_cuthill_mckee is not going to be better than simply sampling from a multivariate normal with the CAR covariance matrix.

Contributor Author:

I think the size parameter is just what is needed to support sampling multiple CAR draws simultaneously, so I should change my implementation of this method.

dtype = "floatX"
_print_name = ("CAR", "\\operatorname{CAR}")

def make_node(self, rng, size, dtype, *dist_params):
Contributor:

It's better to explicitly specify the parameters (i.e. mu, W, etc.); otherwise, the signature is unnecessarily ambiguous.

Also, you'll likely need to convert those distribution parameters to Aesara variables using aesara.tensor.as_tensor[_variable].
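For example, a sketch of make_node with the parameters spelled out (the super() call follows the base RandomVariable.make_node signature of that Aesara version):

import aesara.tensor as at
from aesara.tensor.random.op import RandomVariable


class CARRV(RandomVariable):
    ...

    def make_node(self, rng, size, dtype, mu, W, alpha, tau):
        # turn plain floats, lists, and ndarrays into Aesara variables
        mu = at.as_tensor_variable(mu)
        W = at.as_tensor_variable(W)
        alpha = at.as_tensor_variable(alpha)
        tau = at.as_tensor_variable(tau)
        return super().make_node(rng, size, dtype, mu, W, alpha, tau)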

Contributor Author:

@brandonwillard, should I do this conversion in the make_node method of the RandomVariable instead of the dist method of the CAR distribution class?

Contributor:

Yeah, it's always good to do that in Op.make_node.

@aerubanov (Contributor Author):

I added handling of the size param in the rng_fn method and moved the conversion to Aesara tensors into the make_node method. But now, when it tries to calculate logp for the CAR distribution, I get this error:

In[2]: import pymc3 as pm, numpy as np
Backend TkAgg is interactive backend. Turning interactive mode on.
In[3]: W = np.array([[1, 0, 0, 1, 0], [0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 1, 0, 1, 0], [0, 0, 1, 0, 1]])
In[4]: pm.logpt(pm.CAR.dist([0, 0, 0, 0, 0], W, [0.5, 0.5, 0.5, 0,5], [1, 1, 1, 1, 1]),np.random.randn(5)).eval()
Traceback (most recent call last):
  File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3437, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-71c3b392bce9>", line 1, in <module>
    pm.logpt(pm.CAR.dist([0, 0, 0, 0, 0], W, [0.5, 0.5, 0.5, 0,5], [1, 1, 1, 1, 1]),np.random.randn(5)).eval()
  File "/home/anatoly/HDD/Projects/pymc3/pymc3/distributions/logp.py", line 165, in logpt
    rv_value = rv_var.type.filter_variable(rv_value.astype(rv_var.dtype))
  File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/tensor/type.py", line 258, in filter_variable
    raise TypeError(
TypeError: Cannot convert Type TensorType(float64, vector) (of Variable TensorConstant{[ 2.000324...65305157]}) into Type TensorType(float64, matrix). You can try to manually convert TensorConstant{[ 2.000324...65305157]} into a TensorType(float64, matrix).

I guess I just did something wrong in the last call. @brandonwillard, could you help me with this error?

I also plan to start figuring out how to write tests. Based on the discussion above, I think we can compare sampling from CAR with sampling from the equivalent multivariate normal with a CAR-structured covariance matrix. But perhaps it is not the samples themselves that should be compared, but statistics calculated from their distributions, or something similar.

@ckrapu (Contributor) commented Apr 6, 2021

> TypeError: Cannot convert Type TensorType(float64, vector) (of Variable TensorConstant{[ 2.000324...65305157]}) into Type TensorType(float64, matrix). You can try to manually convert TensorConstant{[ 2.000324...65305157]} into a TensorType(float64, matrix).

If you try using np.random.randn(5,1) instead, does it work?

Review thread on the dist method (pymc3/distributions/multivariate.py):

self.median = self.mode = self.mean = self.mu = at.as_tensor_variable(mu)
self.sparse = sparse
@classmethod
def dist(cls, mu, W, alpha, tau, sparse=False, *args, **kwargs):

if not W.ndim == 2 or not np.allclose(W, W.T):
Contributor:

NumPy cannot be used here, because W could be a TensorVariable. You'll need to assert symmetry in a symbolic way—or simply remove the check for now.
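One way to phrase the check symbolically, as a sketch (validated_W is a hypothetical helper, and the Assert import path has moved between Aesara versions):

import aesara.tensor as at
from aesara.assert_op import Assert  # import path may differ across versions


def validated_W(W):
    # W.ndim is static metadata, so it can be checked eagerly in Python
    if W.ndim != 2:
        raise ValueError("W must be a matrix")
    # symmetry depends on W's runtime values, so it must stay in the graph
    return Assert("W must be a symmetric adjacency matrix")(
        W, at.allclose(W, W.T)
    )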

Contributor:

@aerubanov, this still needs to be resolved.

Contributor Author:

@brandonwillard, thanks for the reminder; I made the changes.


Review thread on the logp signature change (old vs. new):

def logp(self, value):
def logp(value, mu, W, alpha, tau, sparse=False):
Contributor:

There's no need for a sparse parameter here; plus, this option would only be useful to someone directly calling this function, which isn't too likely or common.

More importantly, you can determine whether or not W is sparse by inspection.
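For example, a sketch of detecting sparsity from the variable itself (assuming Aesara's SparseVariable class):

import aesara.sparse


def logp(value, mu, W, alpha, tau):
    # no sparse flag needed: W's own type says whether it is sparse
    if isinstance(W, aesara.sparse.SparseVariable):
        ...  # sparse branch, e.g. aesara.sparse.dot
    else:
        ...  # dense branch, aesara.tensor ops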

@brandonwillard (Contributor):

I pushed some of the recommendations I made above and re-enabled the old CAR test in test_distributions, so you'll need to rebase locally.

Otherwise, according to the old test, it looks like the log-likelihood could be working; however, that test needs an addition for at least one size > 2D.

Also, it looks like we need to add a CAR test to test_distributions_random for the random sampling in CARRV (I couldn't find one in there).

@aerubanov (Contributor Author):

Thanks for the review, @brandonwillard!

Otherwise, according to the old test, it looks like the log-likelihood could be working; however, that test needs an addition for at least one size > 2D.

Yes, I will try to add some tests for that case.

Also, it looks like we need to add a CAR test to test_distributions_random for the random sampling in CARRV (I couldn't find one in there).

I am thinking about a solution based on comparing samples from CAR and a MultivariateNormal with CAR-structured covariance matrix, similar to what we do for logpt. I discussed this idea with @ferrine, and he suggested using the Sliced-Wasserstein distance as the comparison metric. What do you think?

@ricardoV94 (Member) commented Apr 7, 2021

I am thinking about a solution based on comparing samples from CAR and a MultivariateNormal with CAR-structured covariance matrix, similar to what we do for logpt. I discussed this idea with @ferrine, and he suggested using the Sliced-Wasserstein distance as the comparison metric. What do you think?

We have some tests for random methods that use the Kolmogorov-Smirnov test to assess the equivalence between pymc3 and a reference random number generator (which could be the MultivariateNormal in your case). Maybe you can adapt the logic from there: https://github.com/pymc-devs/pymc3/blob/e5c42b47e43940837afe171f7a30c8ea89f54ed2/pymc3/tests/test_distributions_random.py#L60
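A sketch of how that logic could be adapted here, comparing CAR draws against the reference MvNormal marginal by marginal (matrix values are illustrative; car_draws stands in for the draws produced by the new rng_fn):

import numpy as np
import scipy.stats as st

W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=0))
tau, alpha = 2.0, 0.5
cov = np.linalg.inv(tau * (D - alpha * W))

ref_draws = st.multivariate_normal(mean=np.zeros(3), cov=cov).rvs(
    size=1000, random_state=42
)
# car_draws = ...  # 1000 draws from the CAR rng_fn with the same parameters
# for i in range(ref_draws.shape[1]):
#     _, p = st.ks_2samp(car_draws[:, i], ref_draws[:, i])
#     assert p > 0.01  # marginals should be statistically indistinguishable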

@aerubanov (Contributor Author):

@ricardoV94, thank you, I will check this.

@aerubanov (Contributor Author):

I added a test for the rng_fn method based on the logic from the pymc3_random() function in pymc3/pymc3/tests/test_distributions_random.py, but it fails =(. Now I'm trying to fix it.

@brandonwillard , @ricardoV94, do you have any ideas what I am doing wrong?

@ricardoV94 (Member):

Does the test pass if you duplicate the CAR and compare it against itself, instead of against the MvNormal? If it does, it means the two distributions you have now are not really equivalent.

@aerubanov (Contributor Author):

@ricardoV94, yes, if I duplicate the CAR distribution, the test passes. But as @ckrapu pointed out, this test is not correct for this distribution. So I will try to use the options he suggested.

@aerubanov (Contributor Author):

I experimented with different W matrix formats, and sometimes the test passed, but sometimes failed. So, it seems that the KS test is actually not appropriate here.

@brandonwillard (Contributor):

I experimented with different W matrix formats, and sometimes the test passed, but sometimes failed. So, it seems that the KS test is actually not appropriate here.

If they're only failing within a smallish window near the numerical limit/precision, then that's fine. Also, we'll need to seed the tests, which will obviously help.

@aerubanov (Contributor Author):

I changed the test for the CAR rng_fn method: now we compare the samples dimension by dimension, and it looks like it works better. @brandonwillard,

Also, we'll need to seed the tests, which will obviously help.

I added NumPy random seed setting before generating the samples. Is there anything else that needs to be added?

Comment on lines 1957 to 1958
W_sparse = scipy.sparse.csr_matrix(W)
W = aesara.sparse.as_sparse_variable(W_sparse)
Contributor:

Suggested change:

-  W_sparse = scipy.sparse.csr_matrix(W)
-  W = aesara.sparse.as_sparse_variable(W_sparse)
+  W = aesara.sparse.as_sparse_variable(W)

Just like before, W could be a TensorVariable, so calling scipy.sparse.csr_matrix isn't sound.

Also, as I mentioned in the other comment, the sparse option is unnecessary; the caller should provide a sparse variable if they want to use one, and the logic here should handle both cases.
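If it is available in the installed Aesara version, the as_sparse_or_tensor_variable helper covers both cases in one call (a small sketch):

import numpy as np
import scipy.sparse
import aesara.sparse

W_dense = np.eye(3)
W_sp = scipy.sparse.csr_matrix(W_dense)

# SciPy sparse input becomes a SparseVariable; dense input becomes a
# TensorVariable, so a single call handles both kinds of W
Wv_sparse = aesara.sparse.as_sparse_or_tensor_variable(W_sp)
Wv_dense = aesara.sparse.as_sparse_or_tensor_variable(W_dense)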

@ricardoV94 mentioned this pull request on May 12, 2021.
@aerubanov (Contributor Author) commented May 12, 2021

@brandonwillard, I removed the sparse argument, but now I have two problems:

  • When I try to pass an Aesara SparseVariable as W in the CAR constructor, I get this error:
     File "<ipython-input-13-b44074052d76>", line 2, in <module>
      car = pm.CAR("car", mu, ws, alpha, tau)
    ..........................
    File "/home/anatoly/HDD/Projects/pymc3/pymc3/distributions/multivariate.py", line 1963, in make_node
      return super().make_node(rng, size, dtype, mu, W, alpha, tau)
    File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/tensor/random/op.py", line 374, in make_node
      bcast = self.compute_bcast(dist_params, size)
    File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/configparser.py", line 49, in res
      return f(*args, **kwargs)
    File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/tensor/random/op.py", line 287, in compute_bcast
      shape = self._infer_shape(size, dist_params)
    File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/tensor/random/op.py", line 224, in _infer_shape
      params_ind_slice = tuple(
    File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/tensor/random/op.py", line 225, in <genexpr>
      slice_ind_dims(p, ps, n)
    File "/home/anatoly/anaconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/aesara/tensor/random/op.py", line 214, in slice_ind_dims
      for s, b in zip(shape[:-n], p.broadcastable[:-n])
    AttributeError: 'SparseVariable' object has no attribute 'broadcastable'
    
    But if I pass an ndarray or a TensorVariable, everything works fine.
  • The test for the CAR logp method fails for a 3D size, because aesara.sparse.dot() accepts only inputs with 1 or 2 dimensions.

What should I do about these errors?

aerubanov and others added 2 commits on July 12, 2021 (Co-authored-by: Ricardo Vieira <28983449+ricardoV94@users.noreply.github.com>)
@brandonwillard (Contributor):

@brandonwillard Is it okay to merge with the current .eval() hack or should we wait for an upstream fix / find another alternative if possible?

No, we shouldn't merge evaluations like that.

@aerubanov (Contributor Author):

@ricardoV94 I rebased on main, and I will open issues about .sum() for sparse tensors in Aesara and here.

@aerubanov (Contributor Author) commented Jul 12, 2021

@ricardoV94, I removed the .eval() hack. @brandonwillard pointed out to me in aesara-devs/aesara#522 that sp_sum() fits much better here.

@aerubanov (Contributor Author):

@ricardoV94, I changed the formatting and added tolerance selection based on the aesara.config.floatX value in test_car_logp. This should fix the failing tests.

@codecov (bot) commented Jul 13, 2021

Codecov Report

Merging #4596 (c89dad2) into main (a3ee747) will increase coverage by 0.84%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main    #4596      +/-   ##
==========================================
+ Coverage   72.32%   73.16%   +0.84%     
==========================================
  Files          85       86       +1     
  Lines       13884    13838      -46     
==========================================
+ Hits        10042    10125      +83     
+ Misses       3842     3713     -129     
Impacted Files                          Coverage Δ
pymc3/distributions/multivariate.py     71.45% <100.00%> (+7.61%) ⬆️
pymc3/tests/conftest.py                 90.47% <100.00%> (+2.24%) ⬆️
pymc3/distributions/dist_math.py        87.36% <0.00%> (-4.22%) ⬇️
pymc3/math.py                           67.85% <0.00%> (-0.49%) ⬇️
pymc3/distributions/discrete.py         98.97% <0.00%> (-0.03%) ⬇️
pymc3/__init__.py                       100.00% <0.00%> (ø)
pymc3/sampling_jax.py                   0.00% <0.00%> (ø)
pymc3/distributions/__init__.py         100.00% <0.00%> (ø)
pymc3/printing.py                       85.85% <0.00%> (ø)
pymc3/sampling.py                       85.67% <0.00%> (+0.01%) ⬆️
... and 10 more

@aerubanov (Contributor Author):

@ricardoV94, I added a check for W symmetry and a test for it.

@aerubanov requested a review from @ricardoV94 on July 22, 2021.
Review thread on the new symmetry test:

W = aesara.sparse.csr_from_dense(W)

car_dist = CAR.dist(mu, W, alpha, tau)
with pytest.raises(AssertionError):
Member:

Can you match the error message?

Suggested change:

-  with pytest.raises(AssertionError):
+  with pytest.raises(AssertionError, match="W must be a symmetric adjacency matrix"):

And add a test for the other error (ndim=2)?

@ricardoV94 (Member) left a review:

Looks almost there. Just suggested a tweak to the error tests.

@aerubanov (Contributor Author):

@ricardoV94, I added the changes you suggested.

@ricardoV94 (Member):

CC @ckrapu in case he has the chance to take a look before merging

@ckrapu (Contributor) commented Jul 26, 2021

Thanks for the ping - I am very excited to see that the fast CAR sampling method is implemented. I often find scenarios where I need to sample draws from this kind of random field even outside of the context of prior predictive sampling.

@ricardoV94 (Member) left a review:

Looks good to me, trusting the rng_fn algorithm is correctly implemented.

Thanks a lot @aerubanov! This was a tough one, hopefully your next PR will be easier to crack :D

@twiecki merged commit 819f045 into pymc-devs:main on Jul 27, 2021.
@twiecki (Member) commented Jul 27, 2021

Thanks @aerubanov, this was indeed a tricky one!

@aerubanov deleted the CAR_rand_var branch on July 27, 2021.
sthagen added a commit to sthagen/pymc-devs-pymc that referenced this pull request Jul 27, 2021
Merging this pull request closed the linked issue: Developing random method for conditional autoregression (#4518).