Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize linalg.cholesky beyond 2D arrays #791

Open
Tracked by #1424
ricardoV94 opened this issue Jan 25, 2022 · 6 comments
Open
Tracked by #1424

Generalize linalg.cholesky beyond 2D arrays #791

ricardoV94 opened this issue Jan 25, 2022 · 6 comments
Labels
enhancement New feature or request help wanted Extra attention is needed NumPy compatibility

Comments

@ricardoV94
Copy link
Contributor

ricardoV94 commented Jan 25, 2022

import numpy as np
import aesara.tensor as at

x = np.full((2, 3, 3), np.eye(3))
np.linalg.cholesky(x)  # broadcast operation fine
at.linalg.cholesky(x)  # AssertionError x.ndim == 2
@brandonwillard brandonwillard added the help wanted Extra attention is needed label Jan 25, 2022
@Sayam753
Copy link

Sayam753 commented Feb 2, 2022

Hi
I have started to look into this issue to make progress on pymc-devs/pymc#5424. The source code of Cholesky Op defines a L_Op method:

def L_op(self, inputs, outputs, gradients):

To generalize linalg.cholesky beyond 2d arrays, do we need to update this method as well alongside changing perform method?

@brandonwillard
Copy link
Member

To generalize linalg.cholesky beyond 2d arrays, do we need to update this method as well alongside changing perform method?

Yes, as well as Op.infer_shape and everything else. It's always possible to raise exceptions in those cases, but that would make the Op/new capability much less useful.

@lucianopaz
Copy link
Contributor

@Sayam753, did you start working on this? I'd like to try doing it if not.

@Sayam753
Copy link

@lucianopaz no, feel free to work on the issue. I am still wrapping my head around how the grad operations in Op work.

@lucianopaz
Copy link
Contributor

I've got a working Cholesky Op, but I'm kind of struggling with the gradient. Aesara's scipy wrapper around Cholesky uses solve_upper_triangular, which expects 2-D arrays. Scipy, doesn't seem to have anything that we can use to solve a linear equation with a batch of triangular matrices. Numpy, has solve that should be slower and maybe less stable than an algorithm that exploits the triangular nature of the matrix, but it can handle batches of matrices. The thing is that aesara does not have an Op that wraps numpy solve. Should I open an issue to add one? I feel it shouldn't be done on a PR that is trying to get cholesky to work.

As a side note, there are also a bunch of functions tril, diag and diagonal that expected matrices, but it was easy to code up extensions that could accommodate batches of matrices simply using advanced indexing. Should those be made available in nlinalg module as well as cholesky?

@ricardoV94
Copy link
Contributor Author

ricardoV94 commented Feb 10, 2022

@lucianopaz The perform method of the Op is free to do whatever it wants, including Python loops across batches, or just call a np.vectorized version of the scipy solve if we have reason to believe that one is better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed NumPy compatibility
Projects
None yet
Development

No branches or pull requests

4 participants