
Sparse Op implementations in Numba #658

Open
brandonwillard opened this issue Nov 9, 2021 · 4 comments
Labels: help wanted, important, Numba, sparse tensors

Comments

@brandonwillard
Member

We need to determine how we're going to support the sparse matrix Ops in Numba.

Our sparse matrix Ops are basically just interfaces to scipy.sparse, and neither Numba nor numba-scipy currently supports scipy.sparse, so we would need to create our own means of accessing scipy.sparse at the C level and/or implement the appropriate functions/objects in pure Numba.

The former (i.e. C) approach might be easier to implement in the short-term, and it could involve the same mechanisms as numba/numba#7524. Ideally, this work would be done in numba-scipy and used by Aesara, but we could consider transitioning the code from Aesara to numba-scipy if that helps us replace the C backend sooner.

@brandonwillard added the help wanted, important, and Numba labels on Nov 9, 2021
@brandonwillard added this to the Replace C Backend milestone on Nov 9, 2021
@brandonwillard pinned this issue on Dec 11, 2021
@kc611
Contributor

kc611 commented Jan 5, 2022

In numba/numba#7524, the conclusion was that the C-level approach performs (performance-wise) about the same as a simple call to Numba's object_mode for a comparable operation. If that's the case, would the latter option (i.e., implementing the appropriate functions/objects in pure Numba) be more appropriate?

@brandonwillard
Member Author

Yes, we can always use object mode for this, and we might need to do that in order to replace the C backend; however, we don't want to lose any performance gains we might currently be getting from the sparse COps.

It looks like some Numba-compatible sparse matrix code is available, as mentioned in the main Numba + sparse matrix issue here.

@kc611
Contributor

kc611 commented Jan 6, 2022

Seems like there's a Numba-compatible implementation at https://github.com/lenskit/csr. It has the basic operations coded as Numba functions, so we could use those to build our own versions. It seems to be limited to CSR matrices, but I'd guess similar lowering and architectural logic could be extended to CSC and BSR.

Preferably, I'd say we should try lowering SparseType with logic similar to

https://github.com/lenskit/csr/blob/03fde2d8c3cb7eb330028f34765ff2a06f849631/csr/_struct.py#L10-L28

And shouldn't CSM and CSR be Type subclasses of SparseType rather than Ops?

```python
class CSM(Op):
    # See doc in instance of this Op or function after this class definition.
    """
    Indexing to specified what part of the data parameter
    should be used to construct the sparse matrix.
    """
```

The only reason I see that they were made an Op is the computation required to convert a TensorType or normal array to a 'CSM' SparseType or SciPy sparse array (and vice versa), which is represented by the grad method of an Op. But this requires more logic to keep track of its properties, like

```python
# don't make this a function or it breaks some optimizations below
csm_properties = CSMProperties()
```

whose docstring reads:

```python
"""
Extract all of .data, .indices, .indptr and .shape field.
For specific field, `csm_data`, `csm_indices`, `csm_indptr`
and `csm_shape` are provided.

Parameters
----------
csm
    Sparse matrix in CSR or CSC format.

Returns
-------
(data, indices, indptr, shape), the properties of `csm`.

Notes
-----
The grad implemented is regular, i.e. not structured.
`infer_shape` method is not available for this op.
"""
```

We could just as easily make an Op specifically for these conversions, which, along with a Type for each sparse format, would prevent things like the above by simply having those properties as attributes of the Type. This would also eliminate the need for a separate format tag for tracking the 'type' of a SparseType:

```python
csc_dmatrix = SparseType(format="csc", dtype="float64")
csr_dmatrix = SparseType(format="csr", dtype="float64")
bsr_dmatrix = SparseType(format="bsr", dtype="float64")
csc_fmatrix = SparseType(format="csc", dtype="float32")
csr_fmatrix = SparseType(format="csr", dtype="float32")
bsr_fmatrix = SparseType(format="bsr", dtype="float32")
```

This will also help with the Type issues we see in other classes; for instance, #712.
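To sketch what that proposal might look like in plain Python (hypothetical classes, not Aesara's actual API): one Type subclass per sparse format, so a variable's format can be read off its type rather than recovered by inspecting `x.owner.op`.

```python
# Hypothetical sketch, not Aesara's actual API.
class SparseType:
    def __init__(self, format, dtype):
        # the format tag still exists, but only as a Type attribute
        assert format in ("csc", "csr", "bsr")
        self.format = format
        self.dtype = dtype

class CSRType(SparseType):
    def __init__(self, dtype):
        super().__init__("csr", dtype)

class CSCType(SparseType):
    def __init__(self, dtype):
        super().__init__("csc", dtype)

# Format checks become isinstance/attribute checks on the type itself:
csr_dmatrix = CSRType("float64")
```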

@brandonwillard
Member Author

brandonwillard commented Jan 6, 2022

There's a lot of refactoring needed in that sparse matrix code; some of it was started in #303, but that's definitely not all of it.

And shouldn't the CSM and CSR be a Type subclass of SparseType rather than an Op ?

Those Ops are serving as general sparse matrix constructors/conversion functions, so they're necessary as Ops.

The thing to consider when it comes to Types is what information is assumed to be static/constant, and the inputs to CSM and CSR include things that are necessarily not static/constant (e.g. the data).

Sub-Types could always be made for CSM and CSR matrices, but they might only serve to more conveniently identify which is which during static analysis (e.g. one would no longer need to check x.owner.op to see if x is a CSM matrix).

CSMProperties is just a single Op that summarizes some run-time property retrieval operations on sparse matrices. When the sparse matrix being operated upon is the output of a CSM/CSR Op, those properties shouldn't need to be computed—as they would with the CSMProperties Op—because they're the inputs to the parent CSM/CSR Op. The idea is that a rewrite would remove this inefficiency (I'm not sure if there is one, though).

Assuming sparse matrices are only ever the output of a CSM/CSR Op (or a similar Op that has the exact same type of inputs), those rewrites do effectively make CSMProperties a weird redundancy; however, I'm not entirely sure that this assumption is always true, and, if it's not, CSMProperties is necessary.
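As a concrete illustration of that redundancy, using scipy.sparse directly (not Aesara's graph machinery): when a CSR matrix is built from its property arrays, a CSMProperties-style extraction just hands back the constructor's inputs.

```python
import numpy as np
import scipy.sparse as sp

# Build a CSR matrix from (data, indices, indptr, shape)...
data = np.array([1.0, 2.0, 3.0])
indices = np.array([0, 2, 1], dtype=np.int32)
indptr = np.array([0, 2, 3], dtype=np.int32)
m = sp.csr_matrix((data, indices, indptr), shape=(2, 3))

# ...and "extract its properties": exactly the constructor's inputs again.
# A rewrite could therefore replace CSMProperties(CSM(d, i, p, s))
# with (d, i, p, s) whenever the parent CSM/CSR Op is visible.
props = (m.data, m.indices, m.indptr, m.shape)
```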
