Sparse matrices? #29

Closed
ivirshup opened this issue Nov 4, 2019 · 18 comments
Labels
question Notes an issue as a question

Comments

@ivirshup

ivirshup commented Nov 4, 2019

@stuartarchibald, I saw on the numba gitter you were working on a scipy.sparse implementation here. I would really like to be able to use sparse matrices in compiled code, and have been implementing a bit of this myself, though primarily aiming at indexing into out-of-core sparse matrices. pydata/sparse has looked like an interesting target for this, but is missing the CSC and CSR formats.

I'd be keen to hear your perspective. What are your plans for sparse matrices here? How do you see this fitting in with pydata/sparse?

@esc added the question label Nov 4, 2019
@stuartarchibald
Contributor

@ivirshup I would hope to implement what's in scipy.sparse, but would want to focus on the compressed formats (CSR, CSC) first, as in my experience they are the formats most commonly consumed by high-performance libraries and the most prevalent in high-performance numerical code. In what I've put together so far I'm aiming for jit-transparency as per Numba, i.e. a function just works with or without the @jit decorator, the data types are the same, etc. As the SciPy internals for these layouts are NumPy-array based, the compute cost of translation to native code seems pretty minimal.
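To make the point about NumPy-array internals concrete, here is a minimal sketch and not the numba-scipy implementation. It assumes a CSR matrix is passed as its three underlying arrays (`data`, `indices`, `indptr`), and `csr_matvec` is a hypothetical helper name. The `njit` fallback only exists so the same function runs with or without Numba installed, which is the spirit of jit-transparency.

```python
import numpy as np

try:
    from numba import njit  # optional; the sketch also runs without Numba
except ImportError:
    def njit(func):
        # Fallback: leave the function as plain Python.
        return func


@njit
def csr_matvec(data, indices, indptr, x):
    """Compute y = A @ x for a CSR matrix given by its three arrays."""
    n_rows = indptr.shape[0] - 1
    y = np.zeros(n_rows, dtype=np.float64)
    for i in range(n_rows):
        acc = 0.0
        # Row i's stored entries live at positions indptr[i]:indptr[i + 1].
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc
    return y


# CSR arrays for the dense matrix [[1, 0, 2], [0, 0, 3], [4, 5, 0]]
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indices = np.array([0, 2, 2, 0, 1])
indptr = np.array([0, 2, 3, 5])
y = csr_matvec(data, indices, indptr, np.ones(3))
```

With a scipy.sparse CSR matrix `A`, the same three arrays are available as `A.data`, `A.indices` and `A.indptr`.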

I suspect that some work might need doing in SciPy/scikits to expose bindings to high performance libraries.

RE pydata/sparse, we'll see how the above goes first?

@SanPen

SanPen commented Nov 8, 2019

Hi @ivirshup and @stuartarchibald

A couple of months ago I started a numba-based sparse matrix library (CSparse3), which led me to find this project.

Indeed, access to the sparse solvers from scipy would be very nice. I don't know if there is any synergy here.

BR,
Santiago

@ivirshup
Author

@stuartarchibald definitely agree that the CSR and CSC formats seem like the most valuable, since they're what you need to work with outside libraries. It would definitely be cool to be able to use GraphBLAS with numba-defined types and methods.

My main concern with copying the scipy.sparse matrices' interface is the use of the deprecated np.matrix interface. If you're going for jit-transparency, I'd assume you're implementing the matrix interface here?
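For readers unfamiliar with the concern, the sketch below contrasts `np.matrix` semantics with plain `ndarray` semantics using NumPy only. It is an illustration, not code from numba-scipy; the variable names are arbitrary. The warning filter is there because constructing an `np.matrix` emits a `PendingDeprecationWarning`.

```python
import warnings

import numpy as np

a = np.array([[1, 2], [3, 4]])

with warnings.catch_warnings():
    # np.matrix construction emits a PendingDeprecationWarning.
    warnings.simplefilter("ignore", PendingDeprecationWarning)
    m = np.matrix([[1, 2], [3, 4]])
    matrix_product = m * m   # `*` is matrix multiplication for np.matrix
    matrix_row = m[0]        # indexing a row keeps the result 2-D

array_product = a * a        # `*` is elementwise for ndarray
array_row = a[0]             # indexing a row drops to 1-D
```

The scipy.sparse matrix classes follow the `np.matrix` conventions (for example, `*` means matrix multiplication), so a jit-transparent implementation would have to reproduce them.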

@SanPen, you might be interested to look at this issue and corresponding PR about making the scipy.sparse solvers available for pydata/sparse arrays: pydata/sparse#293 scipy/scipy#10901

@ivirshup
Author

On the topic of array interfaces and CSC/CSR matrices, there's a discussion here about (partially) providing an array interface to SciPy sparse arrays: scverse/scanpy#921

@brocksam

What's the current state of play with numba-scipy for scipy.sparse? I have a number of projects where I would love to be able to use Numba with scipy.sparse to speed up computations. I'm keen to get involved with this channel of work if I can be of assistance. I haven't contributed to Numba before but have reasonable experience with C/C++/Cython/SciPy, so I can hopefully contribute usefully in some way.

@stuartarchibald you're probably best placed to advise. Where are we up to with this so far? Is there a roadmap for what needs to be done? How can I be of assistance?

@PercyLau

Hi, @stuartarchibald @ivirshup @SanPen @brocksam

Supporting sparse matrices in numba is an important feature, especially for easily speeding up processing of big data sets that require a lot of memory. So, has there been any further discussion or progress on this issue? I'm also keen to get involved with this channel of work if I can be of assistance. Many projects need this feature to accelerate computations.

@esc
Member

esc commented Oct 16, 2020

@PercyLau I am not aware of any discussions regarding the implementation of sparse matrices in Numba. Perhaps it would make sense to trawl through our discourse: https://numba.discourse.group/ -- and perhaps start a discussion there if none exists?

@PercyLau

PercyLau commented Oct 19, 2020

> @PercyLau I am not aware of any discussions regarding the implementation of sparse matrices in Numba. Perhaps it would make sense to trawl through our discourse: https://numba.discourse.group/ -- and perhaps start a discussion there if none exists?

Of course, sparse matrices should be added. It is very weird that this issue has been discussed on GitHub and Stack Overflow many times, yet no discussion of it exists on numba.discourse.group.

@PercyLau

> @PercyLau I am not aware of any discussions regarding the implementation of sparse matrices in Numba. Perhaps it would make sense to trawl through our discourse: https://numba.discourse.group/ -- and perhaps start a discussion there if none exists?

@esc Perhaps this could be helpful: https://lkpy.readthedocs.io/en/stable/matrix.html#compressed-sparse-row-matrices, where the authors implement a numba-based sparse matrix. However, a more native version is necessary since sparse matrices are very common; e.g., someone may want to pass a sparse matrix into an njit function, which is not possible with a customized implementation.

@esc
Member

esc commented Oct 19, 2020

@PercyLau perhaps the following can help with bringing a third-party extension closer to Numba:

https://numba.pydata.org/numba-doc/dev/extending/entrypoints.html

@esc
Member

esc commented Oct 19, 2020

@PercyLau and there is also:

https://sparse.pydata.org/en/stable/index.html

Which, I believe, uses Numba under the hood.

@PercyLau

PercyLau commented Oct 20, 2020

> @PercyLau and there is also:
>
> https://sparse.pydata.org/en/stable/index.html
>
> Which, I believe, uses Numba under the hood.

@esc However, most of these projects lack CSR/CSC sparse matrices. I think that is because numba has no convenient sorting function for doing a lexsort, which is unavoidable when converting a COO matrix to CSR/CSC.
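As a rough illustration of the conversion being described, here is a plain NumPy sketch (not Numba-compiled, and not taken from any of the projects mentioned). `coo_to_csr` is a hypothetical helper name, and the sketch assumes there are no duplicate (row, column) entries, since it does not sum them.

```python
import numpy as np


def coo_to_csr(rows, cols, vals, n_rows):
    """Convert COO triplets to CSR arrays (data, indices, indptr)."""
    # np.lexsort treats the LAST key as primary: sort by row, then by column.
    order = np.lexsort((cols, rows))
    rows, cols, vals = rows[order], cols[order], vals[order]
    # Count the entries per row, then prefix-sum the counts into row pointers.
    counts = np.bincount(rows, minlength=n_rows)
    indptr = np.zeros(n_rows + 1, dtype=np.int64)
    indptr[1:] = np.cumsum(counts)
    return vals, cols, indptr


# Unsorted COO triplets for [[1, 0, 2], [0, 0, 3], [4, 5, 0]]
rows = np.array([2, 0, 1, 0, 2])
cols = np.array([1, 2, 2, 0, 0])
vals = np.array([5.0, 2.0, 3.0, 1.0, 4.0])
data, indices, indptr = coo_to_csr(rows, cols, vals, n_rows=3)
```

The lexsort is only needed to order the entries within each row; the row pointers themselves come from a counting pass over the row indices.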

@mdekstrand

Hi! I'm the author of LensKit, and am also looking at the state of Numba and sparse matrices more deeply, because I don't think the LensKit source code is a good place for our matrix utilities to live long-term (and indeed, if we can replace our sparse matrix, LensKit itself can be a pure-Python package).

I'm currently thinking about spinning our class out into a standalone package focused on CSR/CSC matrix support. I'd like to clean up our design a bit (right now it is admittedly pretty weird, in part because I haven't done anything to use Numba's lowering to hide the Python/Native matrix class distinction), and also transparently support MKL acceleration (when available) and either pure Numba operations or bindings to another sparse package that can be bundled with it. For my use case in LensKit, when MKL is available, it gives us a significant performance boost, but I would like to keep that transparent if possible. Right now one of my blockers is just figuring out how to ergonomically connect Numba and MKL's inspector-executor framework in a way that is also maximally memory-safe. jitclass's lack of support for __del__ is making that more difficult than I would like.
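One generic Python-side pattern for this kind of cleanup, shown purely as an assumption and not as the approach LensKit takes, is to keep ownership of the native handle in an ordinary Python object and register a `weakref.finalize` callback for it. `NativeHandle`, `destroy` and the integer handle are hypothetical stand-ins; no real MKL call is made.

```python
import weakref


class NativeHandle:
    """Owns an opaque native handle and releases it exactly once."""

    def __init__(self, handle, destroy):
        self.handle = handle
        # The callback receives the raw handle, not `self`, so it does not
        # keep this object alive.
        self._finalizer = weakref.finalize(self, destroy, handle)

    def close(self):
        # Safe to call repeatedly: a finalizer runs at most once.
        self._finalizer()

    @property
    def closed(self):
        return not self._finalizer.alive


released = []                       # stands in for a native "destroy" routine
h = NativeHandle(42, released.append)
h.close()
h.close()                           # second call is a no-op
```

The raw handle could then be passed into compiled code as a plain integer while the Python object controls its lifetime.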

@Filco306

Is there any update on this issue? I would love sparse matrix support in numba. :)

@esc
Member

esc commented Aug 30, 2021

> Is there any update on this issue? I would love sparse matrix support in numba. :)

Thank you for asking. Not at this time. Continue to watch this issue for updates and thank you for using Numba!

@mdekstrand

Just FYI, I have spun the sparse matrix code above into a separate package: https://csr.lenskit.org

@Filco306

I will try that @mdekstrand and reach out to you if I encounter any issues :D

@esc
Member

esc commented Apr 9, 2024

With #92 merged, this can now be closed.

@esc closed this as completed Apr 9, 2024

8 participants