Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SparseMatrixCSC should probably be an immutable #15668

Closed
KristofferC opened this issue Mar 29, 2016 · 2 comments
Closed

SparseMatrixCSC should probably be an immutable #15668

KristofferC opened this issue Mar 29, 2016 · 2 comments
Labels
sparse Sparse arrays

Comments

@KristofferC
Copy link
Member

Currently SparseMatrixCSC is a mutable type (in contrast with SparseVector which is an immutable). This causes a few problems in the code base for sparse matrices.

The main problem is that LLVM is not able to hoist field accesses in loops for types. This is presumably because it is hard to prove that the address to the field stays the same during the loop. Therefore, in order to get good performance, almost all sparse matrix function starts with something similar to:

nzvalA = A.nzval
rowvalA = A.rowval
colptrA = A.colptr
nzvalB = B.nzval
rowvalB = B.rowval
colptrB = B.colptr

This is of course quite ugly, and different authors have different names for the hoisted fields leading to the code base being less friendly to read. It is also easy to forget the manual hoisting, which leads to less than ideal performance. A third problem is that you cannot use convenience functions like nzrange without a performance loss since this will cause extra dereferences of fields. The effect of this is that the sparse code base is pretty much fully inlined with almost no convenience functions being used.

The places where the mutability of the fields is used is in spset!, spdelete!, and three setindex! functions so not that many places. The way it is used in these places are that the "sparsity pattern" is being remade from scratch with new arrays and in the end the original fields of the matrix is simply swapped out for the new arrays. I am quite confident that rewriting these to create the new sparsity pattern in the already existing field is not too difficult.

I currently have a branch were everything builds and tests pass with SparseMatrixCSC being an immutable but the way I did it causes an extra copy in the functions mentioned previously. If I were to rewrite it to have the same performance as now, would a PR making SparseMatrixCSC immutable be desired? A follow up PR or a commit in the same one could look at removing all the manually hoisted fields and benchmarking to see that nothing gets slower.

@timholy
Copy link
Member

timholy commented Mar 29, 2016

I looked at this briefly too---it seems a necessary precondition for developing efficient custom iterators for sparse matrices.

I also agree with your analysis of where the trouble lies, and personally think it's worth it even if we do see a small hit for performance. Since I think those are all called from setindex!, which can't be all that fast anyway, I suspect there's very little downside.

Thanks for tackling this!

@simonster
Copy link
Member

See also #9755. While I have no objections to making SparseMatrixCSC immutable, I'd still eventually like to fix the more general issue, since it has affected my code in places where I can't conveniently make the type immutable. It would kind of surprise me if the necessary TBAA constraint (pointer fields/elements can't alias non-pointer fields/elements) affects any code at all, and slowing down common access patterns to allow this kind of aliasing it doesn't seem worth it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sparse Sparse arrays
Projects
None yet
Development

No branches or pull requests

4 participants