Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common abstractions for nonstructural indicies between Dense and Sparse Matrices #50

Open
oxinabox opened this issue Mar 26, 2018 · 8 comments

Comments

@oxinabox
Copy link
Contributor

indices is our abstraction for getting the indices so one can iterate the index of a matrix.
It works great for dense matrices.

However, for sparse matrices, it is almost never what you want to do.
You instread only want to iterate the indicies of the nonstructural elements.
(because there are less of them)
That is done using rowvals, and nzrange

In a dense matrix one can say that all the elements are nonstructural.
So in a circumstance when you only want to iterate through the nonstructural elements,
you want to iterate through all the elements in a dense matrix.
More generally with a matrix of unknown type, you want to iterate through all the elements.

I thus propose that we should have an abstraction for getting the nonstructural indicies,
which falls back to getting all the indiices,
to make it easier to write code that works efficiently on spare matricies, and also works (as fast as is possible) on other types.

Some thing like

const SparseMatrix = AbstractSparseArray{<:Any,<:Any,2}
colinds(A::AbstractMatrix)  = indices(A,2)
colinds(A::SparseMatrix) = rowvals(A)

rowinds(A::AbstractMatrix, col::Integer) = indices(A,1)
rowinds(A::SparseMatrix, col::Integer) = nzrange(A, col)

(bikeshed on names pending)

There is also a similar relationship between nonzeros (sparse) and vec (dense/fallback).

I was discussing this on slack with @mbauman and @StefanKarpinski the other day,
and wanted to put it on GitHub before it was lost to the ages

@mbauman
Copy link
Contributor

mbauman commented Mar 26, 2018

Somewhat related is JuliaLang/julia#15648, but I like how this turns the problem on its head: instead of trying to wedge sparse matrices into a generic API, we simply "lift" dense array support onto abstract sparse APIs. This seems far more tractable.

Some of this abstract sparse API is already in place for broadcasting; it'd be interesting to see what the performance difference would be were we to emulate sparse APIs for structured and dense matrices instead of converting them to actual AbstractSparseArrays.

@oxinabox
Copy link
Contributor Author

Oh I think this is row I think nzrange doesn't quiet do this?
But the point stands

@oxinabox
Copy link
Contributor Author

I think what I ment to have is:

colinds(A::AbstractMatrix)  = indices(A,2)

rowinds(A::AbstractMatrix, col::Integer) = indices(A,1)
rowinds(A::SparseMatrixCSC, col::Integer) = rowvals(A)[nzrange(A, col)]

@martinholters
Copy link
Contributor

That API seems to be tied pretty closely to CSC layout. How about an each_nonstructural_index (name subject to bikeshedding, of course)? Iterating over its return value would need to produce indices that allow efficient getindex but are still (reasonable efficiently) convertible to LinearIndex/CartesianIndex if they aren't one of those already.

So the following should be reasonably efficient and leave A==B:

B = zeros(size(A))
for i in each_nonstructural_index(A)
    B[LinearIndices(A)[i]] = A[i]
end

@oxinabox
Copy link
Contributor Author

you're right, my proposal would not work for say CRC.
Or any row major format really.
But we do not have such a thing.
So maybe it does not matter.

Take a look at the example of use in the Chinese Whisphers clustering.
Does that translate sanely to something like
colinds, rowinds = zip(each_nonstructural_index(A)...) ?

(I've not thought too hard if it does or not. I think it does not -- at least not that simply)

@Sacha0
Copy link
Member

Sacha0 commented Mar 27, 2018

it'd be interesting to see what the performance difference would be were we to emulate sparse APIs for structured and dense matrices instead of converting them to actual AbstractSparseArrays.

This was my eventual hope with the sparse broadcast infrastructure. Perhaps in a few months... :)

@jmert
Copy link
Contributor

jmert commented Mar 29, 2018

Somewhat related is JuliaLang/julia#15648, but I like how this turns the problem on its head: instead of trying to wedge sparse matrices into a generic API, we simply "lift" dense array support onto abstract sparse APIs. This seems far more tractable.

For what it's worth, I came up with the same idea/need when trying to implement efficient sparse-sparse, dense-sparse, and sparse-dense outer products for my own needs (and tried to make an attempt at something committable in JuliaLang/julia#24980, but that's obviously fallen far behind in all the linear algebra and broadcasting changes since then).

So I guess with respect to

That API seems to be tied pretty closely to CSC layout. How about an each_nonstructural_index (name subject to bikeshedding, of course)?

I'm in favor of having a portable way of doing CSC-like operations on dense arrays given the fact that CSC-aware algorithms can easily consume dense arrays that way for free (and then a more generic sparse infrastructure can be added later).

@oxinabox
Copy link
Contributor Author

I guess we can have

  • Maximally abstract: indicies, and maybe a nonstructual_indicies like @martinholters suggests
  • Moderately abstract Column major methods that I proposed above (works for CSC/Dense)
    - (Possibly could exend those with a flag that makes them work row major? Or just define then names to not say row or column...)
  • Minimally abstract: nzrange, rowvals (only CSC)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants