fixes #8416 (and duplicate issue #9664) #10106
Conversation
This looks like it could substantially slow down elementwise multiplication of sparse matrices with the same sizes, since […].
A better solution to this is to stop using 1-column matrices as sparse vectors. There is a PR open for a sparse vector type; it might be good to just get that into shape.
Ah yes, I forgot I had got the sparse vector work started. I'll try to pick it back up.
Shouldn't we make broadcasting work (efficiently) for SparseMatrices whether or not we have sparse vectors?
Yeah, that'd be great. Any proposals for how to go about doing that? This doesn't apply just to sparse, but to all of the structured matrix types, and not just to broadcast, but to all of the reductions and other "generic" operations throughout base. There isn't currently a great way of conveying or utilizing structural information in an extensible, truly generic way that works for anything that isn't dense. But now I'm repeating myself from JuliaLang/LinearAlgebra.jl#136. In the meantime: one method at a time, one type at a time.
For the case where one input is a SparseMatrix and one is a dense column or row vector, we can just call […].

For the case where both the inputs are SparseMatrices, one of which is a single row/column, and the output is a SparseMatrix, I'm not sure there is a sufficiently efficient general-purpose way to handle things, and we should probably just implement the methods.

As far as a general-purpose way to iterate for reductions etc., we could have something like an iterator version of […].
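For illustration, a minimal sketch of the stored-entry iteration pattern a CSC-aware reduction or broadcast kernel would follow. `foreach_stored` is an invented name, not a Base function; only the `colptr`/`rowval`/`nzval` fields reflect the actual SparseMatrixCSC layout (modern Julia would need `using SparseArrays`).

```julia
# Sketch: visit each stored entry of a CSC matrix in column-major order,
# calling f(row, col, value) once per stored slot and never touching
# structural zeros.
function foreach_stored(f, A::SparseMatrixCSC)
    colptr = A.colptr; rowval = A.rowval; nzval = A.nzval
    for j in 1:size(A, 2)                     # one pass per column
        for k in colptr[j]:(colptr[j+1] - 1)  # stored entries of column j
            f(rowval[k], j, nzval[k])
        end
    end
end

# Usage sketch: acc = Tuple{Int,Int,Float64}[]
#               foreach_stored((i, j, v) -> push!(acc, (i, j, v)), A)
```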
I think generic is the harder part, but we should at least make broadcast work with the current sparse data structures.
Things are never as easy as they look at first sight. Thanks for the performance tests and all the comments: really useful for me to better understand the problem. I would like to work on making broadcast efficient with SparseMatrixCSC. This first version would work only in 2D, so Cartesian indexing would not be used. Most of the code for efficient operations is already there; it would just be necessary to call it from a broadcast! method and ensure that sparse matrix operations use it consistently. I also think this is not incompatible with introducing new sparse types in the future. Does it make sense?
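As a rough sketch of the dispatch shape being proposed here, with an invented name (`sparse_map!`) and a deliberately naive body standing in for the real CSC-walking kernel; the real thing would be a `broadcast!` specialization:

```julia
# Hypothetical: an in-place elementwise method specialized on sparse
# arguments, so sparse inputs reach a sparse-aware implementation rather
# than the generic dense fallback. The O(m*n) loop below is only a
# stand-in; a CSC kernel would walk colptr/rowval instead.
function sparse_map!(f, C::SparseMatrixCSC, A::SparseMatrixCSC, B::SparseMatrixCSC)
    size(A) == size(B) == size(C) || throw(DimensionMismatch("sizes must match"))
    for j in 1:size(C, 2), i in 1:size(C, 1)
        C[i, j] = f(A[i, j], B[i, j])
    end
    return C
end
```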
Yes, if you're interested in taking a crack at it, please go ahead. For sparse broadcast, we may need a new classification to more clearly differentiate operators that produce a nonzero when either of their inputs is nonzero from operators that return a nonzero only when both of their inputs are nonzero.
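A small illustration of that classification, assuming Julia of this era (sparse functionality in Base; modern Julia would need `using SparseArrays`). The bounds follow directly from which input patterns can produce a nonzero:

```julia
# With sparse A and B, `*` yields a nonzero only where BOTH inputs are
# nonzero (result pattern within the intersection), while `+` yields a
# nonzero where EITHER input is nonzero (result pattern within the
# union). A kernel can exploit this to bound output storage up front.
A = sprand(100, 100, 0.1)
B = sprand(100, 100, 0.1)
@assert nnz(A .* B) <= min(nnz(A), nnz(B))   # "both nonzero" class
@assert nnz(A .+ B) <= nnz(A) + nnz(B)       # "either nonzero" class
```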
What's the status of this?
Sparse broadcast needs design work. Adding proper SparseVectors has an open PR, but with conflicts and in need of a rebase.
Regarding broadcast of SparseMatrixCSC only: I have been working on it and making progress, though not as fast as I'd like. I think I can have a proposal implemented by the end of this week. It would include a new […]. This would reflect the distinction between the types of operators mentioned by tkelman.
Sounds like an interesting approach; I look forward to seeing what you can come up with, @amartgon.
Here is the proposal. I think it can be a starting point to be generalized in the future for other sparse structures. Any comments are very welcome. A few simple performance tests are included below. Methods with the suffix "_old" are the old implementations of the operations, which I renamed this way before deleting them. First calls to methods (which trigger compilation) are omitted. My conclusion: most operations have a small overhead, but some cases, such as sparse-times-dense (`A.*BF` below), are substantially faster.

julia> A = sprand(10000,10000,0.3); B = sprand(10000,10000,0.3); BF = full(B);
julia> @time A.+B; #new
elapsed time: 0.976550561 seconds (915 MB allocated, 0.64% gc time in 1 pauses with 1 full sweep)
julia> @time A.+B; #new
elapsed time: 1.031963776 seconds (915 MB allocated, 0.61% gc time in 1 pauses with 1 full sweep)
julia> @time A.+B; #new
elapsed time: 1.058726632 seconds (915 MB allocated, 9.80% gc time in 1 pauses with 0 full sweep)
julia> @time A+B; #old
elapsed time: 1.018026417 seconds (915 MB allocated, 12.78% gc time in 2 pauses with 1 full sweep)
julia> @time A+B; #old
elapsed time: 0.990413792 seconds (915 MB allocated, 10.85% gc time in 2 pauses with 1 full sweep)
julia> @time A+B; #old
elapsed time: 0.961619903 seconds (915 MB allocated, 11.15% gc time in 2 pauses with 1 full sweep)
julia>
julia>
julia> @time A.*B; #new
elapsed time: 0.689984741 seconds (915 MB allocated, 0.86% gc time in 1 pauses with 1 full sweep)
julia> @time A.*B; #new
elapsed time: 0.688215981 seconds (915 MB allocated, 0.86% gc time in 1 pauses with 1 full sweep)
julia> @time A.*B; #new
elapsed time: 0.699050298 seconds (915 MB allocated, 1.01% gc time in 1 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,B); #old
elapsed time: 0.575538209 seconds (915 MB allocated, 7.48% gc time in 2 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,B); #old
elapsed time: 0.574641348 seconds (915 MB allocated, 7.61% gc time in 2 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,B); #old
elapsed time: 0.584391185 seconds (915 MB allocated, 7.42% gc time in 2 pauses with 1 full sweep)
julia>
julia>
julia> @time A.*BF; #new
elapsed time: 0.472082517 seconds (457 MB allocated, 1.39% gc time in 1 pauses with 1 full sweep)
julia> @time A.*BF; #new
elapsed time: 0.469774405 seconds (457 MB allocated, 1.42% gc time in 1 pauses with 1 full sweep)
julia> @time A.*BF; #new
elapsed time: 0.480360273 seconds (457 MB allocated, 1.16% gc time in 1 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,BF); #old
elapsed time: 2.437033787 seconds (1602 MB allocated, 4.56% gc time in 5 pauses with 4 full sweep)
julia> @time Base.SparseMatrix.times_old(A,BF); #old
elapsed time: 2.500815834 seconds (1602 MB allocated, 5.77% gc time in 4 pauses with 3 full sweep)
julia> @time Base.SparseMatrix.times_old(A,BF); #old
elapsed time: 2.464224123 seconds (1602 MB allocated, 5.73% gc time in 4 pauses with 3 full sweep)
@@ -98,6 +99,207 @@ function gen_broadcast_body_iter(nd::Int, narrays::Int, f::Function)
    end
end

## Broadcasting cores specialized for returning a SparseMatrixCSC
More common terminology here would be "kernels" rather than "cores"
Yes, I agree. I'll correct it. Thanks!
Overall this looks pretty good; the […].
About the ~20% difference in `A.*B`: […]. There was also a ~10% difference in […].
I thought it was just a matter of style and omitted it when adapting the code. After changing it, the old and the new versions seem to have indistinguishable running times. I'll submit a corrected version.
Yeah, unfortunately manually hoisting field accesses into temporary variables does make a difference right now. Eventually Julia's compiler might get smart enough (edit: or aliasing assumptions in various types might get adjusted) to do that kind of thing automatically, but for now it's a little bit of manual code ugliness in exchange for better performance. I should've mentioned this in some code comments; at least you were able to identify it yourself. Also have a look at #10536, which is somewhat related (sparse reductions rather than broadcast). Be aware that the bootstrapping order changes here and there might result in some conflicts that would need to be rebased, depending on which gets merged first.
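For illustration, a hedged sketch of the hoisting idiom in question; both function names are invented, and the column-sum loop is just a stand-in for any hot loop over CSC fields.

```julia
# Slower variant: A.colptr and A.nzval are loaded from the struct on
# every iteration of the inner loop.
function colsums_naive(A::SparseMatrixCSC)
    s = zeros(eltype(A.nzval), size(A, 2))
    for j in 1:size(A, 2)
        for k in A.colptr[j]:(A.colptr[j+1] - 1)
            s[j] += A.nzval[k]
        end
    end
    return s
end

# Faster variant: hoist the fields into locals before the loop, so the
# compiler does not need to reload them on every iteration.
function colsums_hoisted(A::SparseMatrixCSC)
    colptr = A.colptr; nzval = A.nzval; n = size(A, 2)
    s = zeros(eltype(nzval), n)
    for j in 1:n
        for k in colptr[j]:(colptr[j+1] - 1)
            s[j] += nzval[k]
        end
    end
    return s
end
```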
I don't think it is realistic to move sparse support out of base. What is likely is to have new sparse data structure implementations in packages, and to make that easy to do.
Sparse matrices are one of those things, like bigints/bigfloats, FFTs, and hell, even BLAS and LAPACK, that shouldn't really be required runtime dependencies for every piece of code written in the language: if you're not using the functionality, they aren't absolutely necessary for the rest of the language to function. This is 1.0-type stuff, but eventually there should be ways of selectively disabling big chunks of what's currently in base, for embedding and static-compilation use cases. The shorter-term default (still probably not until 0.5 or later, I'm guessing) is that we'll probably want to move SuiteSparse out into a package for licensing reasons, but base will need to provide scaffolding for different packages to plug in various sparse formats and solver libraries.
This last commit includes the changes we've been commenting on, and also a reorganization of the new code (moving it from […]).
## element-wise comparison operators returning SparseMatrixCSC ##
.<{Tv1,Ti1,Tv2,Ti2}(A_1::SparseMatrixCSC{Tv1,Ti1}, A_2::SparseMatrixCSC{Tv2,Ti2}) = broadcast!(<, spzeros(Bool, promote_type(Ti1, Ti2), broadcast_shape(A_1, A_2)...), A_1, A_2)
.!={Tv1,Ti1,Tv2,Ti2}(A_1::SparseMatrixCSC{Tv1,Ti1}, A_2::SparseMatrixCSC{Tv2,Ti2}) = broadcast!(!=, spzeros(Bool, promote_type(Ti1, Ti2), broadcast_shape(A_1, A_2)...), A_1, A_2)
Nevermind: `!=` returns true on entries that are zero in both inputs, so this generally won't return sparse.
Not sure I understand... wouldn't that be `.==`?
whoops, too early in the morning, you're right
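To spell out the correction: `0 != 0` is false, so `.!=` stays zero on common zeros and the result stays sparse, while `0 == 0` is true, so `.==` is the operator that densifies. A tiny check, assuming Julia of this era (modern Julia would need `using SparseArrays`):

```julia
A = spzeros(3, 3)
B = spzeros(3, 3)
@assert !any(A .!= B)  # 0 != 0 is false: common zeros stay zero, result sparse
@assert all(A .== B)   # 0 == 0 is true: every entry compares true, dense-like
```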
Looks pretty good, thanks for rebasing. It will fail on AppVeyor due to a hasty broken-on-windows merge yesterday, sorry. I don't think […].
Here is a version where […].
I like it. I think it would be worth squashing these commits. @ViralBShah, @simonster: any feedback?
Any news about this? (Just a friendly reminder.)
Thanks for the heads-up. I really would prefer to see all the broadcasting code go into sparsematrix.jl. Is there a reason to still have some code in broadcast.jl? Also, can you squash these commits too and rebase onto the latest master? I do want to get this merged. Thanks for persisting.
@tanmaykm Could you please review this PR too?
(Force-pushed from e6d295d to df141bc.)
Hi! […]
@amartgon Thanks for this work and for being patient.
Great! Thanks to all who provided helpful comments!
I believe this fixes the problem with `Vector .* Sparse` and `Sparse .* Vector` (#8416 and #9664).