fixes #8416 (and duplicate issue #9664) #10106
Conversation
This looks like it could substantially slow down elementwise multiplication of sparse matrices with the same sizes, since […].
A better solution to this is to stop using 1-column matrices as sparse vectors. There is a PR open for a sparse vector type; it might be good to just get that into shape.
Ah yes, I forgot I had got the sparse vector work started. I'll try to pick it back up.
Shouldn't we make broadcasting work (efficiently) for SparseMatrices whether or not we have sparse vectors?
Yeah, that'd be great. Any proposals for how to go about doing that? This doesn't apply just to sparse, but to all of the structured matrix types, and not just to broadcast, but to all of the reductions and other "generic" operations throughout base. There isn't currently a great way of conveying or utilizing structural information in an extensible, truly generic way that works for anything that isn't dense. But now I'm repeating myself from JuliaLang/LinearAlgebra.jl#136. In the meantime: one method at a time, one type at a time.
For the case where one input is a SparseMatrix and one is a dense column or row vector, we can just call […].

For the case where both the inputs are SparseMatrices, one of which is a single row/column, and the output is a SparseMatrix, I'm not sure there is a sufficiently efficient general-purpose way to handle things, and we should probably just implement the methods.

As far as a general-purpose way to iterate for reductions etc., we could have something like an iterator version of […].
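For illustration, a minimal sketch of the stored-entry iteration pattern a CSC-aware reduction or broadcast kernel would follow. `foreach_stored` is an invented name, not a Base function; only the `colptr`/`rowval`/`nzval` fields reflect the actual SparseMatrixCSC layout (modern Julia would need `using SparseArrays`).

```julia
# Sketch: visit each stored entry of a CSC matrix in column-major order,
# calling f(row, col, value) once per stored slot and never touching
# structural zeros.
function foreach_stored(f, A::SparseMatrixCSC)
    colptr = A.colptr; rowval = A.rowval; nzval = A.nzval
    for j in 1:size(A, 2)                     # one pass per column
        for k in colptr[j]:(colptr[j+1] - 1)  # stored entries of column j
            f(rowval[k], j, nzval[k])
        end
    end
end

# Usage sketch: acc = Tuple{Int,Int,Float64}[]
#               foreach_stored((i, j, v) -> push!(acc, (i, j, v)), A)
```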
I think generic is the harder part, but we should at least make broadcast work with the current sparse data structures.
Things are never as easy as they look at first sight. Thanks for the performance tests and all the comments: really useful for me to better understand the problem. I would like to work on making broadcast efficient with SparseMatrixCSC. This first version would work only in 2D, so Cartesian indexing would not be used. Most of the code for efficient operations is already there; it would just be necessary to call it from a broadcast! method and ensure that sparse matrix operations use it consistently. I also think this is not incompatible with introducing new sparse types in the future. Does it make sense?
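As a rough sketch of the dispatch shape being proposed here, with an invented name (`sparse_map!`) and a deliberately naive body standing in for the real CSC-walking kernel; the real thing would be a `broadcast!` specialization:

```julia
# Hypothetical: an in-place elementwise method specialized on sparse
# arguments, so sparse inputs reach a sparse-aware implementation rather
# than the generic dense fallback. The O(m*n) loop below is only a
# stand-in; a CSC kernel would walk colptr/rowval instead.
function sparse_map!(f, C::SparseMatrixCSC, A::SparseMatrixCSC, B::SparseMatrixCSC)
    size(A) == size(B) == size(C) || throw(DimensionMismatch("sizes must match"))
    for j in 1:size(C, 2), i in 1:size(C, 1)
        C[i, j] = f(A[i, j], B[i, j])
    end
    return C
end
```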
Yes, if you're interested in taking a crack at it, please go ahead. For sparse broadcast, we may need a new classification to more clearly differentiate operators that produce a nonzero when either of their inputs is nonzero from operators that return a nonzero only when both of their inputs are nonzero.
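A small illustration of that classification, assuming Julia of this era (sparse functionality in Base; modern Julia would need `using SparseArrays`). The bounds follow directly from which input patterns can produce a nonzero:

```julia
# With sparse A and B, `*` yields a nonzero only where BOTH inputs are
# nonzero (result pattern within the intersection), while `+` yields a
# nonzero where EITHER input is nonzero (result pattern within the
# union). A kernel can exploit this to bound output storage up front.
A = sprand(100, 100, 0.1)
B = sprand(100, 100, 0.1)
@assert nnz(A .* B) <= min(nnz(A), nnz(B))   # "both nonzero" class
@assert nnz(A .+ B) <= nnz(A) + nnz(B)       # "either nonzero" class
```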
What's the status of this?
Sparse broadcast needs design work. Adding proper SparseVectors has an open PR, but with conflicts and in need of a rebase.
Regarding broadcast of SparseMatrixCSC only: I have been working on it and making progress, though not as fast as I'd like. I think I can have a proposal implemented by the end of this week. It would include a new […]. This would reflect the distinction between the types of operators mentioned by tkelman.
Sounds like an interesting approach; I look forward to seeing what you can come up with, @amartgon.
Here is the proposal. I think it can be a starting point to be generalized in the future for other sparse structures. Any comments are very welcome. A few simple performance tests are included below. Methods with the suffix "_old" are the old implementations of the operations, which I renamed this way before deleting them. First calls to methods (which trigger compilation) are omitted. My conclusion: most operations have a small overhead, but some cases, such as sparse-times-dense (`A.*BF` below), are substantially faster.

julia> A = sprand(10000,10000,0.3); B = sprand(10000,10000,0.3); BF = full(B);
julia> @time A.+B; #new
elapsed time: 0.976550561 seconds (915 MB allocated, 0.64% gc time in 1 pauses with 1 full sweep)
julia> @time A.+B; #new
elapsed time: 1.031963776 seconds (915 MB allocated, 0.61% gc time in 1 pauses with 1 full sweep)
julia> @time A.+B; #new
elapsed time: 1.058726632 seconds (915 MB allocated, 9.80% gc time in 1 pauses with 0 full sweep)
julia> @time A+B; #old
elapsed time: 1.018026417 seconds (915 MB allocated, 12.78% gc time in 2 pauses with 1 full sweep)
julia> @time A+B; #old
elapsed time: 0.990413792 seconds (915 MB allocated, 10.85% gc time in 2 pauses with 1 full sweep)
julia> @time A+B; #old
elapsed time: 0.961619903 seconds (915 MB allocated, 11.15% gc time in 2 pauses with 1 full sweep)
julia>
julia>
julia> @time A.*B; #new
elapsed time: 0.689984741 seconds (915 MB allocated, 0.86% gc time in 1 pauses with 1 full sweep)
julia> @time A.*B; #new
elapsed time: 0.688215981 seconds (915 MB allocated, 0.86% gc time in 1 pauses with 1 full sweep)
julia> @time A.*B; #new
elapsed time: 0.699050298 seconds (915 MB allocated, 1.01% gc time in 1 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,B); #old
elapsed time: 0.575538209 seconds (915 MB allocated, 7.48% gc time in 2 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,B); #old
elapsed time: 0.574641348 seconds (915 MB allocated, 7.61% gc time in 2 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,B); #old
elapsed time: 0.584391185 seconds (915 MB allocated, 7.42% gc time in 2 pauses with 1 full sweep)
julia>
julia>
julia> @time A.*BF; #new
elapsed time: 0.472082517 seconds (457 MB allocated, 1.39% gc time in 1 pauses with 1 full sweep)
julia> @time A.*BF; #new
elapsed time: 0.469774405 seconds (457 MB allocated, 1.42% gc time in 1 pauses with 1 full sweep)
julia> @time A.*BF; #new
elapsed time: 0.480360273 seconds (457 MB allocated, 1.16% gc time in 1 pauses with 1 full sweep)
julia> @time Base.SparseMatrix.times_old(A,BF); #old
elapsed time: 2.437033787 seconds (1602 MB allocated, 4.56% gc time in 5 pauses with 4 full sweep)
julia> @time Base.SparseMatrix.times_old(A,BF); #old
elapsed time: 2.500815834 seconds (1602 MB allocated, 5.77% gc time in 4 pauses with 3 full sweep)
julia> @time Base.SparseMatrix.times_old(A,BF); #old
elapsed time: 2.464224123 seconds (1602 MB allocated, 5.73% gc time in 4 pauses with 3 full sweep)
@@ -98,6 +99,207 @@ function gen_broadcast_body_iter(nd::Int, narrays::Int, f::Function)
    end
end

## Broadcasting cores specialized for returning a SparseMatrixCSC
More common terminology here would be "kernels" rather than "cores"
Yes, I agree. I'll correct it. Thanks!
Overall this looks pretty good; the […].
About the ~20% difference in `A.*B`: […]. There was also a ~10% difference in […].
I thought it was just a matter of style and omitted it when adapting the code. After changing it, the old and the new versions seem to have indistinguishable running times. I'll submit a corrected version.
Yeah, unfortunately manually hoisting field accesses into temporary variables does make a difference right now. Eventually Julia's compiler might get smart enough (edit: or aliasing assumptions in various types might get adjusted) to do that kind of thing automatically, but for now it's a little bit of manual code ugliness in exchange for better performance. I should've mentioned this in some code comments; at least you were able to identify it yourself. Also have a look at #10536, which is somewhat related (sparse reductions rather than broadcast). Be aware that the bootstrapping order changes here and there might result in some conflicts that would need to be rebased, depending on which gets merged first.
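For illustration, a hedged sketch of the hoisting idiom in question; both function names are invented, and the column-sum loop is just a stand-in for any hot loop over CSC fields.

```julia
# Slower variant: A.colptr and A.nzval are loaded from the struct on
# every iteration of the inner loop.
function colsums_naive(A::SparseMatrixCSC)
    s = zeros(eltype(A.nzval), size(A, 2))
    for j in 1:size(A, 2)
        for k in A.colptr[j]:(A.colptr[j+1] - 1)
            s[j] += A.nzval[k]
        end
    end
    return s
end

# Faster variant: hoist the fields into locals before the loop, so the
# compiler does not need to reload them on every iteration.
function colsums_hoisted(A::SparseMatrixCSC)
    colptr = A.colptr; nzval = A.nzval; n = size(A, 2)
    s = zeros(eltype(nzval), n)
    for j in 1:n
        for k in colptr[j]:(colptr[j+1] - 1)
            s[j] += nzval[k]
        end
    end
    return s
end
```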
I don't think it is realistic to move sparse support out of base. What is likely is to have new sparse data structure implementations in packages, and to make that easy to do.
Sparse matrices are one of those things, like bigints/bigfloats, FFTs, and hell, even BLAS and LAPACK, that shouldn't really be required runtime dependencies for every piece of code written in the language: if you're not using the functionality, they aren't absolutely necessary for the rest of the language to function. This is 1.0-type stuff, but eventually there should be ways of selectively disabling big chunks of what's currently in base, for embedding and static-compilation use cases. The shorter-term default (still probably not until 0.5 or later, I'm guessing) is that we'll probably want to move SuiteSparse out into a package for licensing reasons, but base will need to provide scaffolding for different packages to plug in various sparse formats and solver libraries.
This last commit includes the changes we've been commenting on, and also a reorganization of the new code (moving it from […]).
## element-wise comparison operators returning SparseMatrixCSC ##
.<{Tv1,Ti1,Tv2,Ti2}(A_1::SparseMatrixCSC{Tv1,Ti1}, A_2::SparseMatrixCSC{Tv2,Ti2}) = broadcast!(<, spzeros(Bool, promote_type(Ti1, Ti2), broadcast_shape(A_1, A_2)...), A_1, A_2)
.!={Tv1,Ti1,Tv2,Ti2}(A_1::SparseMatrixCSC{Tv1,Ti1}, A_2::SparseMatrixCSC{Tv2,Ti2}) = broadcast!(!=, spzeros(Bool, promote_type(Ti1, Ti2), broadcast_shape(A_1, A_2)...), A_1, A_2)
Nevermind: `!=` returns true on entries that are zero in both inputs, so this generally won't return sparse.
Not sure I understand... wouldn't that be `.==`?
whoops, too early in the morning, you're right
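To spell out the correction: `0 != 0` is false, so `.!=` stays zero on common zeros and the result stays sparse, while `0 == 0` is true, so `.==` is the operator that densifies. A tiny check, assuming Julia of this era (modern Julia would need `using SparseArrays`):

```julia
A = spzeros(3, 3)
B = spzeros(3, 3)
@assert !any(A .!= B)  # 0 != 0 is false: common zeros stay zero, result sparse
@assert all(A .== B)   # 0 == 0 is true: every entry compares true, dense-like
```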
Looks pretty good, thanks for rebasing. It will fail on AppVeyor due to a hasty broken-on-windows merge yesterday, sorry. I don't think […].
Here is a version where […].
I like it. I think it would be worth squashing these commits. @ViralBShah, @simonster: any feedback?
Any news about this? (Just a friendly reminder.)
Thanks for the heads-up. I really would prefer to see all the broadcasting code go into sparsematrix.jl. Is there a reason to still have some code in broadcast.jl? Also, can you squash these commits too and rebase onto the latest master? I do want to get this merged. Thanks for persisting.
@tanmaykm Could you please review this PR too?
(Force-pushed from e6d295d to df141bc.)
Hi! […]
@amartgon Thanks for this work and for being patient.
Great! Thanks to all who provided helpful comments!
I believe this fixes the problem with `Vector .* Sparse` and `Sparse .* Vector` (#8416 and #9664).