broadcast[!] over combinations of scalars and sparse vectors/matrices #19724
Conversation
Do inference failures affect common cases or only very specific ones? As long as the behavior is correct, we could merge this before the feature freeze, and improve performance later.
broadcast{Tf,T}(f::Tf, ::Type{T}, A::SparseMatrixCSC) = broadcast(y -> f(T, y), A)
broadcast{Tf,T}(f::Tf, A::SparseMatrixCSC, ::Type{T}) = broadcast(x -> f(x, T), A)
end
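A minimal sketch of what these two definitions accomplish (example values are mine, not from the PR; modern `SparseArrays` syntax rather than the 0.6-era `broadcast{Tf,T}` form): a `Type` argument is absorbed into a unary closure so the existing single-sparse-argument `broadcast` method can be reused.

```julia
using SparseArrays  # SparseMatrixCSC now lives in the SparseArrays stdlib

A = sparse([1.5 0.0; 0.0 -2.5])

# broadcast(f, T, A) reduces to broadcast(y -> f(T, y), A); here f = round, T = Int:
B = broadcast(y -> round(Int, y), A)
# B remains sparse; only the stored entries are rounded, and zeros map to zero.
```

The same trick in the second definition handles the `Type` appearing as the trailing argument.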
missing newline
_defargforcol_all(j, tail(isemptys), tail(expandsverts), tail(ks), tail(As))...)
# fusing the following defs. avoids a few branches and construction of a tuple, yielding 1-20% runtime reduction
# @inline _isactiverow(activerow, row) = row == activerow
# @inline _isactiverow_all(activerow, ::Tuple{}) = ()
What are all of these commented-out definitions?
The commented-out definitions are what I fused into _fusedupdatebc (for performance) per the comment on line 797. I left the unfused definitions in place (along with commented-out calls to those methods before the calls to _fusedupdatebc) as an easier-to-follow form of the fused version. Best! (These commits were part of #19518.)
# @inline _updaterow_all(rowsentinel, activerows, rows, ks, stopks, As) = (
#     _updaterow(rowsentinel, first(activerows), first(rows), first(ks), first(stopks), first(As)),
#     _updaterow_all(rowsentinel, tail(activerows), tail(rows), tail(ks), tail(stopks), tail(As))...)
@inline function _fusedupdatebc(rowsentinel, activerow, row, defarg, k, stopk, A)
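For readers following along, here is a simplified illustration of what "fusing" means here (names and signature are mine; the real _fusedupdatebc operates on SparseMatrixCSC internals): the row-activity test, default-argument selection, and column-pointer advance happen in one function that returns a tuple directly, rather than threading three separate tuple-returning helpers.

```julia
# Illustrative sketch of a fused per-argument update, not the actual Base code.
@inline function fusedupdate_sketch(rowsentinel, activerow, row, defarg, k, stopk, rows, vals)
    if row == activerow
        # This argument has a stored entry at the active row: take its value
        # and advance to the next stored row (or the sentinel if exhausted).
        nextk = k + 1
        nextrow = nextk < stopk ? rows[nextk] : rowsentinel
        return vals[k], nextrow, nextk
    else
        # Inactive at this row: contribute the default value; state unchanged.
        return defarg, row, k
    end
end
```

Doing all three steps in one branch avoids constructing intermediate tuples per argument per row, which is where the quoted 1-20% runtime reduction comes from.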
Some comment on what _fusedupdate is supposed to do would be helpful.
Please see my response above. Best!
# argument tuple (passedargstup) containing only the sparse vectors/matrices in mixedargs
# in their orginal order, and such that the result of broadcast(g, passedargstup...) is
# broadcast(f, mixedargs...)
@inline capturescalars(f, mixedargs) =
Is this useful to do (as an optimization) for non-sparse broadcast as well?
Possibly, if the inference and allocation issues impacting performance (see the original post, and this comment in test/sparse/higherorderfns.jl:244) were fixed. Though, as in the original post's last paragraph, I plan to head in the other direction for now. Best!
@nanosoldier
Most cases involving one or two sparse vector/matrix arguments and a small number of scalars seem to avoid the inference issue, though they do hit the (much less significant) issue (1) mentioned in this comment in test/sparse/higherorderfns.jl:244. Most cases involving three or more sparse vector/matrix arguments hit the inference issue.
:)
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
The nanosoldier regression seems to be the usual
The PRs that make up the first several commits should be finished first; otherwise we're splitting review of them across multiple places.
# that incorporates the scalar arguments to broadcast (and, with #19667,
# is inferable, so the overall return type from broadcast is inferred),
# in some cases inference seems unable to determine the return type of
# direct calls to that closure. This issue causes variables in in both the
in in duplicated
Fixed on push. Thanks!
# in some cases inference seems unable to determine the return type of
# direct calls to that closure. This issue causes variables in in both the
# broadcast[!] entry points (fofzeros = f(_zeros_eltypes(args...)...)) and
# the driver routines (Cx in _map_zeropres! and _broadcast_zeroprs!) to have
zeroprs missing a second e?
Fixed on push. Thanks!
Rebased and replaced the first commit with the latest version of #19723. Best!
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
# capturescalars takes a function (f) and a tuple of mixed sparse vectors/matrices and
# broadcast scalar arguments (mixedargs), and returns a function (parevalf) and a reduced
# argument tuple (passedargstup) containing only the sparse vectors/matrices in mixedargs
# in their orginal order, and such that the result of broadcast(g, passedargstup...) is
Is g the same as parevalf? And is that par-eval or pare-val?
Yes, and par-eval as in partial evaluation. The former fixed and the latter clarified on push. Thanks!
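To make the partial-evaluation contract concrete (a toy sketch with my own example function and values, not the actual capturescalars implementation): the scalars are baked into parevalf as captured variables, and only the container arguments are passed on.

```julia
# Toy sketch of the capturescalars contract: given f and mixed scalar/container
# arguments mixedargs = (s1, A, s2, B), produce parevalf with the scalars
# captured, so that broadcast(parevalf, A, B) == broadcast(f, s1, A, s2, B).
f = (s1, x, s2, y) -> s1 * x + s2 * y
s1, s2 = 2.0, 3.0
parevalf = (x, y) -> f(s1, x, s2, y)   # partial evaluation: scalars captured

A, B = [1.0, 2.0], [3.0, 4.0]
broadcast(parevalf, A, B) == broadcast(f, s1, A, s2, B)  # identical results
```

With the scalars gone, parevalf and (A, B) can be handed to the existing all-sparse broadcast methods unchanged.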
…and sparse vectors/matrices.
Absent objections or requests for time, I plan to merge this Monday morning PST. Best!
@inline function broadcast!{N}(f, C::AbstractArray, A, Bs::Vararg{Any,N})
@inline broadcast!{N}(f, C::AbstractArray, A, Bs::Vararg{Any,N}) =
    broadcast!_c(f, containertype(C, A, Bs...), C, A, Bs...)
@inline function broadcast!_c{N}(f, ::Type, C::AbstractArray, A, Bs::Vararg{Any,N})
What do you think of broadcast_c! instead?
No preference; I can see arguments for each form. Thoughts? Thanks!
I like broadcast_c! better, but however you prefer is fine.
Changed broadcast!_c to broadcast_c! throughout. Thanks!
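As a rough sketch of the dispatch pattern being renamed here (identifiers below are mine; the real code dispatches on the result of containertype and routes sparse arguments to dedicated sparse kernels):

```julia
using SparseArrays

# Entry point computes a container trait from the arguments and forwards to a
# trait-specific method, mirroring the broadcast! -> broadcast_c! split above.
containertrait(args...) = any(a -> a isa AbstractSparseArray, args) ? Val(:sparse) : Val(:dense)
sketch_broadcast_c!(f, ::Val{:dense}, C, args...)  = (C .= f.(args...); C)
sketch_broadcast_c!(f, ::Val{:sparse}, C, args...) = (C .= f.(args...); C)  # a real sparse kernel would go here
sketch_broadcast!(f, C, args...) = sketch_broadcast_c!(f, containertrait(C, args...), C, args...)
```

The `!` placement question is purely naming: `broadcast_c!` keeps the mutation marker at the end of the whole name, matching Julia's usual convention for mutating functions.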
I'm somewhat hesitating between following that approach and implementing some kind of optimization like the one in this PR for broadcasting methods in general. For one, this PR's mechanism is a way to avoid problems with

For now I'd go with this solution anyway and would eventually revisit other options.
…rse vectors/matrices. Makes broadcast! dispatch on container type (as broadcast), and inject generic sparse broadcast! for the appropriate container type.
Thanks all!
Master is failing tests as of this pull request (particularly those introduced by this pull request). I conjecture that #19922 (merged after this pull request's last CI run) broke this pull request. Should have a fix shortly if the former is true. Best!
(Requires #19667, #19723, and #19690, which form the first four commits / most of the diff.)

This pull request extends sparse broadcast[!] to handle combinations of scalars and sparse vectors/matrices. The tl;dr good: It works, and with #19667/#19723 the overall return el/type is inferred consistently. The tl;dr not so good: Inference seems unhappy in another way in some cases, impacting performance.

The details: This PR's general approach is to: (1) capture the scalar arguments to broadcast[!] in a closure and simultaneously collect the sparse vector/matrix arguments; and then (2) pass that closure and the sparse vector/matrix arguments to existing broadcast[!] methods that accept solely sparse vector/matrix arguments.

With #19667/#19723, _return_type/_broadcast_eltype is inferable and yields the correct result eltype when called on the closure and collected sparse vector/matrix arguments described above. Hence the overall return el/type is correct. Unfortunately, in some cases inference seems unable to determine the return type of direct calls to that closure, in those cases causing allocation in both the broadcast[!] entry points and the underlying routines (_map_zeropres!, _broadcast_zeropres!) and lackluster performance. (Will try to construct an MRE at some point.)

This solution's general finickiness drives me to the conclusion that we should instead handle scalar arguments directly in the underlying routines. I do not know whether I will have time for another rewrite of the underlying routines before feature freeze. So I suggest we incorporate this (correct but performance-suboptimal) implementation before feature freeze, allowing another rewrite of the underlying routines for performance after the freeze. Best!

(Edit: Also, the closure construction allocates a bit; not sure why. Pointers much appreciated.)
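To make the fofzeros computation mentioned above concrete (the helper below is an illustrative stand-in for _zeros_eltypes, not its actual definition): the entry points evaluate f at a zero of each argument's eltype to learn the value that fully-unstored positions take.

```julia
# Sketch of the value-at-zero computation performed by the broadcast[!] entry
# points: fofzeros = f(_zeros_eltypes(args...)...).
zeros_of_eltypes(As...) = map(A -> zero(eltype(A)), As)  # hypothetical stand-in

f = (x, y) -> x + y + 1
fofzeros = f(zeros_of_eltypes([1.0, 2.0], [3, 4])...)   # evaluates f(0.0, 0)
```

If inference cannot type the direct call f(0.0, 0) through the scalar-capturing closure, fofzeros (and Cx in the driver loops) become non-concretely typed, which is the allocation/performance problem the post describes.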