unify reduce/reducedim empty case behaviors #55628

mbauman · 2024-08-29T14:33:53Z

Previously, Julia tried to guess at initial values for empty dimensional reductions in a slightly different way than whole-array reductions. This change unifies those behaviors, such that mapreduce_empty is called for empty dimensional reductions, just like it is called for empty whole-array reductions.
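For readers unfamiliar with the fallback mentioned above, here is a rough sketch of how it behaves when called directly. `Base.reduce_empty` and `Base.mapreduce_empty` are unexported internals, so treat the exact calls below as illustrative and subject to change between Julia versions.

```julia
# Internal, unexported Base functions; shown for illustration only.
z1 = Base.reduce_empty(+, Float64)           # identity element is known: zero(Float64)
z2 = Base.mapreduce_empty(abs2, +, Float64)  # abs2 applied to that identity element

# There is no way to know what `x -> pi` returns for an element that
# does not exist, so the generic fallback throws instead of guessing.
threw = try
    Base.mapreduce_empty(x -> pi, +, Float64)
    false
catch err
    err isa ArgumentError
end

@show z1 z2 threw
```

The point of the sketch is that a value comes back only for function/operator pairs with a known identity; everything else is an error rather than a guess.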

Beyond the unification (which is valuable in its own right), this change enables some downstream simplifications to dimensional reductions and is likely to be the "most breaking" public behavior in a refactoring targeting more correctness improvements (#55318).

It's a little convoluted to understand the exact motivation, utility, and net effect of this change. Here are two case studies, demonstrating the behaviors prior to this change:

julia> minimum(zeros(0,42))
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

julia> minimum(zeros(0,42), dims=1)
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

julia> minimum(zeros(0,1), dims=2)
0×1 Matrix{Float64}

julia> sum(x->pi, zeros(0,1))
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

julia> sum(x->pi, zeros(0,1), dims=1)
1×1 Matrix{Int64}:
 0

julia> sum(x->pi, zeros(0,1), dims=2)
0×1 Matrix{Int64}

This PR makes all six cases above errors. This is a useful and important change because it's precisely this guessing of eltype and initial array values that leads to the many correctness errors that #55318 fixes. It's also worth noting that the incremental widening approach in #55318 could support the empty-return case, but it would do so at the expense of type stability because the "reducing over an empty collection" error branch adds a possible Array{Union{}} return value.
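As the error message itself suggests, code that needs a result for empty inputs can pass `init` explicitly. A minimal sketch, assuming a Julia version where the reducers accept `init` together with `dims` (1.6 or later, as far as I know); the specific `init` values are just reasonable choices for these examples:

```julia
A = zeros(0, 42)

# With an explicit `init`, the reduction has a starting value and need not guess one.
m = minimum(A; dims=1, init=Inf)
@show size(m)          # (1, 42)
@show all(==(Inf), m)  # true

# The mapped-function case: `init` supplies the starting value for each output slot.
s = sum(x -> pi, zeros(0, 1); dims=1, init=0.0)
@show s
```

Choosing `init` is the caller's responsibility here: it should be a neutral element for the operation (such as `Inf` for `min`) so that non-empty reductions are unaffected.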

mbauman · 2024-08-29T14:34:35Z

Once CI passes (modulo a SparseArrays test), let's run Nanosoldier.

mbauman · 2024-08-29T17:22:42Z

@nanosoldier runtests()

nanosoldier · 2024-09-01T03:12:29Z

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

mbauman · 2024-09-03T14:37:40Z

OK, that went better than I expected. There are three root failures as far as I can see:

GMT v1.14.2 calls extrema(dest::Matrix{Float64}, dims=1) in its precompile workload. Interestingly, the latest release (v.17) no longer does this during precompilation, so the precompile failure only affects GeophysicalModelGenerator.jl, which somehow pins GMT to the old version. The call still causes test failures in later versions, though.
PeriodicGraphs calls maximum(denominator.(_invmat::Matrix{Rational{Int}}); dims=2) as part of its precompile workflow.
OnlinePortfolioSelection calls maximum(p[idx_under_zero, :], dims=2).
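One way such call sites could adapt, assuming the failing calls hit an empty selection and that an empty result is what the caller wants, is to pass a neutral `init`. The helper name `rowmax` below is hypothetical and not taken from any of these packages:

```julia
# Hypothetical helper: row-wise maximum that tolerates empty inputs.
# `typemin` is a neutral element for `max`, so non-empty results are unchanged.
rowmax(A::AbstractMatrix{<:Real}) = maximum(A; dims=2, init=typemin(eltype(A)))

full = rowmax([1 5; 7 2])    # 2×1 matrix holding 5 and 7
empty = rowmax(zeros(0, 3))  # 0×1 matrix, no error

@show vec(full) size(empty)
```

Whether `typemin(eltype(A))` is an acceptable result for rows with no elements depends on the package, so this is only a sketch of the pattern.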

mbauman · 2024-09-03T16:52:55Z

OK, that's fascinating. All of those examples would be fixed by #52004, which was only blocked by precisely the same sparse array failures that this PR faces. I'm not a huge fan of #52004, but perhaps there's a middle ground by just preserving the eltype for minimum/maximum/extrema for the empty return array case.

JeffBezanson · 2024-09-12T02:05:31Z

Triage is ok with making all of these errors. The result for minimum(zeros(0,1), dims=2) is arguably correct, but that only applies to functions that pick an element from the array rather than applying a more general function, so I see we may not be able to accommodate that.

mbauman added the breaking and fold labels Aug 29, 2024

mbauman added the needs pkgeval label Aug 29, 2024

mbauman mentioned this pull request Aug 29, 2024

WIP: a more principled take on dimensional reduction inits #55318

Open

mbauman added the triage label Sep 3, 2024

adienes mentioned this pull request Sep 11, 2024

Allow empty reductions for maximum(Unsigned) and compose #44702

Open

LilithHafner removed the triage label Sep 26, 2024
