Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unify reduce/reducedim empty case behaviors #55628

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mbauman
Copy link
Member

@mbauman mbauman commented Aug 29, 2024

Previously, Julia tried to guess at initial values for empty dimensional reductions in a slightly different way than whole-array reductions. This change unifies those behaviors, such that mapreduce_empty is called for empty dimensional reductions, just like it is called for empty whole-array reductions.

Beyond the unification (which is valuable in its own right), this change enables some downstream simplifications to dimensional reductions and is likely to be the "most breaking" public behavior in a refactoring targeting more correctness improvements (#55318).

It's a little convoluted to understand the exact motivation, utility, and net effect of this change. Here are two case studies, demonstrating the behaviors prior to this change:

julia> minimum(zeros(0,42))
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

julia> minimum(zeros(0,42), dims=1)
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

julia> minimum(zeros(0,1), dims=2)
0×1 Matrix{Float64}
julia> sum(x->pi, zeros(0,1))
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

julia> sum(x->pi, zeros(0,1), dims=1)
1×1 Matrix{Int64}:
 0

julia> sum(x->pi, zeros(0,1), dims=2)
0×1 Matrix{Int64}

This PR makes all six cases above errors. This is a useful and important change because it's precisely this guessing of eltype and initial array values that lead to the many correctness errors that #55318 fixes. It's also worth noting that the incremental widening approach in #55318 could support the empty-return case, but it would do so at the expense of type stability because the "reducing over an empty collection" error branch adds a possible Array{Union{}} return value.

Previously, Julia tried to guess at initial values for empty dimensional reductions in a slightly different way than whole-array reductions. This change unifies those behaviors, such that `mapreduce_empty` is called for empty dimensional reductions, just like it is called for empty whole-array reductions.

Beyond the unification (which is valuable in its own right), this change enables some downstream simplifications to dimensional reductions and is likely to be the "most breaking" public behavior in a refactoring targeting more correctness improvements.
@mbauman mbauman added breaking This change will break code fold sum, maximum, reduce, foldl, etc. labels Aug 29, 2024
@mbauman
Copy link
Member Author

mbauman commented Aug 29, 2024

Once CI passes (modulo a SparseArrays test), let's run Nanosoldier.

@mbauman mbauman added the needs pkgeval Tests for all registered packages should be run with this change label Aug 29, 2024
@mbauman
Copy link
Member Author

mbauman commented Aug 29, 2024

@nanosoldier runtests()

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@mbauman
Copy link
Member Author

mbauman commented Sep 3, 2024

OK, that went better than I expected. There are three root failures as far as I can see:

@mbauman
Copy link
Member Author

mbauman commented Sep 3, 2024

OK, that's fascinating. All of those examples would be fixed by #52004, which was only blocked by precisely the same sparse array failures that this PR faces. I'm not a huge fan of #52004, but perhaps there's a middle ground by just preserving the eltype for minimum/maximum/extrema for the empty return array case.

@mbauman mbauman added the triage This should be discussed on a triage call label Sep 3, 2024
@JeffBezanson
Copy link
Member

Triage is ok with making all of these errors. The result for minimum(zeros(0,1), dims=2) is arguably correct, but that only applies to functions that pick an element from the array rather than applying a more general function, so I see we may not be able to accommodate that.

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code fold sum, maximum, reduce, foldl, etc. needs pkgeval Tests for all registered packages should be run with this change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants