-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor and document sorting dispatch #47383
Conversation
12d136b
to
16fc0a0
Compare
Nice. There seems to be a performance regression for julia> @btime sortperm(x) setup=(x=rand(1000));
29.040 μs (6 allocations: 23.86 KiB) => 57.107 μs (5 allocations: 15.92 KiB) (Btw, it seems possible to speed up |
Good catch. There was a typo that prevented proper dispatch. Fixing the typo resolves almost all of the regression. julia> @btime sortperm(x) setup=(x=rand(1000));
31.543 μs (6 allocations: 23.86 KiB) => 61.839 μs (5 allocations: 15.92 KiB) => 36.477 μs (9 allocations: 16.08 KiB) |
base/sort.jl
Outdated
Finally, if the input has length less than 80, we dispatch to [`InsertionSort`](@ref) and | ||
otherwise we dispatch to [`QuickSort`](@ref). | ||
""" | ||
const DEFAULT_STABLE = InitialOptimizations(IsUIntMappable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems quite hard to read (and modify). Would it be possible to simplify it for the reader, smth like as follows?
const DEFAULT_STABLE = begin
alg1 = QuickSort |> Small{80} |> ConsiderRadixSort |> ConsiderCountingSort |> ComputeExtrema |> CheckSorted |> Small{40}
IsUIntMappable(alg1, StableCheckSorted(QuickSort)) |> InitialOptimizations
end
Also the outputed algorithm is kind of unusefull for the user :-D
julia> Base.Sort.defalg([1,2])
Base.Sort.MissingOptimization{Base.Sort.BoolOptimization{Small{10, Base.Sort.InsertionSortAlg, Base.Sort.IEEEFloatOptimization{Base.Sort.IsUIntMappable{Small{40, Base.Sort.InsertionSortAlg, Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}, Base.Sort.StableCheckSorted{PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}(Base.Sort.BoolOptimization{Small{10, Base.Sort.InsertionSortAlg, Base.Sort.IEEEFloatOptimization{Base.Sort.IsUIntMappable{Small{40, Base.Sort.InsertionSortAlg, Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}, Base.Sort.StableCheckSorted{PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}(Small{10, Base.Sort.InsertionSortAlg, Base.Sort.IEEEFloatOptimization{Base.Sort.IsUIntMappable{Small{40, Base.Sort.InsertionSortAlg, Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}, Base.Sort.StableCheckSorted{PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}(Base.Sort.InsertionSortAlg(), Base.Sort.IEEEFloatOptimization{Base.Sort.IsUIntMappable{Small{40, Base.Sort.InsertionSortAlg, Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}, Base.Sort.StableCheckSorted{PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}(Base.Sort.IsUIntMappable{Small{40, Base.Sort.InsertionSortAlg, Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}, Base.Sort.StableCheckSorted{PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}(Small{40, Base.Sort.InsertionSortAlg, Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}}(Base.Sort.InsertionSortAlg(), Base.Sort.CheckSorted{Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}}(Base.Sort.ComputeExtrema{Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}}(Base.Sort.ConsiderCountingSort{Base.Sort.CountingSort, ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}}(Base.Sort.CountingSort(), ConsiderRadixSort{Base.Sort.RadixSort, Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}}(Base.Sort.RadixSort(), Small{80, Base.Sort.InsertionSortAlg, PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}(Base.Sort.InsertionSortAlg(), PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}(missing, missing, Base.Sort.InsertionSortAlg()))))))), Base.Sort.StableCheckSorted{PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}}(PartialQuickSort{Missing, Missing, Base.Sort.InsertionSortAlg}(missing, missing, Base.Sort.InsertionSortAlg())))))))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... that output is especially terrible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer not to reverse the order with |>
because the standard order reflects the order of operations. MissingOptimization(InsertionSort())
first tries to perform a missing optimization and then performs insertion sort.
I've reformatted the const declarations and added pretty printing:
const DEFAULT_STABLE = InitialOptimizations(
IsUIntMappable(
Small{40}(
CheckSorted(
ComputeExtrema(
ConsiderCountingSort(
ConsiderRadixSort(
Small{80}(
QuickSort)))))),
StableCheckSorted(
QuickSort)))
and
julia> Base.DEFAULT_STABLE
Base.Sort.MissingOptimization(
Base.Sort.BoolOptimization(
Base.Sort.Small{10}(
InsertionSort(),
Base.Sort.IEEEFloatOptimization(
Base.Sort.IsUIntMappable(
Base.Sort.Small{40}(
InsertionSort(),
Base.Sort.CheckSorted(
Base.Sort.ComputeExtrema(
Base.Sort.ConsiderCountingSort(
Base.Sort.CountingSort(),
Base.Sort.ConsiderRadixSort(
Base.Sort.RadixSort(),
Base.Sort.Small{80}(
InsertionSort(),
PartialQuickSort(missing, missing,
InsertionSort()))))))),
Base.Sort.StableCheckSorted(
PartialQuickSort(missing, missing,
InsertionSort())))))))
It's still fairly large, but I don't think there's any way of getting around that without either simplifying dispatch (possible a good idea for a future PR) or moving many optimization passes outside the scope of DEFAULT_STABLE
(and therefore not easily configured by a user)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reformatted the const declarations and added pretty printing:
Thanks, it's definitely better. Btw, this seems like a nice case for reversed <|
operator.
if _issorted(v, lo, hi, o) | ||
return v | ||
elseif _issorted(v, lo, hi, Lt((x, y) -> !lt(o, x, y))) | ||
# Reverse only if necessary. Using issorted(..., Reverse(o)) would violate stability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems possible to use issorted(..., Reverse(o))
and then implement _stablereverse!(v, lo, hi, o)
that would handle the stability such that a segment of equal elements is reversed back at the end. This all is O(n) but outside of this PR's scope.
Probably good with a PkgEval here before merging. |
3e2420e
to
07e93f9
Compare
…roduces regressions.)
FIXES UNEXPECTED ALLOCATIONS removes code that previously harbored bugs that slipped through the test suite
Fixes a few remaining unexpected allocations U can be statically computed from the type of v and order so there is no need. Further, U is infered as ::DataType rather than Type{U} which causes type instabilities.
it is invalid to cache lenm1 because lo and hi may be redefined and we have no cache invalidation system
fixes #47474 in this PR rather than separate to avoid dealing with the merge
07e93f9
to
f06de10
Compare
NFC rebase to clean up the history a bit and get up to date with master for nanosoldier. The final commit worries me a bit; I have no idea why we need to define all four defalg methods to avoid hanging on doctests.
|
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
Breaks SortingAlgorithms which has 2033 direct + indirect deps. |
Rerunning failed PkgEval now that SortingAlgorithms 3.2 has been released to work around compat bounds restricting to below the non-breaking 1.0 release. @nanosoldier |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
A total of 4 PkgEval failures remain, three of which have identified non-blocking causes and two of which seem unrelated but have no known cause. Package has test failures (3 packages): Recommenders v0.1.2: fail vs. ok - Error while sorting a SparseArray that previously produced a dense result. There is an open PR to fix this, but a maintainer has indicated that they prefer an error in this case, so I do not view this as a regression. SortingAlgorithms v1.1.0: fail vs. ok - Due to testing that HeapSort, and unstable sorting algorithm, preserves the order of input NaNs. We can loosen the test or adjust the hook into Julia's internals to maintain this property, but either way it is not a behavior that anyone downstream depends on (at least not in PkgEval). There were unidentified errors (1 packages): I'm going to investigate the final failure further, but unless I find a regression related to this PR or anyone objects, I plan to merge this soon. This is the diff since @KristoferC's "Probably good with a PkgEval here before merging." It contains added tests and bug fixes from PkgEval, simplifications, renaming buffer => scratch, and better scratch space handling to architecture to reduce allocations and support future improvements. |
Rerunning the package whose test failure confuses me: @nanosoldier |
Your package evaluation job has completed - no new issues were detected. A full report can be found here. |
I also couldn't reproduce that package's test failures locally, so this LGTM |
I expect this to fail; though it worked on the prior commit: @nanosoldier |
Your job failed. |
* create an internal `_sort!` function and use it (rename the existing `_sort!` to `__sort!`) * test for several of bugs that slipped through test suite * Give each sorting pass and DEFAULT_STABLE a docstring * add pretty printing for the new algorithms that are much more flexible than the old ones * fix unexpected allocations in Radix Sort fixes #47474 in this PR rather than separate to avoid dealing with the merge * support and test backwards compatibility with packages that depend in sorting internals * support 3-, 5-, and 6-argument sort! for backwards compatibility * overhall scratch space handling make _sort! return scratch space rather than sorted vector so that things like IEEEFloatOptimization can re-use the scratch space allocated on their first recursive call * test handling -0.0 in IEEEFloatOptimization * fix and test bug where countsort's correct overflow behavior triggers error due to unexpected promotion to UInt (cherry picked from commit cee0a04)
This PR broke |
* add test demonstrating overflow in countsort * fix overflow in countsort * remove unnecessary type annotations (fixes tests) This fixes the test failure because it allows for automatic conversion. The manual for implementing the AbstractArray interface also does not recomend a type signature for the value arg in setindex!. Co-authored-by: Lilith Hafner <Lilith.Hafner@gmail.com>
Did this end up getting any nanosoldier testing? |
No it didn't and yikes those regressions are bad. |
* add test demonstrating overflow in countsort * fix overflow in countsort * remove unnecessary type annotations (fixes tests) This fixes the test failure because it allows for automatic conversion. The manual for implementing the AbstractArray interface also does not recomend a type signature for the value arg in setindex!. Co-authored-by: Lilith Hafner <Lilith.Hafner@gmail.com> (cherry picked from commit 965bc7d)
Just looked into this and the reason we have regressions is that this PR makes We didn't catch these regression earlier because they only effect usage that explicitly specifies |
Before, sorting dispatch used an ad-hoc system. I won't describe it here.
In this PR, sorting dispatch is specified by a nested chain of algorithms. For example,
MissingOptimization(BoolOptimization(Small{10}(IEEEFloatOptimization(QuickSort))))
will attempt optimizations to handleeltypes
that are unions withMissing
and those that areBool
. Then it will dispatch to a small algorithm for vectors shorter than 10 elements, perform an optimization for eltypes that areIEEFloat
s, and finally, performQuickSort
.This is a major refactoring, and part of a larger plan specified here. It should make future development more pleasant, improve the interactions with third party sorting libraries (e.g. SortingAlgorithms.jl), and facilitate future removal of many of the optimizations from base to put them in a stdlib.
For the most part, I attempted to limit observable changes to dispatch but I did separate missing optimizations from IEEE floating point optimizations. I kept a fair bit of code re-use, but they are fundamentally separate optimizations, and separating them closes #27781.
Further, I made the missing optimization call downstream algorithms with an abstract vector with eltype that does not include
Missing
. This allows that optimization to be combined with other optimizations like that forBool
s and theUIntMappable
suite.One big difference between
sort!
and the_sort!
function that this PR defines is that_sort!
returns a scratch space (if any) rather than the sorted vector. Right now, that features is only lightly used, but it makes it possible to re-use scratch space allocated within a sorting method without pre-allocating beforehand. This is nice because it is often not obvious whether a sorting method will need scratch space at all. This fixes #47538, though it is likely not the only possible way to fix that issue.I ran several benchmarks, many of which report runtime improvements.
Benchmark results EDITED to keep up to date with the PR
Fixes #47721
Unrelatedly, fixes #47474, closes #47383