Stabilize, optimize, and increase robustness of QuickSort #45222

LilithHafner · 2022-05-07T15:40:47Z

"Quicksort is unstable and attempts to stabilize it hurt performance" (citation needed).

I believe I have simultaneously stabilized and sped up quicksort, which should allow us to simplify sorting policy.

Instability comes from the partition algorithm. We currently use something like this:

while i < j
    while(v[i] < pivot) i += 1 end
    while(v[j] > pivot) j -= 1 end
    v[i], v[j] = v[j], v[i]
end

This is an efficient, in-place, but unstable partition algorithm.
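To make the instability concrete, here is a minimal runnable sketch of a Hoare-style in-place partition. It is not the code in base/sort.jl; the name `hoare_partition!`, the `by` keyword, and the bounds checks are assumptions added for illustration.

```julia
# Hypothetical helper for illustration only (not Base's implementation).
# Partitions v in place around pivot_key and returns the split index s such that
# every element of v[begin:s-1] has key <= pivot_key and every element of
# v[s:end] has key >= pivot_key.
function hoare_partition!(v::AbstractVector, pivot_key; by=identity)
    i, j = firstindex(v), lastindex(v)
    while true
        # Scan from the left for an element that is not less than the pivot.
        while i <= lastindex(v) && isless(by(v[i]), pivot_key)
            i += 1
        end
        # Scan from the right for an element that is not greater than the pivot.
        while j >= firstindex(v) && isless(pivot_key, by(v[j]))
            j -= 1
        end
        i >= j && break
        v[i], v[j] = v[j], v[i]   # long-distance swap: this is what breaks stability
        i += 1
        j -= 1
    end
    return i
end

v = [(2, 'a'), (2, 'b'), (1, 'c')]
split = hoare_partition!(v, 2; by=first)
# v is now [(1, 'c'), (2, 'b'), (2, 'a')]: the two elements with key 2 swapped order.
```

The swap moves (2, 'a') past (2, 'b'), so elements that compare equal do not keep their original relative order.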

If we allow ourselves scratch space to conduct the partition, we can use something like this instead:

for x in v
    if x < pivot
        t[i] = x
        i += 1
    else
        t[j] = x
        j -= 1
    end
end

This new approach is still unstable, but it is predictably unstable. Everything before the pivot is stable, and everything after it is reversed. After several nested partitions, we still only need to reverse half the elements one time to reintroduce stability.
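A minimal sketch of a single scratch-space partition, assuming a freshly allocated scratch vector `t` and an immediate reversal of the back half. The name `stable_partition!` and the `by` keyword are hypothetical, and this is not the PR's code: the PR defers the reversal across nested partitions, while this sketch reverses right away to keep the example short.

```julia
# Hypothetical helper for illustration only (not the PR's implementation).
# Moves elements with key < pivot_key to the front and all others to the back,
# preserving the original relative order inside each group.
# Returns the index of the first element whose key is not less than pivot_key.
function stable_partition!(v::AbstractVector, pivot_key; by=identity)
    t = similar(v)                       # scratch space, same axes as v
    i, j = firstindex(v), lastindex(v)
    for x in v
        if isless(by(x), pivot_key)
            t[i] = x                     # front group fills left to right: order kept
            i += 1
        else
            t[j] = x                     # back group fills right to left: order reversed
            j -= 1
        end
    end
    reverse!(t, i, lastindex(t))         # undo the reversal of the back group
    copyto!(v, t)
    return i
end

v = [(2, 'a'), (2, 'b'), (1, 'c')]
split = stable_partition!(v, 2; by=first)
# v is now [(1, 'c'), (2, 'a'), (2, 'b')]: the elements with key 2 keep their order.
```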

Benchmarks reveal that this approach is typically faster than the original, possibly because the order of comparisons and computations is known at the beginning of the partition rather than depending on the branches during the partition. In effect, the branches in this new algorithm have less impact on future code execution. I have not profiled this implementation at the CPU level and don't know for sure why it is a speedup.

Caveat 1: I benchmarked on a somewhat inconsistently noisy machine.

Caveat 2: This is sometimes a slowdown. See graphs.

Future work indicated by this PR includes:

- Move MergeSort to SortingAlgorithms.jl
- Now that every sorting algorithm in Base's sorting policy except InsertionSort uses scratch space, it becomes even more valuable to effectively pass scratch-space vectors around. For example, in this PR, fpsort! results in redundant scratch space allocations.
- We partition in quicksort and fpsort!. Generalizing partition! and factoring it out should result in substantial readability improvements in fpsort!, which, in turn, should facilitate extending optimized sorting support for unions with Missing to all vectors. We could also export partition!.
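As a rough sketch of what passing scratch space around can look like, the function below allocates one scratch vector at the top-level call and hands it to every recursive call. Everything here is an assumption for illustration: the name `scratch_quicksort!`, the midpoint pivot, and the `by` keyword are not Base APIs, and unlike the PR this version reverses the back half at every level rather than once.

```julia
# Hypothetical sketch (not Base's QuickSort): a stable quicksort that reuses a
# single scratch vector t for every partition instead of allocating per call.
function scratch_quicksort!(v::Vector, lo::Int=firstindex(v), hi::Int=lastindex(v),
                            t::Vector=similar(v); by=identity)
    lo >= hi && return v
    p = (lo + hi) >>> 1                  # midpoint pivot, chosen only for simplicity
    pivot = v[p]
    i, j = lo, hi
    for k in lo:hi
        k == p && continue
        x = v[k]
        # Ties before the pivot go left and ties after it go right,
        # so equal elements never cross the pivot.
        goes_left = k < p ? !isless(by(pivot), by(x)) : isless(by(x), by(pivot))
        if goes_left
            t[i] = x
            i += 1
        else
            t[j] = x
            j -= 1
        end
    end
    t[i] = pivot                         # i == j here: the single remaining slot
    reverse!(t, i + 1, hi)               # the back group was written in reverse
    copyto!(v, lo, t, lo, hi - lo + 1)
    scratch_quicksort!(v, lo, i - 1, t; by=by)
    scratch_quicksort!(v, i + 1, hi, t; by=by)
    return v
end

v = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd'), (2, 'e'), (0, 'f')]
scratch_quicksort!(v; by=first)
# v is now [(0, 'f'), (1, 'b'), (1, 'd'), (2, 'a'), (2, 'c'), (2, 'e')]
```

Because `t` is an ordinary positional argument, a caller that already owns a buffer of the right length could supply it and avoid the allocation entirely, which is the kind of reuse the scratch-space item above is asking for.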

Closes #11429
Closes #42713
Closes #32675

petvana · 2022-05-08T11:15:52Z

It seems like solving #11429 as a side-effect, but at the cost of losing the in-place property.


LilithHafner · 2022-05-08T17:16:39Z

It seems like solving #11429 as a side-effect

Good observation! I've added tests to confirm, and indeed it does.


LilithHafner · 2022-06-04T13:47:32Z

This depends on workspaces which depend on OffsetArrays. Hold pending availability of OffsetArrays in Sort.jl.

LilithHafner · 2022-06-07T11:01:28Z

We're proceeding with workspaces/buffers for sorting without OffsetArrays. We will use a similar approach here as in radix_sort. It will work but be a bit ugly and have a bit of runtime overhead for certain inputs with offset indexing.

mikmoore · 2022-06-14T14:30:28Z

stdlib/Random/src/Random.jl

+# Randomize quicksort pivot selection. This code is here because of bootstrapping:
+# we need to sort things before we load this standard library.
+# TODO move this into Sort.jl
+Base.Sort.select_pivot(lo::Integer, hi::Integer) = rand(lo:hi)


Would it be better to use something other than the global RNG? Perhaps introduce a dedicated stream for sorting operations? I certainly wouldn't anticipate that sort would affect the global RNG and it would take me a while to trace the source of the consumption to here. Technically, a user shouldn't care... but that isn't to say they won't.

I don't think we should try to make the promise that algorithms in Base/Stdlib won't generate random numbers using the global RNG. A ton of stuff needs random numbers (e.g. median, some matrix factorization algorithms (for pivoting), possibly thread scheduling). I think we shouldn't set the expectation that users should care about the global RNG state.

While it would be nice to not alter the global RNG state, I believe our "everything is task scheduling independent, even random numbers" guarantee is much more important than "sorting doesn't change global RNG state". To maintain that guarantee, we would need either a different dedicated stream for each task, which is far too expensive, or to branch a new stream from the global RNG without altering the global RNG's state, which could introduce very hard to anticipate correlations between the randomness of sorting and of other random number generation, akin to but much subtler (and less likely to come up) than #6573.

In general, I think everything should use the global RNG, and if someone wants a stream nothing else touches, then they are responsible for creating their own dedicated stream.

There is a warning at the top of https://docs.julialang.org/en/v1/stdlib/Random/ that is somewhat relevant, though it doesn't directly apply to this.
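For readers who want a stream that nothing else touches, a minimal sketch of creating a dedicated stream follows. It assumes Julia 1.7 or later, where `Xoshiro` is available in the Random stdlib; the seed value and variable names are arbitrary.

```julia
using Random

# A private stream, seeded independently of the global RNG.
rng = Xoshiro(2022)
first_draw = rand(rng, 1:1000, 5)

# Unrelated work that may or may not consume the global RNG
# (sorting is the case under discussion in this thread).
sort!(rand(10_000))

# Re-creating the private stream with the same seed reproduces the same draws,
# whatever happened to the global RNG in between.
rng_again = Xoshiro(2022)
second_draw = rand(rng_again, 1:1000, 5)
@assert second_draw == first_draw
```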

Beat me to it :) I totally agree.

This seems to have caused a bug in Zygote: FluxML/Zygote.jl#1351. Possibly this should just be a pure Zygote fix, but I thought it would be good to be aware that this has the potential to cause some bugs in cases where this function was assumed not to affect the RNG state.

its internal usage in various places is incorrect.

Do you have an example?

In 1.8 we also had pathological (and more easily constructed) quadratic cases

For example, the "sawtooth" ordering:

julia> n = 10^6;
julia> x = vcat(3n:4n,2n:3n,n:2n,0:n);
julia> @time sort!(shuffle!(x));
  0.477350 seconds
julia> x = vcat(3n:4n,2n:3n,n:2n,0:n);
julia> @time sort!(x);
134.471621 seconds
julia> VERSION
v"1.8.5"

I don't have a clean MWE, but there are two clunkier examples in #48230 (Zygote is one breakage, and the REPL is the other, which is a theoretical one since we probably get a dispatch to InsertionSort in practice).

(And I haven't traced where the precise internal usage is in the case of the REPL.)

But you could imagine it could have a lot of consequences in various non-barebones environments where sort! is used incautiously, e.g. maybe in a Pluto notebook, for whatever reason a sort! is run when processing a cell output (this is purely hypothetical). Ideally we'd want the same code with the same seed to run the same in these environments, which would be in danger if rand is introduced to sort! without carefully examining such cases.

The default rng should not be used for reproducible random streams.

It's very common practice to begin a piece of code with Random.seed!(...) to make it reproducible. Faulty a strategy as it may be, it would be confusing if calls to sort! made by non-user-facing code broke this. That's why I think those calls should eventually be changed to something which can be relied on not to affect the global RNG.

In any case, I don't want to debate this point too much here since I'm not pushing for this to affect 1.9.

This fixes a hang where string interpolation in "precompile(Tuple{typeof(show), $IO, $T})\n" uses sorting, which uses rand(lo:hi), which uses an iterated approach that runs forever because every random UInt64 is 0. TODO: find a better place for this initialization to prevent this bug from cropping up again; perhaps initialize the moment rand is defined (that may be hard).


LilithHafner · 2022-10-14T14:04:33Z

Bump <3

oscardssmith · 2022-10-14T14:22:01Z

stdlib/Random/src/Random.jl

@@ -434,4 +434,10 @@ true
 """
 seed!(rng::AbstractRNG, ::Nothing) = seed!(rng)

+# Randomize quicksort pivot selection. This code is here because of bootstrapping:
+# we need to sort things before we load this standard library.
+# TODO move this into Sort.jl


Now that the compiler's sort is separated, is this still necessary?

Yes. If someone filters out the Random stdlib, I'd like to use hash-based pivot selection rather than error trying to use rand.

If sorting were split into a small portion in base that works alone and a larger stdlib that pirates base's sorting to make it run a bit faster and depends on Random, then this code would be able to move to the sorting stdlib, so the TODO is still appropriate.

And if you mean can we remove that redefinition entirely, yes we can. But it would come at a performance penalty (likely negligible) and at the cost of allowing pathological inputs to be intentionally created.
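To illustrate what a hash-based fallback could look like, here is a minimal sketch of a deterministic pivot selector that does not depend on Random. The name `select_pivot_hashed` and the particular way the hashes are mixed are assumptions for illustration and are not claimed to match the code merged in this PR.

```julia
# Hypothetical deterministic pivot selection (illustrative only).
# Maps the pair (lo, hi) to an index in lo:hi using Base.hash instead of rand.
# Assumes lo <= hi.
function select_pivot_hashed(lo::Int, hi::Int)
    span = (hi - lo + 1) % UInt              # number of candidate indices
    offset = hash(lo, hash(hi)) % span       # value in 0:span-1
    return lo + (offset % Int)
end

p = select_pivot_hashed(10, 20)
@assert 10 <= p <= 20
@assert p == select_pivot_hashed(10, 20)     # same inputs give the same pivot
```

Because the choice depends only on `lo` and `hi`, an adversary who knows the function can in principle construct inputs that defeat it, which is the pathological-input cost described above.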

oscardssmith

Looks good!

StefanKarpinski · 2022-10-14T20:34:20Z

Now that every sorting algorithm in Base's sorting policy except InsertionSort uses scratch space, it becomes even more valuable to effectively pass scratch-space vectors around. For example, in this PR, fpsort! results in redundant scratch space allocations.

The way you explain this suggests to me that a better keyword for that PR would have been something like scratch since scratch space is the term you use to talk about it 😄

LilithHafner · 2022-10-25T01:55:30Z

cf. #45222 (comment)

I want to be clear that this algorithm change is not a strict improvement. It is and almost certainly will remain a performance regression in multiple meaningful cases. We will need to make a value judgment about whether the improvements (performance and otherwise) outweigh the regressions.

Stabilize and optimize QuickSort

6230587

petvana reviewed May 8, 2022


petvana reviewed May 8, 2022


LilithHafner force-pushed the stabilize-quicksort branch from 115f851 to cde1695 Compare May 8, 2022 17:32

LilithHafner changed the title ~~Stabilize and optimize QuickSort~~ Stabilize, optimize, and increase robustness of QuickSort May 8, 2022

Lilith Hafner added 2 commits May 10, 2022 11:01

test invalid lt to close JuliaLang#11429

3db4a31

Remove redundant reverse_view! thanks @petvana

3318c34

LilithHafner force-pushed the stabilize-quicksort branch from cde1695 to 3318c34 Compare May 10, 2022 16:14

Merge branch 'master' into stabilize-quicksort

1a30832

petvana reviewed May 11, 2022


Merge branch 'master' into stabilize-quicksort

0010aaf

LilithHafner marked this pull request as draft June 4, 2022 13:47

This was referenced Jun 4, 2022

Make OffsetArrays a stdlib #45585

Closed

Move sorting algorithms from base to stdlib (First try) #45584

Closed

LilithHafner force-pushed the stabilize-quicksort branch from ab83655 to 0010aaf Compare June 7, 2022 10:36

Randomize pivot selection

d002d88

fix whitespace

c241add

This was referenced Jun 8, 2022

Fix-ups for sorting workspace/buffer (#45330) #45570

Merged

Excessive using statements in Sort.jl #45654

Closed

mikmoore reviewed Jun 14, 2022


Lilith Hafner added 3 commits June 30, 2022 13:43

Merge branch 'master' into stabilize-quicksort

7b5cafa

style

93b80f6

LilithHafner added the performance Must go faster label Jul 4, 2022

LilithHafner added 2 commits July 3, 2022 22:07

fix doctests (1/2)

e70ae49

Merge branch 'master' into stabilize-quicksort

e8db300

petvana reviewed Oct 12, 2022


Optimize deterministic select_pivot

24fd62b

Co-authored-by: Petr Vana <petvana@centrum.cz>

LilithHafner mentioned this pull request Oct 12, 2022

Bitonic mergesort JuliaCollections/SortingAlgorithms.jl#62

Open

update radix sort heuristic because quicksort is faster and the primary competition

2272c28

oscardssmith reviewed Oct 14, 2022


oscardssmith approved these changes Oct 14, 2022


LilithHafner merged commit 35431bf into JuliaLang:master Oct 15, 2022

LilithHafner deleted the stabilize-quicksort branch October 15, 2022 02:18

LilithHafner mentioned this pull request Oct 15, 2022

Rename buffer to scratch in sorting #47172

Merged

odow mentioned this pull request Oct 17, 2022

Omit allocation test due to changes in sort on nightly jump-dev/MathOptInterface.jl#2018

Merged

LilithHafner mentioned this pull request Oct 17, 2022

Reduce the range of elements sorted by partialsort #47191

Merged

petvana mentioned this pull request Oct 23, 2022

Document the role of missing in PartialQuickSort and deprecate PartialQuickSort(::Integer) #47297

Closed

This was referenced Oct 24, 2022

Stop incorrectly documenting the default sorting algorithms #47303

Merged

PartialQuickSort needs a compat annotation #47304

Closed

LilithHafner added the regression Regression in behavior compared to a previous version label Oct 25, 2022

This was referenced Nov 1, 2022

Test problems on nightly jump-dev/MathOptInterface.jl#2017

Closed

[Utilities] Re-enable allocation test from #2018 jump-dev/MathOptInterface.jl#2034

Merged

LilithHafner mentioned this pull request Nov 8, 2022

Improve backwards compatibility in sorting #47489

Merged

petvana mentioned this pull request Nov 15, 2022

Doc: The default sorting alg. is stable from 1.9 #47579

Merged

This was referenced Dec 1, 2022

Regression in allocations of partialsort! #47766

Closed

Put back the old QuickSort and PartialQuickSort algorithms #47788

Merged

Revise sort.md #47789

Closed

LilithHafner mentioned this pull request Jan 8, 2023

sortperm segfaults with lt=(>=) #48172

Closed

This was referenced Jan 11, 2023

Different RNG used during first gradient call on Julia 1.9 FluxML/Zygote.jl#1351

Open

Default sort algorithm affects global RNG #48230

Closed

LilithHafner mentioned this pull request Jan 11, 2023

stop using rand(lo:hi) for QuickerSort pivot selection #48241

Merged

LilithHafner mentioned this pull request Nov 3, 2023

Add BracketedSort a new, faster algorithm for partialsort and friends #52006

Merged


LilithHafner commented May 7, 2022 (edited)

petvana commented May 8, 2022

LilithHafner commented May 8, 2022

LilithHafner commented Jun 4, 2022

LilithHafner commented Jun 7, 2022

mikmoore Jun 14, 2022

oscardssmith Jun 14, 2022

LilithHafner Jun 14, 2022

LilithHafner Jun 14, 2022

gaurav-arya Jan 11, 2023

LilithHafner Jan 11, 2023

gaurav-arya Jan 11, 2023 (edited)

gaurav-arya Jan 11, 2023

LilithHafner Jan 11, 2023

gaurav-arya Jan 11, 2023

LilithHafner commented Oct 14, 2022

oscardssmith Oct 14, 2022

LilithHafner Oct 15, 2022

LilithHafner Oct 15, 2022

LilithHafner Oct 15, 2022

oscardssmith left a comment

StefanKarpinski commented Oct 14, 2022

LilithHafner commented Oct 25, 2022
