Improve documentation of sort-related functions
In particular:

* document the `order` keyword in `sort!`
* list explicitly the required properties of `lt` in `sort!`
* clarify the sequence of "by" transformations if both `by` and `order` are given
* show default values in the signatures for `searchsorted` and related functions
* add `isunordered` to the manual (it's already exported)
knuesel committed Jan 23, 2023
1 parent 9f93b31 commit 7d3ff90
Showing 5 changed files with 90 additions and 56 deletions.
10 changes: 5 additions & 5 deletions base/operators.jl
@@ -154,15 +154,17 @@ Values that are normally unordered, such as `NaN`,
are ordered after regular values.
[`missing`](@ref) values are ordered last.
This is the default comparison used by [`sort`](@ref).
This is the default comparison used by [`sort!`](@ref).
# Implementation
Non-numeric types with a total order should implement this function.
Numeric types only need to implement it if they have special values such as `NaN`.
Types with a partial order should implement [`<`](@ref).
See the documentation on [Alternate orderings](@ref) for how to define alternate
See the documentation on [Alternate Orderings](@ref) for how to define alternate
ordering methods that can be used in sorting and related functions.
See also [`isequal`](@ref), [`isunordered`](@ref).
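
As a quick illustration of the ordering described above (the sample values are arbitrary and not part of the commit), the following sketch compares `isless` with `<` on `NaN` and `missing`:

```julia
# Arbitrary sample values chosen for illustration.
xs = [NaN, 1.0, missing, -Inf]

nan_after_regular = isless(1.0, NaN)      # NaN is ordered after regular values
missing_is_last   = isless(NaN, missing)  # missing is ordered last
nan_is_unordered  = isunordered(NaN)      # NaN is unordered with respect to `<`
lt_with_nan       = 1.0 < NaN             # `<` gives false for comparisons with NaN

# sort uses isless by default, so NaN and missing end up at the end.
sorted = sort(xs)
```

Note that `isunordered` requires a Julia version that exports it (1.7 or later, to my knowledge).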
# Examples
```jldoctest
julia> isless(1, 3)
@@ -1344,7 +1346,7 @@ corresponding position in `collection`. To get a vector indicating whether each
in `items` is in `collection`, wrap `collection` in a tuple or a `Ref` like this:
`in.(items, Ref(collection))` or `items .∈ Ref(collection)`.
See also: [`∉`](@ref).
See also: [`∉`](@ref), [`insorted`](@ref), [`contains`](@ref), [`occursin`](@ref), [`issubset`](@ref).
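
A small sketch of the broadcasting idiom mentioned above, using made-up data. Wrapping the collection in `Ref` or a one-element tuple makes broadcasting treat it as a scalar:

```julia
# Hypothetical data for illustration.
items = [1, 4]
collection = [1, 2, 3]

mask_ref   = in.(items, Ref(collection))  # one Bool per item
mask_infix = items .∈ Ref(collection)     # same thing with the infix operator
mask_tuple = items .∈ (collection,)       # a 1-tuple works as well
```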
# Examples
```jldoctest
@@ -1382,8 +1384,6 @@ julia> [1, 2] .∈ ([2, 3],)
0
1
```
See also: [`insorted`](@ref), [`contains`](@ref), [`occursin`](@ref), [`issubset`](@ref).
"""
in

131 changes: 82 additions & 49 deletions base/sort.jl
@@ -64,7 +64,7 @@ end
issorted(v, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)
Test whether a vector is in sorted order. The `lt`, `by` and `rev` keywords modify what
order is considered to be sorted just as they do for [`sort`](@ref).
order is considered to be sorted just as they do for [`sort!`](@ref).
# Examples
```jldoctest
@@ -94,14 +94,17 @@ maybeview(v, k) = view(v, k)
maybeview(v, k::Integer) = v[k]

"""
partialsort!(v, k; by=<transform>, lt=<comparison>, rev=false)
partialsort!(v, k; by=identity, lt=isless, rev=false)
Partially sort the vector `v` in place, according to the order specified by `by`, `lt` and
`rev` so that the value at index `k` (or range of adjacent values if `k` is a range) occurs
Partially sort the vector `v` in place so that the value at index `k` (or
range of adjacent values if `k` is a range) occurs
at the position where it would appear if the array were fully sorted. If `k` is a single
index, that value is returned; if `k` is a range, an array of values at those indices is
returned. Note that `partialsort!` may not fully sort the input array.
For the keyword arguments, see the documentation of [`sort!`](@ref).
# Examples
```jldoctest
julia> a = [1, 2, 4, 3, 4]
@@ -148,9 +151,9 @@ partialsort!(v::AbstractVector, k::Union{Integer,OrdinalRange};
partialsort!(v, k, ord(lt,by,rev,order))

"""
partialsort(v, k, by=<transform>, lt=<comparison>, rev=false)
partialsort(v, k, by=identity, lt=isless, rev=false)
Variant of [`partialsort!`](@ref) which copies `v` before partially sorting it, thereby returning the
Variant of [`partialsort!`](@ref) that copies `v` before partially sorting it, thereby returning the
same thing as `partialsort!` but leaving `v` unmodified.
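
To make the difference between the two variants concrete, here is a minimal sketch with an arbitrary input vector:

```julia
v = [3, 1, 2]

# partialsort works on a copy, so v is left untouched.
smallest = partialsort(v, 1)
v_after = copy(v)

# partialsort! rearranges its argument in place.
w = copy(v)
smallest_inplace = partialsort!(w, 1)
```

After this runs, `v` is still `[3, 1, 2]`, while `w[1]` holds the smallest element.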
"""
partialsort(v::AbstractVector, k::Union{Integer,OrdinalRange}; kws...) =
@@ -288,14 +291,16 @@ for s in [:searchsortedfirst, :searchsortedlast, :searchsorted]
end

"""
searchsorted(a, x; by=<transform>, lt=<comparison>, rev=false)
searchsorted(a, x; by=identity, lt=isless, rev=false)
Return the range of indices of `a` which compare as equal to `x` (using binary search)
according to the order specified by the `by`, `lt` and `rev` keywords, assuming that `a`
is already sorted in that order. Return an empty range located at the insertion point
if `a` does not contain values equal to `x`.
Return the range of indices of `a` that compare as equal to `x` (using binary
search), assuming that `a` is already sorted. Return an empty range located at
the insertion point if `a` does not contain values equal to `x`.
See also: [`insorted`](@ref), [`searchsortedfirst`](@ref), [`sort`](@ref), [`findall`](@ref).
The `by`, `lt` and `rev` keywords modify what order is assumed for the data,
as described in the [`sort!`](@ref) documentation.
See also: [`insorted`](@ref), [`searchsortedfirst`](@ref), [`sort!`](@ref), [`findall`](@ref).
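
The effect of the ordering keywords can be sketched as follows. The data are invented for illustration; the key point is that the keywords must describe the order in which `a` is actually sorted:

```julia
# Sorted in descending order, so rev=true must be passed.
desc = [7, 5, 5, 4, 2, 1]
hits = searchsorted(desc, 5; rev=true)     # indices of the two 5s
none = searchsorted(desc, 3; rev=true)     # empty range at the insertion point

# Sorted by length; `by` is applied to both the elements and the needle.
words = ["a", "bb", "cc", "ddd"]
same_length = searchsorted(words, "xy"; by=length)
```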
# Examples
```jldoctest
@@ -317,14 +322,17 @@ julia> searchsorted([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
""" searchsorted

"""
searchsortedfirst(a, x; by=<transform>, lt=<comparison>, rev=false)
searchsortedfirst(a, x; by=identity, lt=isless, rev=false)
Return the index of the first value in `a` greater than or equal to `x`, according to the
specified order. Return `lastindex(a) + 1` if `x` is greater than all values in `a`.
`a` is assumed to be sorted.
Return the index of the first value in `a` greater than or equal to `x`,
assuming that `a` is already sorted. Return `lastindex(a) + 1` if `x` is
greater than all values in `a`.
`insert!`ing `x` at this index will maintain sorted order.
The `by`, `lt` and `rev` keywords modify what order is assumed for the data,
as described in the [`sort!`](@ref) documentation.
See also: [`searchsortedlast`](@ref), [`searchsorted`](@ref), [`findfirst`](@ref).
# Examples
@@ -347,11 +355,12 @@ julia> searchsortedfirst([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
""" searchsortedfirst

"""
searchsortedlast(a, x; by=<transform>, lt=<comparison>, rev=false)
searchsortedlast(a, x; by=identity, lt=isless, rev=false)
Return the index of the last value in `a` less than or equal to `x`, according to the
specified order. Return `firstindex(a) - 1` if `x` is less than all values in `a`. `a` is
assumed to be sorted.
Return the index of the last value in `a` less than or equal to `x`, assuming
that `a` is already sorted. Return `firstindex(a) - 1` if `x` is less than all
values in `a`. The `by`, `lt` and `rev` keywords modify what order is assumed
for the data, as described in the [`sort!`](@ref) documentation.
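
A brief sketch contrasting `searchsortedfirst` and `searchsortedlast` on the same vector, including the out-of-range return values described above:

```julia
a = [1, 2, 4, 5, 5, 7]

first_5 = searchsortedfirst(a, 5)   # first index with a value >= 5
last_5  = searchsortedlast(a, 5)    # last index with a value <= 5

first_3 = searchsortedfirst(a, 3)   # 3 is absent: insertion point
last_3  = searchsortedlast(a, 3)    # index just before the insertion point

past_end     = searchsortedfirst(a, 9)  # lastindex(a) + 1
before_start = searchsortedlast(a, 0)   # firstindex(a) - 1

# Inserting at the searchsortedfirst index keeps the vector sorted.
b = insert!(copy(a), first_3, 3)
```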
# Examples
```jldoctest
@@ -373,12 +382,12 @@ julia> searchsortedlast([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
""" searchsortedlast

"""
insorted(x, a; by=<transform>, lt=<comparison>, rev=false) -> Bool
insorted(x, a; by=identity, lt=isless, rev=false) -> Bool
Determine whether an item `x` is in the sorted collection `a`, in the sense that
it is [`==`](@ref) to one of the values of the collection according to the order
specified by the `by`, `lt` and `rev` keywords, assuming that `a` is already
sorted in that order, see [`sort`](@ref) for the keywords.
it is [`==`](@ref) to one of the values of the collection. The `by`, `lt` and
`rev` keywords modify what order is assumed for the collection, as described in
the [`sort!`](@ref) documentation.
See also [`in`](@ref).
@@ -524,7 +533,7 @@ Base.size(v::WithoutMissingVector) = size(v.data)
send_to_end!(f::Function, v::AbstractVector; [lo, hi])
Send every element of `v` for which `f` returns `true` to the end of the vector and return
the index of the last element which for which `f` returns `false`.
the index of the last element for which `f` returns `false`.
`send_to_end!(f, v, lo, hi)` is equivalent to `send_to_end!(f, view(v, lo:hi))+lo-1`
@@ -724,8 +733,8 @@ Insertion sort traverses the collection one element at a time, inserting
each element into its correct, sorted position in the output vector.
Characteristics:
* *stable*: preserves the ordering of elements which compare equal
(e.g. "a" and "A" in a sort of letters which ignores case).
* *stable*: preserves the ordering of elements that compare equal
(e.g. "a" and "A" in a sort of letters that ignores case).
* *in-place* in memory.
* *quadratic performance* in the number of elements to be sorted:
it is well-suited to small collections but should not be used for large ones.
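
The stability property in the list above can be sketched with the case-insensitive example it mentions. The input is made up, and `InsertionSort` is selected explicitly through `alg`:

```julia
letters = ["b", "A", "a", "B"]

# "A"/"a" and "b"/"B" compare equal under by=lowercase; a stable sort
# keeps each pair in its original relative order.
stable = sort(letters; by=lowercase, alg=InsertionSort)
```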
@@ -965,8 +974,8 @@ is treated as the first or last index of the input, respectively.
`lo` and `hi` may be specified together as an `AbstractUnitRange`.
Characteristics:
* *stable*: preserves the ordering of elements which compare equal
(e.g. "a" and "A" in a sort of letters which ignores case).
* *stable*: preserves the ordering of elements that compare equal
(e.g. "a" and "A" in a sort of letters that ignores case).
* *not in-place* in memory.
* *divide-and-conquer*: sort strategy similar to [`QuickSort`](@ref).
* *linear runtime* if `length(lo:hi)` is constant
@@ -1242,7 +1251,7 @@ Otherwise, we dispatch to [`InsertionSort`](@ref) for inputs with `length <= 40`
perform a presorted check ([`CheckSorted`](@ref)).
We check for short inputs before performing the presorted check to avoid the overhead of the
check for small inputs. Because the alternate dispatch is to [`InseritonSort`](@ref) which
check for small inputs. Because the alternate dispatch is to [`InsertionSort`](@ref) which
has efficient `O(n)` runtime on presorted inputs, the check is not necessary for small
inputs.
@@ -1323,15 +1332,31 @@ defalg(v::AbstractArray{Union{}}) = DEFAULT_UNSTABLE # for method disambiguation
"""
sort!(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)
Sort the vector `v` in place. A stable algorithm is used by default. You can select a
specific algorithm to use via the `alg` keyword (see [Sorting Algorithms](@ref) for
available algorithms). The `by` keyword lets you provide a function that will be applied to
each element before comparison; the `lt` keyword allows providing a custom "less than"
function (note that for every `x` and `y`, only one of `lt(x,y)` and `lt(y,x)` can return
`true`); use `rev=true` to reverse the sorting order. These options are independent and can
be used together in all possible combinations: if both `by` and `lt` are specified, the `lt`
function is applied to the result of the `by` function; `rev=true` reverses whatever
ordering specified via the `by` and `lt` keywords.
Sort the vector `v` in place. A stable algorithm is used by default. A specific
algorithm can be selected via the `alg` keyword (see [Sorting Algorithms](@ref)
for available algorithms). Elements are first transformed by the function `by`
and then compared according to either the function `lt` or the ordering
`order`. Finally, the resulting order is reversed if `rev=true`.
The `lt` function should define a strict partial order, that is, it should be
- irreflexive: `lt(x, x)` always yields `false`,
- asymmetric: if `lt(x, y)` yields `true` then `lt(y, x)` yields `false`,
- transitive: `lt(x, y) && lt(y, z)` implies `lt(x, z)`.
For example `<` is a valid `lt` function but `≤` is not.
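
The three properties can be spot-checked on a finite sample. The helper below is hypothetical (it is not part of Base and not part of this commit); passing the check on a sample does not prove the property in general, but a failure shows that a function is unsuitable as `lt`:

```julia
# Hypothetical helper: test irreflexivity, asymmetry and transitivity
# of `lt` over every combination of elements of the sample `xs`.
function is_strict_partial_order(lt, xs)
    irreflexive = all(!lt(x, x) for x in xs)
    asymmetric  = all(!(lt(x, y) && lt(y, x)) for x in xs, y in xs)
    transitive  = all(!(lt(x, y) && lt(y, z)) || lt(x, z)
                      for x in xs, y in xs, z in xs)
    return irreflexive && asymmetric && transitive
end

ok_lt     = is_strict_partial_order(<, 1:4)    # `<` satisfies all three
ok_le     = is_strict_partial_order(<=, 1:4)   # `<=` fails irreflexivity
ok_isless = is_strict_partial_order(isless, [1.0, NaN, 2.0])
```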
Passing an `lt` other than `isless` along with an `order` other than
[`Base.Order.Forward`](@ref) or [`Base.Order.Reverse`](@ref) is not permitted;
otherwise, all options are independent and can be used together in all possible
combinations. Note that `order` can also include a "by" transformation, in
which case it is applied after that defined with the `by` keyword. For more
information on `order` values see the documentation on [Alternate
Orderings](@ref).
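
A rough sketch of how `by` and `order` interact, using arbitrary input. The last part assumes that the restriction on combining `lt` with a custom `order` is reported by throwing an exception, so the exception is only recorded rather than matched against a specific type:

```julia
using Base.Order: By

v = [0, 1, 2, 3]

# `by` is applied first, then the transformation carried by `order`.
combined = sort(v; by = x -> x - 2, order = By(abs))
explicit = sort(v; by = x -> abs(x - 2))

# Combining a custom `lt` with an `order` other than Forward/Reverse
# is described as not permitted; record whether the call throws.
threw = try
    sort(v; lt = >, order = By(abs))
    false
catch
    true
end
```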
See also [`sort`](@ref), [`sortperm`](@ref), [`sortslices`](@ref),
[`partialsort!`](@ref), [`partialsortperm`](@ref), [`issorted`](@ref),
[`searchsorted`](@ref), [`insorted`](@ref), [`Base.Order.ord`](@ref).
# Examples
```jldoctest
@@ -1358,6 +1383,13 @@ julia> v = [(1, "c"), (3, "a"), (2, "b")]; sort!(v, by = x -> x[2]); v
(3, "a")
(2, "b")
(1, "c")
julia> sort(0:3, by=x->x-2, order=Base.Order.By(abs)) # same as sort(0:3, by=x->abs(x-2))
4-element Vector{Int64}:
2
1
3
0
```
"""
function sort!(v::AbstractVector{T};
@@ -1398,15 +1430,15 @@ sort(v::AbstractVector; kws...) = sort!(copymutable(v); kws...)
## partialsortperm: the permutation to sort the first k elements of an array ##

"""
partialsortperm(v, k; by=<transform>, lt=<comparison>, rev=false)
partialsortperm(v, k; by=identity, lt=isless, rev=false)
Return a partial permutation `I` of the vector `v`, so that `v[I]` returns values of a fully
sorted version of `v` at index `k`. If `k` is a range, a vector of indices is returned; if
`k` is an integer, a single index is returned. The order is specified using the same
keywords as `sort!`. The permutation is stable, meaning that indices of equal elements
appear in ascending order.
Note that this function is equivalent to, but more efficient than, calling `sortperm(...)[k]`.
This function is equivalent to, but more efficient than, calling `sortperm(...)[k]`.
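
The stated equivalence with `sortperm(...)[k]` can be sketched on a small vector with a tie (the values are arbitrary):

```julia
v = [3, 1, 2, 1]

full   = sortperm(v)               # permutation that sorts all of v
part   = partialsortperm(v, 1:2)   # indices of the two smallest values
single = partialsortperm(v, 3)     # index of the third-smallest value
```

The two equal elements (at indices 2 and 4) appear in ascending index order, as the stability note above describes.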
# Examples
```jldoctest
@@ -1432,7 +1464,7 @@ partialsortperm(v::AbstractVector, k::Union{Integer,OrdinalRange}; kwargs...) =
partialsortperm!(similar(Vector{eltype(k)}, axes(v,1)), v, k; kwargs...)

"""
partialsortperm!(ix, v, k; by=<transform>, lt=<comparison>, rev=false)
partialsortperm!(ix, v, k; by=identity, lt=isless, rev=false)
Like [`partialsortperm`](@ref), but accepts a preallocated index vector `ix` the same size as
`v`, which is used to store (a permutation of) the indices of `v`.
@@ -1732,7 +1764,8 @@ end
sort!(A; dims::Integer, alg::Algorithm=defalg(A), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)
Sort the multidimensional array `A` along dimension `dims`.
See [`sort!`](@ref) for a description of possible keyword arguments.
See the vector version of [`sort!`](@ref) for a description of possible keyword
arguments.
To sort slices of an array, refer to [`sortslices`](@ref).
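
A minimal sketch of sorting along each dimension of a small matrix. The matrix is made up, and the in-place `dims` form assumes a Julia version that supports it:

```julia
A = [4 3; 1 2]

# dims=1 sorts each column independently.
B = copy(A)
sort!(B; dims=1)

# dims=2 sorts each row independently.
C = copy(A)
sort!(C; dims=2)
```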
@@ -1886,8 +1919,8 @@ algorithm. Partial quick sort returns the smallest `k` elements sorted from smal
to largest, finding them and sorting them using [`QuickSort`](@ref).
Characteristics:
* *not stable*: does not preserve the ordering of elements which
compare equal (e.g. "a" and "A" in a sort of letters which
* *not stable*: does not preserve the ordering of elements that
compare equal (e.g. "a" and "A" in a sort of letters that
ignores case).
* *in-place* in memory.
* *divide-and-conquer*: sort strategy similar to [`MergeSort`](@ref).
@@ -1903,8 +1936,8 @@ Indicate that a sorting function should use the quick sort
algorithm, which is *not* stable.
Characteristics:
* *not stable*: does not preserve the ordering of elements which
compare equal (e.g. "a" and "A" in a sort of letters which
* *not stable*: does not preserve the ordering of elements that
compare equal (e.g. "a" and "A" in a sort of letters that
ignores case).
* *in-place* in memory.
* *divide-and-conquer*: sort strategy similar to [`MergeSort`](@ref).
@@ -1922,8 +1955,8 @@ subcollection at each step, until the entire
collection has been recombined in sorted form.
Characteristics:
* *stable*: preserves the ordering of elements which compare
equal (e.g. "a" and "A" in a sort of letters which ignores
* *stable*: preserves the ordering of elements that compare
equal (e.g. "a" and "A" in a sort of letters that ignores
case).
* *not in-place* in memory.
* *divide-and-conquer* sort strategy.
1 change: 1 addition & 0 deletions doc/src/base/base.md
@@ -126,6 +126,7 @@ Core.:(===)
Core.isa
Base.isequal
Base.isless
Base.isunordered
Base.ifelse
Core.typeassert
Core.typeof
2 changes: 1 addition & 1 deletion doc/src/base/sort.md
@@ -203,7 +203,7 @@ Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = Inline
The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed
to be stable since Julia 1.9. Previous versions had unstable edge cases when sorting numeric arrays.

## Alternate orderings
## Alternate Orderings

By default, `sort` and related functions use [`isless`](@ref) to compare two
elements in order to determine which should come first. The
2 changes: 1 addition & 1 deletion doc/src/manual/missing.md
@@ -88,7 +88,7 @@ true
```

The [`isless`](@ref) operator is another exception: `missing` is considered
as greater than any other value. This operator is used by [`sort`](@ref),
as greater than any other value. This operator is used by [`sort!`](@ref),
which therefore places `missing` values after all other values:

```jldoctest
