Add `insorted` for `in` function in sorted arrays #37490
Thanks a lot! It would be nice to have such a function! This PR adds a lot of code though, maybe that can be reduced.
function insorted(x, coll; kw...)
    !isempty(searchsorted(coll, x; kw...))
end
|
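A minimal sketch of how the suggested one-liner behaves, including keyword forwarding to `searchsorted`. The name `insorted_sketch` is hypothetical and used only to avoid clashing with any existing definition; the inputs are illustrative.

```julia
# Hypothetical name; mirrors the one-liner suggested above.
# All keyword arguments (lt, by, rev, order) are forwarded to searchsorted.
insorted_sketch(x, coll; kw...) = !isempty(searchsorted(coll, x; kw...))

v = [1, 3, 3, 7]                              # sorted in the default (forward) order
insorted_sketch(3, v)                         # true: 3 occurs in v
insorted_sketch(4, v)                         # false: 4 does not occur in v

# The collection must be sorted consistently with the keywords that are passed.
insorted_sketch(7, [7, 3, 3, 1]; rev=true)    # true: reverse-sorted input
insorted_sketch(3, [-1, 2, -3, 4]; by=abs)    # true: sorted by absolute value
```

Because `searchsorted` returns the range of matching indices, an empty range means the item is absent, so the membership test costs one binary search.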
I'll report this possibility on benchmarks. It should be close because intuitively
To be honest I didn't check the source code of |
Awesome!
It depends on the collection, but usually either a collection does not implement
The main reason is that a code that uses
Yes they are. At least by the standard |
Julia 1.0.5:
using Base.Order

midpoint(lo::T, hi::T) where T<:Integer = lo + ((hi - lo) >>> 0x01)
midpoint(lo::Integer, hi::Integer) = midpoint(promote(lo, hi)...)

function _insortedfirst(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::Bool where T<:Integer
    if lt(o, v[hi], x)
        return false
    end
    u = T(1)
    lo = lo - u
    hi = hi + u
    @inbounds while lo < hi - u
        m = midpoint(lo, hi)
        if lt(o, v[m], x)
            lo = m
        else
            hi = m
        end
    end
    if v[hi] == x
        return true
    end
    return false
end

function _insortedlast(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::Bool where T<:Integer
    if lt(o, x, v[lo])
        return false
    end
    u = T(1)
    lo = lo - u
    hi = hi + u
    @inbounds while lo < hi - u
        m = midpoint(lo, hi)
        if lt(o, x, v[m])
            hi = m
        else
            lo = m
        end
    end
    if v[lo] == x
        return true
    end
    return false
end

function insorted(v::AbstractVector, x, ilo::T, ihi::T, o::Ordering)::Bool where T<:Integer
    if isempty(v)
        return false
    elseif lt(o, v[ihi], x)
        return false
    end
    u = T(1)
    lo = ilo - u
    hi = ihi + u
    @inbounds while lo < hi - u
        m = midpoint(lo, hi)
        if lt(o, v[m], x)
            lo = m
        elseif lt(o, x, v[m])
            hi = m
        else
            return _insortedfirst(v, x, max(lo,ilo), m, o) || _insortedlast(v, x, m, min(hi,ihi), o)
        end
    end
    if hi <= ihi && v[hi] == x
        return true
    end
    return false
end

function other_insorted(v::AbstractVector, x, ilo::T, ihi::T, o::Ordering)::Bool where T<:Integer
    r = searchsorted(v, x)
    return length(r) > 0
end

for s in [:insorted, :other_insorted]
    @eval begin
        $s(v::AbstractVector, x, o::Ordering) = (inds = axes(v, 1); $s(v,x,first(inds),last(inds),o))
        $s(v::AbstractVector, x;
           lt=isless, by=identity, rev::Union{Bool,Nothing}=nothing, order::Ordering=Forward) =
            $s(v,x,ord(lt,by,rev,order))
    end
end
using BenchmarkTools
julia> @benchmark insorted([1:5; 1:5; 1:5], 1, 6, 10, Forward)
BenchmarkTools.Trial:
memory estimate: 208 bytes
allocs estimate: 1
--------------
minimum time: 66.106 ns (0.00% GC)
median time: 67.544 ns (0.00% GC)
mean time: 80.966 ns (11.95% GC)
maximum time: 47.349 μs (99.72% GC)
--------------
samples: 10000
evals/sample: 979
julia> @benchmark other_insorted([1:5; 1:5; 1:5], 1, 6, 10, Forward)
BenchmarkTools.Trial:
memory estimate: 208 bytes
allocs estimate: 1
--------------
minimum time: 70.034 ns (0.00% GC)
median time: 70.964 ns (0.00% GC)
mean time: 84.005 ns (11.40% GC)
maximum time: 46.641 μs (99.75% GC)
--------------
samples: 10000
evals/sample: 976
julia> @benchmark insorted([1,2,3], 0)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 1
--------------
minimum time: 45.492 ns (0.00% GC)
median time: 46.430 ns (0.00% GC)
mean time: 58.798 ns (17.14% GC)
maximum time: 47.850 μs (99.82% GC)
--------------
samples: 10000
evals/sample: 990
julia> @benchmark other_insorted([1,2,3], 0)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 1
--------------
minimum time: 45.584 ns (0.00% GC)
median time: 46.561 ns (0.00% GC)
mean time: 59.128 ns (17.20% GC)
maximum time: 46.606 μs (99.75% GC)
--------------
samples: 10000
evals/sample: 990
julia> @benchmark insorted([1,2,3], 2)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 1
--------------
minimum time: 47.190 ns (0.00% GC)
median time: 48.155 ns (0.00% GC)
mean time: 62.074 ns (17.19% GC)
maximum time: 48.128 μs (99.82% GC)
--------------
samples: 10000
evals/sample: 988
julia> @benchmark other_insorted([1,2,3], 2)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 1
--------------
minimum time: 48.591 ns (0.00% GC)
median time: 49.597 ns (0.00% GC)
mean time: 62.978 ns (17.44% GC)
maximum time: 50.001 μs (99.83% GC)
--------------
samples: 10000
evals/sample: 988
julia> @benchmark insorted([1,2,3], 4)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 1
--------------
minimum time: 43.004 ns (0.00% GC)
median time: 43.998 ns (0.00% GC)
mean time: 57.017 ns (19.12% GC)
maximum time: 47.788 μs (99.85% GC)
--------------
samples: 10000
evals/sample: 991
julia> @benchmark other_insorted([1,2,3], 4)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 1
--------------
minimum time: 45.670 ns (0.00% GC)
median time: 46.598 ns (0.00% GC)
mean time: 59.793 ns (18.05% GC)
maximum time: 46.788 μs (99.80% GC)
--------------
samples: 10000
evals/sample: 990
Using a dedicated I checked, for ranges, |
I think this PR introduces a lot of code, not all of which is really needed. So I would approach it like this. Take a step back and replace all the code by just:
function insorted(x, coll; kw...)
    !isempty(searchsorted(coll, x; kw...))
end
Is it missing features? Is it slow in some cases? Is it buggy? |
Ok, here are some
julia> @benchmark insorted(1, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 12.659 ns (0.00% GC)
median time: 13.394 ns (0.00% GC)
mean time: 14.758 ns (0.00% GC)
maximum time: 57.431 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(1, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 2.641 ns (0.00% GC)
median time: 30.255 ns (0.00% GC)
mean time: 19.357 ns (0.00% GC)
maximum time: 68.061 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark insorted(100, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 10.492 ns (0.00% GC)
median time: 10.973 ns (0.00% GC)
mean time: 12.726 ns (0.00% GC)
maximum time: 43.886 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(100, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 34.232 ns (0.00% GC)
median time: 43.665 ns (0.00% GC)
mean time: 44.449 ns (0.00% GC)
maximum time: 101.706 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 994
julia> @benchmark insorted(30, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 11.148 ns (0.00% GC)
median time: 15.348 ns (0.00% GC)
mean time: 15.536 ns (0.00% GC)
maximum time: 37.629 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(30, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.726 ns (0.00% GC)
median time: 30.250 ns (0.00% GC)
mean time: 22.406 ns (0.00% GC)
maximum time: 57.277 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark insorted(60, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 11.010 ns (0.00% GC)
median time: 14.994 ns (0.00% GC)
mean time: 15.677 ns (0.00% GC)
maximum time: 43.850 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(60, coll) setup=(coll=sort(rand(1:100, 50)))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 11.078 ns (0.00% GC)
median time: 44.064 ns (0.00% GC)
mean time: 39.295 ns (0.00% GC)
maximum time: 77.879 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 998
For ranges it is slower:
julia> @benchmark insorted(1, coll) setup=(coll=rand(1:100):rand(1:100))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 9.176 ns (0.00% GC)
median time: 9.247 ns (0.00% GC)
mean time: 9.483 ns (0.00% GC)
maximum time: 24.069 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(1, coll) setup=(coll=rand(1:100):rand(1:100))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 2.208 ns (0.00% GC)
median time: 2.217 ns (0.00% GC)
mean time: 2.225 ns (0.00% GC)
maximum time: 12.624 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark insorted(100, coll) setup=(coll=rand(1:100):rand(1:100))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 8.752 ns (0.00% GC)
median time: 10.023 ns (0.00% GC)
mean time: 10.087 ns (0.00% GC)
maximum time: 38.033 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(100, coll) setup=(coll=rand(1:100):rand(1:100))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 2.208 ns (0.00% GC)
median time: 2.219 ns (0.00% GC)
mean time: 2.236 ns (0.00% GC)
maximum time: 34.658 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark insorted(50, coll) setup=(coll=rand(1:100):rand(1:100))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 8.745 ns (0.00% GC)
median time: 9.210 ns (0.00% GC)
mean time: 9.273 ns (0.00% GC)
maximum time: 37.059 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark in(50, coll) setup=(coll=rand(1:100):rand(1:100))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 2.207 ns (0.00% GC)
median time: 2.218 ns (0.00% GC)
mean time: 2.225 ns (0.00% GC)
maximum time: 30.371 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
I suggest to keep |
Thanks for taking a look at ranges. I think for arrays
"""
some docs....
"""
function insorted end
insorted(x, coll; kw...) = !isempty(searchsorted(coll, x; kw...))
insorted(x, coll::AbstractRange; kw...) = in(x, coll) |
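The two-method layout suggested above can be sketched as follows. The name `insorted_sketch` is hypothetical, and whether keyword arguments should simply be ignored for ranges is an assumption of this sketch, not something the discussion settles.

```julia
# Generic fallback: binary search through searchsorted.
insorted_sketch(x, coll; kw...) = !isempty(searchsorted(coll, x; kw...))

# Range specialization: `in` on a range is an arithmetic check rather than a search.
# Assumption: keyword arguments are accepted but ignored in this method.
insorted_sketch(x, coll::AbstractRange; kw...) = in(x, coll)

insorted_sketch(5, 1:2:9)            # true, dispatches to the range method
insorted_sketch(4, 1:2:9)            # false, 4 is skipped by the step of 2
insorted_sketch(5, collect(1:2:9))   # true, dispatches to the generic method
```

Dispatch on `AbstractRange` keeps the fast path from the range benchmarks above while vectors still get the binary search.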
I just pushed the last changes that roughly correspond to what you suggested 👍🏼 |
Thanks! This also needs an entry in NEWS.md.
Looks good to me. Thanks a lot @remi-garcia !
I just removed tests that called the "old" `insorted(x,v,lo,hi,o)` and fixed a typo. Should be working fine now! |
I don't get why the buildbot is so "mad" at me. Is that normal, or am I missing something I should fix? |
You can click on the "details" button behind the individual CI tests, and check the stack trace of error messages, just like in the normal REPL. Then you can start fixing bugs as usual. Some more style comments on the tests: for functions that return `Bool`s and sound like "a test" already (`insorted`), you don't need to compare to `true` or `false`, but can simply do `@test insorted(...)` and `@test !insorted(...)`, respectively.
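The style advice above could look like this in a test file. This is a sketch only: `insorted_sketch` is a hypothetical stand-in defined here so the snippet runs on its own, and the test values are made up.

```julia
using Test

# Hypothetical stand-in for the function under test.
insorted_sketch(x, coll; kw...) = !isempty(searchsorted(coll, x; kw...))

@testset "Bool-returning predicate style" begin
    v = [1, 2, 2, 5]

    # Preferred: the predicate already reads like a test.
    @test insorted_sketch(2, v)
    @test !insorted_sketch(3, v)

    # Equivalent but noisier: comparing against true/false explicitly.
    @test insorted_sketch(2, v) == true
    @test insorted_sketch(3, v) == false
end
```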
Thanks for the correction and suggestions. I just understood that I had to click on |
base/operators.jl
@@ -1063,7 +1063,8 @@ splat(f) = args->f(args...)
     in(x)

 Create a function that checks whether its argument is [`in`](@ref) `x`, i.e.
-a function equivalent to `y -> y in x`.
+a function equivalent to `y -> y in x`. See also [`insorted`](@ref) for the use
+in sorted collections.
-in sorted collections.
+with sorted collections.
?
One last comment, maybe a native speaker could quickly help, and then we merge this.
Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
Is there any precedent for this function in other systems? If so, I wonder what it has been called before? We should make sure to use a standard name if one exists. |
I found binary_search or bsearch. I wanted to be consistent with |
I like |
There has been some discussion in the past about creating a separate More generally, a simple
For a more pie-in-the-sky idea, it would be nice to have a way of tagging arrays with attributes (like "sorted"):
Related: @andyferris's AcceleratedArrays.jl |
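One way to picture the "tagged as sorted" idea is a thin wrapper type whose `in` uses binary search. This is only an illustrative sketch: the type name `SortedTagged` is hypothetical, it is not how AcceleratedArrays.jl works, and it assumes the wrapped vector is not mutated after construction.

```julia
# Hypothetical wrapper that records "this vector is sorted" in its type.
struct SortedTagged{T,V<:AbstractVector{T}} <: AbstractVector{T}
    data::V
    # The invariant is checked once, at construction time.
    function SortedTagged(data::AbstractVector{T}) where {T}
        issorted(data) || throw(ArgumentError("data must be sorted"))
        return new{T,typeof(data)}(data)
    end
end

# Minimal read-only AbstractVector interface, delegating to the wrapped data.
Base.size(s::SortedTagged) = size(s.data)
Base.getindex(s::SortedTagged, i::Int) = s.data[i]
Base.IndexStyle(::Type{<:SortedTagged}) = IndexLinear()

# Membership uses binary search because the tag guarantees sortedness.
# Note: searchsorted compares with isless/isequal semantics, which can differ
# from the == used by the generic `in` (for example with NaN).
Base.in(x, s::SortedTagged) = !isempty(searchsorted(s.data, x))

s = SortedTagged([1, 3, 5, 7])
5 in s    # true
4 in s    # false
```

The appeal of this design is that callers keep writing `x in coll`, and the faster algorithm is selected by dispatch; the cost is an extra type and the need to protect the sortedness invariant.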
@goretkin had basically the same suggestion in the issue fixed by this PR |
Closes #37442
The purpose is to add a function that determines whether an item is in the given sorted collection.
EDIT: Usage:
Implemented in a similar way as `searchsorted`. Left a TODO in `insorted(a::AbstractRange{<:Real}, x::Real, o::DirectOrdering)::Bool`