Fix in(x, r::Range{<:Integer}) when x is not numeric #21728

nalimilan · 2017-05-06T14:20:38Z

Previously, checking that a string was in() a container would work for Array
but not for Range.

Another (possibly cleaner) approach would be to restrict the current definitions to x::Number, and introduce fallbacks returning false.
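For reference, the inconsistency described above can be checked directly. This is a minimal sketch that assumes a recent Julia 1.x release (the thread predates 1.0, and `Range` has since been renamed `AbstractRange`); it only exercises `in` and makes no claim about which methods ended up handling each call.

```julia
# Assumption: run on a recent Julia 1.x release.
# A non-numeric value is simply "not in" either container; nothing throws.
in_array = "a" in [1, 2, 3]
in_range = "a" in 1:3

println("\"a\" in [1, 2, 3] = ", in_array)
println("\"a\" in 1:3       = ", in_range)

# Numeric lookups on the range keep working.
println("2 in 1:3   = ", 2 in 1:3)
println("2.5 in 1:3 = ", 2.5 in 1:3)
```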

Sacha0

> Another (possibly cleaner) approach would be to restrict the current definitions to x::Number, and introduce fallbacks returning false.

The alternative you propose indeed sounds cleaner and clearer :). (Knowing nothing about applicable, I wonder whether the added applicable calls and associated branches impact performance or are optimized away?) Best!

nalimilan · 2017-05-06T16:18:08Z

> The alternative you propose indeed sounds cleaner and clearer :).

Yeah, maybe. I was hesitant because of the possibility of non-Number types supporting ranges, like Date. Not sure.

> (Knowing nothing about applicable, I wonder whether the added applicable calls and associated branches impact performance or are optimized away?) Best!

It should be resolved at compile time like method_exists, so there should be no runtime cost.

TotalVerb

I also like the alternative better than this. This seems brittle for whatever reason.

Sacha0 · 2017-05-07T16:29:18Z

base/range.jl

@@ -861,13 +861,15 @@ end
 median(r::Range{<:Real}) = mean(r)

 function in(x, r::Range)
+    applicable(-, x, first(r)) || return false
    n = step(r) == 0 ? 1 : round(Integer,(x-first(r))/step(r))+1


Tangentially, does this branch potentially introduce type instability?

Why? It doesn't define any variable, and makes the function essentially equivalent to return false when the condition is false.

The (preexisting) branch in the n = ... line, rather than the new applicable(... line :). To expand, down one path n = 1, whereas down the other n = round(Integer, .... If the round yields some non-Int Integer, n's type would depend on the branch? Best!

Mmm. You're right that when the condition is true, the rest of the body isn't optimized out as I would have expected. So n still exists and is inferred as Any; though the return value is correctly inferred as Bool.

More fundamentally, applicable/method_exists does not appear to be resolved at compile time. That's because #16422 hasn't been merged (I had forgotten about this). So I have no solution for now...

Sacha0 · 2017-05-07T16:33:14Z

> I was hesitant because of the possibility of non-Number types supporting ranges, like Date.

Agreed, tricky that. I suppose this method is another instance where the ability to dispatch on provision of the necessary arithmetic operations would be useful.

nalimilan · 2017-05-07T16:48:55Z

I think this illustrates the fact that the interfaces are not defined very clearly here. Ranges can be defined on any type which supports addition and subtraction. The ideal solution for this would be a trait.

I would be happy to use the cleaner Number approach, but that will force anybody willing to support ranges with a non-Number custom type to copy the very same definition of in, but with a different signature. Is that really what we want?

Sacha0 · 2017-05-07T17:15:53Z

> I would be happy to use the cleaner Number approach, but that will force anybody willing to support ranges with a non-Number custom type to copy the very same definition of in, but with a different signature. Is that really what we want?

Perhaps for now keep the applicable(-, ... in the first method definition (where traits would be ideal), and in the second method definition (presently with applicable(isinteger, ...) use dispatch? (In the second method definition I see no disadvantage to using dispatch, given that checking applicable(isinteger, ... should be equivalent to dispatching on x::Integer?)

nalimilan · 2017-05-07T17:19:44Z

Yes, for the second definition dispatch would work equally well. The first one is more tricky (especially if we can't get applicable to be resolved at compile time).

Sacha0 · 2017-05-07T17:29:30Z

> Yes, for the second definition dispatch would work equally well. The first one is more tricky (especially if we can't get applicable to be resolved at compile time).

In that case, perhaps for now nix the change to the first method definition, use dispatch in the second, merge the result, and revisit the first method definition when relevant capabilities advance? :)

JeffBezanson · 2017-05-09T20:25:22Z

How did this come up? We do just give errors in some cases like this, for example searchsortedfirst([1,2,3], "") is similar.

> applicable(isinteger, ... should be equivalent to dispatching on x::Integer

isinteger(2.0) is also true, so should probably dispatch on ::Number.

nalimilan · 2017-05-09T20:46:19Z

> How did this come up? We do just give errors in some cases like this, for example searchsortedfirst([1,2,3], "") is similar.

I discovered this when making a mistake (as often with corner cases like this). It's not a major issue, but I find it hard to justify the inconsistency with Array. A relatively credible use case would be this:

julia> Nullable() in [1, 2]
false

julia> Nullable() in 1:2
ERROR: MethodError: no method matching isinteger(::Nullable{Union{}})

This would become even more credible if we replace Nullable with Union{T, Null}, with == defined so that null in [1, 2, null] would be true (currently that's a NullException).

nalimilan · 2017-05-09T20:53:48Z

I've dropped the problematic change for now. Though I still would be interested in suggestions about how to fix this for the non-integer case.

JeffBezanson · 2017-05-09T20:58:52Z

Makes sense. Yes, it's a good question. Maybe we could dispatch as:

in(x::Union{Number,T}, r::Range{T}) where T

nalimilan · 2017-05-09T21:19:06Z

But that wouldn't change anything for "a" in 1:3, right?

JeffBezanson · 2017-05-09T21:28:31Z

If that definition doesn't match, we'd fall back to iterating over the range and comparing every element, or we could add a fallback that returns false.
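The fallback idea can be sketched with a stand-alone function. Everything here is an illustration under stated assumptions: `contains_value` is a hypothetical name (used so that `Base.in` is not extended), the code targets Julia 1.x where `Range` is spelled `AbstractRange`, and the fast path is restricted to `Real` rather than using the `Union{Number,T}` signature proposed above.

```julia
# Hypothetical name `contains_value`; not part of Base.
# Assumption: Julia 1.x, where `Range` is spelled `AbstractRange`.

# Fast path: only for real numbers, where the arithmetic below is known to exist.
function contains_value(x::Real, r::AbstractRange{<:Real})
    isempty(r) && return false
    iszero(step(r)) && return first(r) == x
    n = round(Integer, (x - first(r)) / step(r)) + 1
    return 1 <= n <= length(r) && r[n] == x
end

# Fallback: compare against every element, as a generic iterator method would.
contains_value(x, r::AbstractRange) = any(y -> y == x, r)

println(contains_value(3, 1:2:9))     # takes the fast path
println(contains_value("a", 1:2:9))   # takes the fallback
```

Because the second method is strictly less specific than the first, any argument the fast path does not accept lands on the element-by-element comparison instead of raising a MethodError.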

nalimilan · 2017-05-10T15:14:47Z

Clever. Indeed with #21256 the fallback based on any should be optimized at compile time if == isn't defined for a pair of immutable types (and falls back to ===).

Looking at this in more detail, it turns out we need to restrict the signature to Real, not Number, since isinteger(Complex(1, 0)) == true and yet < isn't supported. So I've added a specific method for Complex too. Also, the in(x::Union{Number,T}, r::Range{T}) where T trick doesn't work since -(::T, Number) is not always implemented, only -(::T, ::T) may be supported (e.g. for Date).

At least now the abstraction is correctly respected: the fallback is used for types for which we're not sure an optimized computation is possible given the interface.
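The two observations in this comment can be checked in isolation. The sketch below assumes Julia 1.x with the `Dates` standard library; `throws_method_error` is a hypothetical helper introduced only for this illustration.

```julia
using Dates  # standard library

# Hypothetical helper: report whether calling f throws a MethodError.
function throws_method_error(f)
    try
        f()
        return false
    catch err
        return err isa MethodError
    end
end

# isinteger is value-based, so it holds for values that are not Integer or even Real.
println(isinteger(2.0))
println(isinteger(Complex(1, 0)))

# Ordering is not defined for complex numbers, so a `<`-based check cannot be used.
println(throws_method_error(() -> Complex(1, 0) < 2))

# Date supports -(::Date, ::Date), but not subtraction of a plain number.
println(Date(2017, 5, 9) - Date(2017, 5, 1))
println(throws_method_error(() -> Date(2017, 5, 9) - 1))
```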

Sacha0 · 2017-05-10T15:37:05Z

base/range.jl

    n = step(r) == 0 ? 1 : round(Integer,(x-first(r))/step(r))+1
    n >= 1 && n <= length(r) && r[n] == x
 end

 in(x::Integer, r::AbstractUnitRange{<:Integer}) = (first(r) <= x) & (x <= last(r))
-in(x, r::Range{T}) where {T<:Integer} =
+in(x::Real, r::Range{T}) where {T<:Integer} =


Could this signature simplify to in(x::Real, r::Range{<:Integer})? (Whoops, missed the usage of T in the method body, though that usage could instead be an eltype.)

Yes, I kept it because T is used.

Sacha0 · 2017-05-10T15:39:53Z

base/range.jl

@@ -860,13 +860,19 @@ end

 median(r::Range{<:Real}) = mean(r)

-function in(x, r::Range)
+function in(x::Real, r::Range{<:Real})
+    n = step(r) == 0 ? 1 : round(Integer,(x-first(r))/step(r))+1


Perhaps create a helper function to avoid replication of the body of this / the below method?

Not sure, are these two lines worth the additional complexity?

A single implementation in a helper function strikes me as less complex :). Moreover, having a single implementation guards against changes being made in one place but not the other.

Given that the definitions are so close, I doubt it. Also it turns out it's harder than it seems, since their signatures are different: T doesn't need to be the same for both arguments with the Real method.

> Given that the definitions are so close, I doubt it.

Even in the relatively short time I've been reviewing actively, IIRC I've seen such happen at least once :).

> Also it turns out it's harder than it seems, since their signatures are different: T doesn't need to be the same for both arguments with the Real method.

Not certain I follow? Would e.g. the following not work?

in(x::Real, r::Range{<:Real}) = _therecanbeonlyone(x, r)
in(x::T, r::Range{<:T}) where {T} = _therecanbeonlyone(x, r)
function _therecanbeonlyone(x, r)
    n = step(r) == 0 ? 1 : round(Integer, (x - first(r))/step(r)) + 1
    n >= 1 && n <= length(r) && r[n] == x
end

Best! :)

Of course! I don't know why I wanted to use a loop with @eval...

Sacha0 · 2017-05-10T15:41:30Z

base/complex.jl

@@ -178,6 +178,10 @@ bswap(z::Complex) = Complex(bswap(real(z)), bswap(imag(z)))

 isequal(z::Complex, w::Complex) = isequal(real(z),real(w)) & isequal(imag(z),imag(w))

+in(x::Complex, r::Range{T}) where {T<:Integer} =


Could this signature simplify to in(x::Complex, r::Range{<:Integer})? (Please ignore this comment if T turns out necessary in the method body.)

Sacha0 · 2017-05-10T15:43:51Z

base/complex.jl

@@ -178,6 +178,10 @@ bswap(z::Complex) = Complex(bswap(real(z)), bswap(imag(z)))

 isequal(z::Complex, w::Complex) = isequal(real(z),real(w)) & isequal(imag(z),imag(w))

+in(x::Complex, r::Range{T}) where {T<:Integer} =
+    isinteger(x) && !isempty(r) && real(x) >= minimum(r) && real(x) <= maximum(r) &&
+        (mod(real(x), step(r)) - mod(first(r), step(r)) == 0)


The equivalent method below for x::Real converts x to type T prior to the mod. Should that be done here as well?

Good question. I'll add it back since that must have been added for a reason (maybe to prevent overflow), but I don't know.

Perhaps in case only mod(::T, ::T) (rather than also mod(::typeof(x), ::T)) exists?

Sacha0 · 2017-05-10T15:49:20Z

Could the separate methods for z::Complex and x::Real not be fused into a single method for x::Number that includes the real calls from the z::Complex method? For x::Real, those calls should be noops? Best!

nalimilan · 2017-05-10T15:57:32Z

> Could the separate methods for z::Complex and x::Real not be fused into a single method for x::Number that includes the real calls from the z::Complex method? For x::Real, those calls should be noops? Best!

Unfortunately, Complex isn't defined at that point in the bootstrap.

Sacha0 · 2017-05-10T16:29:07Z

> Unfortunately, Complex isn't defined at that point in the bootstrap.

Hm, pity that. DRYing those methods would be nice, as evidenced by the inadvertent conversion discrepancy creeping in above. (Is there a reasonable place to touch real prior to range.jl in bootstrap?)

nalimilan · 2017-05-10T16:36:24Z

base/range.jl

+end
+# This method needs to be defined separately since only -(::T, ::T) needs to be supported
+# to create a range, but not necessarily -(::T, ::Real)
+function in(x::T, r::Range{<:T}) where {T}


It turns out that method generates ambiguities with other methods defined in this file if I change <:T to T (as it should be). So we need to decide what choice is the most painful: requiring people to define this method themselves when they implement a custom non-Real type supporting ranges; or define this generic fallback here (current state of the PR), but require people to fix ambiguities when they need to define a custom method like this one.

I've changed <:T to T and it doesn't seem to generate ambiguities now, so let's go with that.

nalimilan · 2017-05-11T08:43:36Z

Actually I found a very simple way to DRY the Complex definition: define it as in(x::Complex, r::Range{<:Real}) = isreal(x) && real(x) in r. This is even better than the previous one, which was limited to integers.

So the only remaining issue is the ambiguities in(x::T, r::Range{T}) introduces.
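The one-line Complex definition quoted in this comment can be tried out as a stand-alone function. This is a sketch under assumptions: `complex_in_range` is a hypothetical name (so that `Base.in` is not extended), and the code uses the Julia 1.x spelling `AbstractRange` instead of `Range`.

```julia
# Hypothetical helper mirroring the one-line definition, written as a
# stand-alone function so that Base.in is not extended.
complex_in_range(x::Complex, r::AbstractRange{<:Real}) = isreal(x) && real(x) in r

println(complex_in_range(2 + 0im, 1:3))              # zero imaginary part
println(complex_in_range(2 + 1im, 1:3))              # non-zero imaginary part
println(complex_in_range(1.5 + 0.0im, 1.0:0.5:2.0))  # not limited to integer ranges
```

Delegating to the real-valued lookup is what lets the same definition cover non-integer ranges, which the earlier integer-only Complex method did not.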

Sacha0 · 2017-05-11T15:40:35Z

base/complex.jl

@@ -178,6 +178,8 @@ bswap(z::Complex) = Complex(bswap(real(z)), bswap(imag(z)))

 isequal(z::Complex, w::Complex) = isequal(real(z),real(w)) & isequal(imag(z),imag(w))

+in(x::Complex, r::Range{<:Real}) = isreal(x) && real(x) in r


Beautiful! :)

Sacha0 · 2017-05-11T15:44:52Z

base/range.jl

@@ -860,13 +860,18 @@ end

 median(r::Range{<:Real}) = mean(r)

-function in(x, r::Range)
+function _in_range(x, r::Range)
    n = step(r) == 0 ? 1 : round(Integer,(x-first(r))/step(r))+1


The tangential discussion re. the potential type instability in this line got buried above. Any thoughts re. addressing this type instability while you're performing surgery? :)

Ah, right. I'll fix that once we have sorted out the other issues since the discussion is already long. I wonder what's the best way of fixing this. Maybe compute the second operand unconditionally and call one on that? That would also avoid a branch if we use ifelse.

> Maybe compute the second operand unconditionally and call one on that?

I imagine the branch exists to avoid the potential division by zero (step(r)) in computing the second operand (and potential downstream issues)?

Perhaps continuing to branch on step(r) == 0, but eliminating n in the step(r) == 0 case would work well? For example, perhaps something along the lines of

function _in_range(x, r::Range)
    if step(r) == 0
        isempty(r) ? false : first(r) == x
    else
        n = round(Integer, (x - first(r)) / step(r)) + 1
        n >= 1 && n <= length(r) && r[n] == x
    end
end

Thoughts?

I've added a commit doing this.
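The two shapes discussed in this thread can be compared side by side. Both functions below are hypothetical stand-ins written for Julia 1.x (`AbstractRange` instead of `Range`), not the code that was merged; the zero-step range is built with `StepRangeLen(1, 0, 3)` purely for illustration.

```julia
# Both functions are hypothetical stand-ins for the helper discussed above.
# Assumption: Julia 1.x (`AbstractRange` instead of `Range`).

# Original shape: n is the Int literal 1 on one branch and whatever
# round(Integer, ...) returns on the other, so its type depends on the branch.
function in_range_single_n(x, r::AbstractRange)
    n = step(r) == 0 ? 1 : round(Integer, (x - first(r)) / step(r)) + 1
    n >= 1 && n <= length(r) && r[n] == x
end

# Suggested shape: n only exists on the non-zero-step branch.
function in_range_branch_first(x, r::AbstractRange)
    if step(r) == 0
        isempty(r) ? false : first(r) == x
    else
        n = round(Integer, (x - first(r)) / step(r)) + 1
        n >= 1 && n <= length(r) && r[n] == x
    end
end

constant = StepRangeLen(1, 0, 3)   # three copies of 1, step zero
for (x, r) in ((5, 1:2:9), (4, 1:2:9), (1, constant), (2, constant))
    println(in_range_single_n(x, r), " ", in_range_branch_first(x, r))
end
```

The restructured version also never divides by a zero step, which is the reason the branch existed in the first place.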

nalimilan · 2017-05-20T16:49:53Z

base/range.jl

+in(x::Real, r::Range{<:Real}) = _in_range(x, r)
+# This method needs to be defined separately since -(::T, ::T) can be implemented
+# even if -(::T, ::Real) is not
+in(x::T, r::Range{<:T}) where {T} = _in_range(x, r)


@JeffBezanson What do you think about using Range{<:T} rather than Range{T} just to prevent ambiguities here (see my previous comment)? This feels like a hack to me and I'd be inclined to remove that method even if that forces people implementing non-Real types which support ranges to define it manually.

nalimilan · 2017-06-15T07:16:26Z

@Sacha0 Good to go now?

EDIT: if somebody merges this for me, better not squash the commits since the second one is not really related to the PR.

Sacha0 · 2017-06-15T17:02:10Z

@nanosoldier runbenchmarks(ALL, vs = ":master")

Sacha0

lgtm modulo nanosoldier's signoff! :)

nanosoldier · 2017-06-15T20:01:02Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

nalimilan · 2017-06-15T20:15:26Z

The only regression is in broadcast, and it doesn't seem related (though it's hard to tell).

Sacha0 · 2017-06-15T20:33:24Z

Imagine so, and @nanosoldier runbenchmarks("broadcast" && "dotop" , vs = ":master") can verify :).

nanosoldier · 2017-06-15T21:24:39Z

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @jrevels

tkelman · 2017-06-16T08:12:31Z

base/range.jl


 in(x::Integer, r::AbstractUnitRange{<:Integer}) = (first(r) <= x) & (x <= last(r))
-in(x, r::Range{T}) where {T<:Integer} =


where is the fallback for in(x::Any, r::Range{<:Integer}) ?

That's the generic iterator fallback: in(x, itr) = any(y -> y == x, itr). With #21402, it will be optimized out at compile time when types cannot possibly be equal.

that's an open issue not a closed PR, so we don't have that yet? are any cases that would dispatch to this signature tracked in the benchmarks?

Indeed it's not merged yet, but that code path failed previously (which is the point of this PR), so I don't think performance matters.

@tkelman, does @nalimilan's response address your concerns? If so this PR seems in good shape? :)

a deprecation would be a bit strange but would at least make the change visible. let me take one more look at this.

If we really want to take this case into account, where isinteger is defined but the type isn't a subtype of Real, it would be possible to dispatch on the isinteger trait, no? But it appears overkill to me.

which trait? isinteger is value dependent

Well, let's say a runtime trait:

in(x, r::Range{T}) where {T<:Integer} = isinteger(x) ? _fast_real_method_below(x, r) : (x isa Real ? false : _slow_generic_iterator_method(x, r))

where _fast_real_method_below refers to in(x::Real, r::Range{T}) where {T<:Integer} defined below, where the isinteger test can be removed.
This would not work well for e.g. 1+0im, which is isinteger but not Real; then in _fast_real_method_below, the use of x can be replaced by Integer(x).

@tkelman are you fine with the state of this PR on this point?

rfourquet · 2017-07-17T08:29:24Z

base/irrationals.jl

@@ -104,6 +104,7 @@ end
 isfinite(::Irrational) = true
 isinteger(::Irrational) = false
 iszero(::Irrational) = false
+isinteger(::Irrational) = false


Already defined two lines above?

definitely redundant, should be removed

Right, it's been added in another PR since I opened this one. I've just removed it.

rfourquet · 2017-07-17T08:50:20Z

base/range.jl

 end
+in(x::Real, r::Range{<:Real}) = _in_range(x, r)
+# This method needs to be defined separately since -(::T, ::T) can be implemented


I understand that this makes the assumption that in this case, -(::T, ::T) at least returns a Real? as well as step Range(::T)?

The assumption is that -(::T, ::T) returns something which can be divided by the range's step, and the result be rounded to Integer. But that shouldn't be an issue since this method will only be called if a Range{T} has been constructed, which means these operations are supported (or should be). The problem with the current implementation is that arithmetic operations could be called on any type which isn't supposed to support them at all.

StefanKarpinski · 2017-07-18T21:25:16Z

We've got two approvals and green tests. Time to merge?

Previously, failures would happen when checking whether a non-real or irrational value was in a range, contrary to what happens with an Array. This was inconsistent with the fallback definition any(y -> y == x, itr). To fix this, restrict the signature of optimized methods to Real, which are the only types for which we can be certain the operations are supported.

n would be an Int when step(r) == 0, but possibly of another type when step(r) != 0.

nalimilan force-pushed the nl/in branch from de07903 to cd9e66b Compare May 6, 2017 14:21

Sacha0 reviewed May 6, 2017

TotalVerb reviewed May 7, 2017

Sacha0 reviewed May 7, 2017

kshyatt added the collections Data structures holding multiple items, e.g. sets label May 7, 2017

nalimilan force-pushed the nl/in branch from cd9e66b to 20a4464 Compare May 9, 2017 20:53

nalimilan changed the title ~~Fix in(x, r::Range) when x is not numeric~~ Fix in(x, r::Range{<:Integer}) when x is not numeric May 9, 2017

nalimilan force-pushed the nl/in branch from 20a4464 to 1ce9a31 Compare May 10, 2017 15:11

Sacha0 reviewed May 10, 2017

nalimilan commented May 10, 2017

Sacha0 mentioned this pull request May 10, 2017

Add isinteger with Irrational argument #21770

Merged

nalimilan force-pushed the nl/in branch from 1ce9a31 to 6b9d181 Compare May 11, 2017 08:43

Sacha0 reviewed May 11, 2017

nalimilan commented May 20, 2017

nalimilan force-pushed the nl/in branch 2 times, most recently from 19ef827 to 8bdeb8d Compare June 15, 2017 07:14

Sacha0 approved these changes Jun 15, 2017

tkelman reviewed Jun 16, 2017

rfourquet approved these changes Jul 17, 2017

nalimilan added 2 commits July 19, 2017 11:57

Fix type instability in _in_range

5d3cbf7

nalimilan force-pushed the nl/in branch from 8bdeb8d to 5d3cbf7 Compare July 19, 2017 10:05

StefanKarpinski merged commit 98f2726 into master Jul 21, 2017

StefanKarpinski deleted the nl/in branch July 21, 2017 19:56

nalimilan mentioned this pull request Oct 12, 2017

fix ambiguous method with Base.in(::CategoricalValue, ::Set) JuliaData/CategoricalArrays.jl#83

Merged
