`1 < missing < 3` fails #31171

cstjean · 2019-02-25T15:42:05Z

julia> 2 < missing < 3
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] top-level scope at none:0

Should be missing, no?

The text was updated successfully, but these errors were encountered:

Keno · 2019-02-25T17:15:52Z

The syntax has short-circuiting semantics [1] (i.e. the upper bound doesn't get evaluated if the first condition fails. Of course there could be some special exception for missing, but it would certainly make the lowering significantly more complicated.

[1] i.e. a < b < c is equivalent to a < b && b < c.

cstjean · 2019-02-25T18:03:11Z

It explains this:

julia> missing < 1 < 3
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] top-level scope at none:0

julia> 3 > 1 > missing
missing

nalimilan · 2019-02-27T16:52:46Z

I agree it would be nice if this worked. For example a typical pattern would be ~~filter(x -> 30 < x < 60, age)~~ (or something similar on data frame rows) EDIT: this fails in the presence of missing values, it only makes sense with a special function which would skip missing values as in SQL or R.

But we probably don't want to special-case missing, so we would have to use & instead of && in all cases, which isn't possible before 2.0.

cstjean · 2019-02-28T09:11:29Z

FWIW this already works:

julia> 2 .< missing .< 3
missing

ararslan · 2019-03-07T21:57:35Z

Triage discussed this for quite a while, and none of the options for implementing this are that appealing. We did not come to a conclusion in this discussion. Some of the options discussed were:

Special-case missing in the front-end (not really a good option)
Use & (integers have this return an integer, which is a problem)
Do nothing

Speaking on behalf of myself (not of triage), I think I'm beginning to favor doing nothing here. This may be one of the cases where users should be wary of missing values and use explicit ismissing checks as appropriate. It's not an ideal solution, but it's explicit.

Keno · 2019-03-07T22:01:45Z

I had proposed the following lowering of a < b < c, which some people liked, but didn't get much love otherwise.

#1 = a
#2 = b
r = #1 < #2
if r === true
     r = #2 < c
else
     r = &(r, true)
end
return r

mbauman · 2019-03-07T22:25:03Z

I didn't quite catch it on the call; why is the last r = &(r, true) needed? Isn't the fact that < should really only return Bool (or missing-like singletons) enough?

JeffBezanson · 2019-03-07T22:28:18Z

It at least requires the thing you return to support &, so it's similar to switching the lowering to use & except it's still lazy.

mbauman · 2019-03-07T22:42:40Z

I get that, but it feels strange to me since we don't call & in the other branch. If we're going to do anything at all here I'd favor the simpler:

#1 = a
#2 = b
r = #1 < #2
if r === true
     r = #2 < c
end
return r

That said, I'm not wholly convinced we actually need to do something here. @nalimilan's filtering example would be compelling, except that filter wouldn't even work if its function returns missing (not even DataFrames' specialization). Typically if I do this as a scalar operation, I'll be branching on the result in short order. If I just want to store the result, I think it'd be much more common to just broadcast — which lowers to (a .< b) .& (b .< c).

ararslan · 2019-03-07T22:53:12Z

If I just want to store the result, I think it'd be much more common to just broadcast

Not sure what you mean, could you elaborate?

Keno · 2019-03-07T22:55:22Z

I get that, but it feels strange to me since we don't call & in the other branch

Why? We force the LHS to return a boolean now, but not the RHS

mbauman · 2019-03-07T23:13:43Z

If I just want to store the result, I think it'd be much more common to just broadcast

Not sure what you mean, could you elaborate?

There are two ways of using the result from a < b < c.

You can use it to make a decision. If we propagated a missing out of it, it would throw an error as soon as you tried to branch on if missing. Implementing this would just move the error one step down the road in an if a < b < c expression.
You can store the result or do some more computations on it later down the road (and don't branch on it). In my mind, this is much more common to be done on many values at once — and thus you'd typically just use broadcast.

Why? We force the LHS to return a boolean now, but not the RHS

It just gets back to my "what's the point" question. If it's to emulate & semantics, then it seems like we should actually call & on both branches. If it's to ensure we get a logicky value back from <, then that seems like the kind of thing we tend to say "don't do that" instead of adding complications like this — particularly complications in lowering.

cstjean · 2019-03-07T23:18:12Z

For the sake of concreteness: IntervalSets.jl didn't support missing, and I found out that their methods:

in(v, I::TypedEndpointsInterval{:closed,:closed}) = leftendpoint(I) ≤ v ≤ rightendpoint(I)
in(v, I::TypedEndpointsInterval{:open,:open}) = leftendpoint(I) < v < rightendpoint(I)
in(v, I::TypedEndpointsInterval{:closed,:open}) = leftendpoint(I) ≤ v < rightendpoint(I)
in(v, I::TypedEndpointsInterval{:open,:closed}) = leftendpoint(I) < v ≤ rightendpoint(I)

would have worked fine already if this issue was fixed, but I had to add other methods to make it work:

in(::Missing, I::TypedEndpointsInterval{:closed,:closed}) = !isempty(I) && missing
in(::Missing, I::TypedEndpointsInterval{:open,:open}) = !isempty(I) && missing
in(::Missing, I::TypedEndpointsInterval{:closed,:open}) = !isempty(I) && missing
in(::Missing, I::TypedEndpointsInterval{:open,:closed}) = !isempty(I) && missing

It's not the end of the world of course. It does raise the issue of whether 3 < missing < 1 should be missing, or false.

nalimilan · 2019-03-08T09:14:12Z

Sorry, my example with filter was indeed incorrect. I had a SQL- or R-like behavior in mind, where missing values are skipped, and which we should probably support for DataFrames (e.g. via an argument -- I can't find the place where we discussed this before). @cstjean's example is much better since it's not hypothetical.

StefanKarpinski · 2019-03-08T17:26:21Z

It does raise the issue of whether 3 < missing < 1 should be missing, or false.

This is a really good point. If we're going to go out of our way to handle missing cleverly, should we try to handle this? Is that even a reasonable thing to try to do?

nalimilan · 2019-03-11T14:05:51Z

I can't imagine returning anything other than missing, as that's consistent with what we do everywhere else. Incidentally that seems to be what @cstjean does in his example.

mbauman · 2019-03-11T14:10:12Z

Nope, he elected to return false in the case akin to 3 < missing < 1 (isempty(I)) — and I agree with that choice.

nalimilan · 2019-03-11T14:19:45Z

Ah, my bad, I hadn't realized the order of 3 and 1 was the question at hand here. Then I'd say either missing or false are reasonable. false would be clever, and consistent with the fact that we use all available information when computing the return value in three-valued logic (e.g. false & missing === false). But it would require a more complex lowering.

cstjean · 2019-03-11T19:42:00Z

If we give up on the short-circuiting behaviour, couldn't a < b <= c lower to something like compare(a, <, b, <=, c), with compare being a generated function?

oxinabox · 2019-03-11T19:50:00Z

3 < missing < 1 evaluting to false is similar to the interesting cases from #31066 (comment)

@test clamp(1, 52, missing) == 52
@test clamp(1000, missing, 101) == 101

ararslan added the missing data Base.missing and related functionality label Feb 25, 2019

nalimilan added the triage This should be discussed on a triage call label Feb 27, 2019

Keno removed the triage This should be discussed on a triage call label Feb 27, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`1 < missing < 3` fails #31171

`1 < missing < 3` fails #31171

cstjean commented Feb 25, 2019

Keno commented Feb 25, 2019

cstjean commented Feb 25, 2019

nalimilan commented Feb 27, 2019 •

edited

Loading

cstjean commented Feb 28, 2019

ararslan commented Mar 7, 2019

Keno commented Mar 7, 2019

mbauman commented Mar 7, 2019

JeffBezanson commented Mar 7, 2019

mbauman commented Mar 7, 2019

ararslan commented Mar 7, 2019

Keno commented Mar 7, 2019

mbauman commented Mar 7, 2019 •

edited

Loading

cstjean commented Mar 7, 2019

nalimilan commented Mar 8, 2019

StefanKarpinski commented Mar 8, 2019

nalimilan commented Mar 11, 2019

mbauman commented Mar 11, 2019 •

edited

Loading

nalimilan commented Mar 11, 2019

cstjean commented Mar 11, 2019

oxinabox commented Mar 11, 2019

1 < missing < 3 fails #31171

1 < missing < 3 fails #31171

Comments

cstjean commented Feb 25, 2019

Keno commented Feb 25, 2019

cstjean commented Feb 25, 2019

nalimilan commented Feb 27, 2019 • edited Loading

cstjean commented Feb 28, 2019

ararslan commented Mar 7, 2019

Keno commented Mar 7, 2019

mbauman commented Mar 7, 2019

JeffBezanson commented Mar 7, 2019

mbauman commented Mar 7, 2019

ararslan commented Mar 7, 2019

Keno commented Mar 7, 2019

mbauman commented Mar 7, 2019 • edited Loading

cstjean commented Mar 7, 2019

nalimilan commented Mar 8, 2019

StefanKarpinski commented Mar 8, 2019

nalimilan commented Mar 11, 2019

mbauman commented Mar 11, 2019 • edited Loading

nalimilan commented Mar 11, 2019

cstjean commented Mar 11, 2019

oxinabox commented Mar 11, 2019

`1 < missing < 3` fails #31171

`1 < missing < 3` fails #31171

nalimilan commented Feb 27, 2019 •

edited

Loading

mbauman commented Mar 7, 2019 •

edited

Loading

mbauman commented Mar 11, 2019 •

edited

Loading