Should allequal use == instead of isequal? #49372
Comments
To me, it makes sense that |
I don't have a particular use case in mind. I just have a gut feeling that comparison with
Here's an example inspired by a question on Discourse. Suppose we have the following matrix:
m = [1 missing NaN 3 4
     missing missing NaN 3 5]
And then we want to call
function myequals(x, y)
    isnan(x) && isnan(y) && return true
    x == y
end
When I opened the issue and the PR, there were some objections to expanding the Base namespace, but I don't think anyone brought up the issue of ambiguity over the type of equality comparison to use. |
The way I’ve used it is for sanity-checking data, as a complement to |
First of all, it would be extremely weird if a function named
Next is probably a matter of opinion, but it would be nice not to pollute more and more fundamentally boolean function results with missings. For me, as someone who uses missings extremely rarely, it's already easy to forget and be surprised that
For those, presumably rare, cases when the difference actually matters, a slightly more verbose |
julia> all([true, missing])
missing
Once you buy in to three-valued logic at the language level, it's hard to justify not using three-valued logic. |
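To make the three-valued behavior concrete, here is a small sketch of how `all` and `any` in Base treat `missing`, plus one way to collapse the result back to a plain `Bool`. The choice of `false` as the fallback in `coalesce` is an assumption for illustration only; the right default depends on the caller.

```julia
# Three-valued logic in Base reductions: `missing` means "unknown".
r1 = all([true, missing])    # missing: the unknown entry could be false
r2 = all([false, missing])   # false: one false already decides the result
r3 = any([true, missing])    # true: one true already decides the result
r4 = any([false, missing])   # missing: the unknown entry could be true

# Collapse an "unknown" answer to a plain Bool.
# Assumption: treating "unknown" as false is acceptable for the caller.
strict = coalesce(all([true, missing]), false)
```

The same pattern applies to any `==`-based reduction whose result may be `missing`.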
I think this is a non-starter because |
The only, or at least the default, equality operator in Julia functions is most typically |
Would there be room for keyword arguments like our sorting functions have? I.e.,
allequal([(+0.0,1), (-0.0,2)]; by=first, eq=(==)) == true
allequal([(+0.0,1), (-0.0,2)]; by=first, eq=isequal) == false
I suppose this would also involve expanding |
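The keyword idea could look roughly like the sketch below. `allequal_kw` is a hypothetical name, and the `by` and `eq` keywords are the ones proposed in this comment, not an existing Base API. The sketch assumes a collection that can be iterated more than once; stateful iterators are not handled.

```julia
# Hypothetical helper, NOT Base.allequal: `by` and `eq` are the keywords
# proposed in the comment above, not an existing API.
function allequal_kw(itr; by=identity, eq=isequal)
    isempty(itr) && return true           # vacuously true for empty input
    ref = by(first(itr))                  # compare every element to the first
    return all(x -> eq(by(x), ref), itr)  # may return `missing` when eq is ==
end

a = allequal_kw([(+0.0, 1), (-0.0, 2)]; by=first, eq=(==))     # true
b = allequal_kw([(+0.0, 1), (-0.0, 2)]; by=first, eq=isequal)  # false
c = allequal_kw([1, missing]; eq=(==))                         # missing
```

Defaulting `eq` to `isequal` keeps the documented behavior, so three-valued results only appear when the caller opts in with `eq=(==)`.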
I'm fully aware that
Let me sharpen the matrix example above to make it more concrete. I think this is the use case that originally motivated me to make the
julia> df = DataFrame(a=[1, missing], b=[missing, missing], c=[NaN, NaN], d=[3, 3], e=[4, 5])
2×5 DataFrame
 Row │ a        b        c        d      e
     │ Int64?   Missing  Float64  Int64  Int64
─────┼─────────────────────────────────────────
   1 │       1  missing      NaN      3      4
   2 │ missing  missing      NaN      3      5
julia> allequal.(eachcol(df))
5-element BitVector:
0
1
1
1
0
If you're not paying attention, you might think that column
julia> allequal3VL(itr) = all(==(first(itr)), itr);
julia> allequal3VL.(eachcol(df))
5-element Vector{Union{Missing, Bool}}:
missing
missing
false
true
false
(Granted, in this scenario you probably also need to keep an eye out for columns that are all
That seems reasonable to me. There is already an open PR to add a predicate version, Are there any objections to adding an |
This seems like a large API for a small feature. Is |
No, it's not much better. But it's a slippery slope. Once you have I think I'm now of the opinion that we (I) shouldn't have added |
This might not be very helpful, but I like |
Ok, at least this issue has verified that some people prefer it the way it is now, which is good to know. |
When I worked on the PR to add allequal (#43353), I used isequal instead of ==, because unique and allunique use isequal. However, in retrospect that seems like the wrong choice. It seems to me that users looking for allequal functionality will generally want to compare values with ==. In particular, it is unfortunate that allequal does not follow three-valued logic. So, we currently have the following behavior, which seems undesirable to me:

I think for most use cases it would be better for those to return false, missing, and missing, respectively.

In the PR discussion, I mentioned the difference between using isequal and using ==, but nobody objected to using isequal. 😅

Unfortunately this would be a breaking change. I would almost consider the failure to propagate missingness a bug, but technically it's not a bug since the docstring does say that comparisons are made using isequal.

Would this qualify as one of those "technically breaking" changes that we could slip into a minor release?
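The difference at stake can be sketched on a few small inputs. The vectors below are illustrative and are not necessarily the examples from the original post. `allequal3VL` is the `==`-based definition that appears elsewhere in this thread, repeated here so the block is self-contained; it assumes a non-empty collection. `allequal` itself requires Julia 1.8 or later.

```julia
# ==-based variant from the thread (assumes a non-empty collection).
allequal3VL(itr) = all(==(first(itr)), itr)

# Element-level semantics that drive the difference:
e1 = isequal(NaN, NaN)          # true
e2 = (NaN == NaN)               # false
e3 = isequal(missing, missing)  # true
e4 = (missing == missing)       # missing

# Collection-level consequences:
c1 = allequal([NaN, NaN])               # true
c2 = allequal3VL([NaN, NaN])            # false
c3 = allequal([missing, missing])       # true
c4 = allequal3VL([missing, missing])    # missing
c5 = allequal([1, missing])             # false
c6 = allequal3VL([1, missing])          # missing
```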