Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

findfirst on dictionaries returns ambiguous result #29565

Open
andyferris opened this issue Oct 8, 2018 · 4 comments
Open

findfirst on dictionaries returns ambiguous result #29565

andyferris opened this issue Oct 8, 2018 · 4 comments
Labels
search & find The find* family of functions

Comments

@andyferris
Copy link
Member

I was playing again with dictionaries and similar containers that may have any keytype (including Any) and was surprised by the new findfirst API. Take the following example

julia> dict = Dict(nothing => 3, 1 => 4)
Dict{Union{Nothing, Int64},Int64} with 2 entries:
  nothing => 3
  1       => 4

julia> dump(findfirst(isequal(2), dict))
Nothing nothing

julia> dump(findfirst(isequal(3), dict))
Nothing nothing

julia> dump(findfirst(isequal(4), dict))
Int64 1

In the above, the first two cases are completely ambiguous! It would be nice if I could distinguish the case that the key of the found item was was nothing, or if nothing was found. In this case, I wonder if it would be more natural for this API to return a type that you unwrap (like Some or the old Nullable or perhaps even something like a vector of length 0 or 1 which actually makes it quite compatible with the results of findall)?

(While some might consider the above example contrived, keep in mind that some AbstractDict containers are explicitly designed to reference literally all kinds of objects, IIUC especially within the compiler. It would make sense to me if the find* interface worked without limitations for these important containers).

Until now I have been more familiar with the findall API which gives distinctive answers (that I would expect) in all 3 cases:

julia> findall(isequal(2), dict)
0-element Array{Union{Nothing, Int64},1}

julia> findall(isequal(3), dict)
1-element Array{Nothing,1}:
 nothing

julia> findall(isequal(4), dict)
1-element Array{Int64,1}:
 1

I'm not sure if other people knew about this previously, or view this as a problem, and if so, how we might go about addressing this?

@nalimilan nalimilan added the search & find The find* family of functions label Oct 8, 2018
@nalimilan
Copy link
Member

nalimilan commented Oct 8, 2018

Yes, that's been discussed before. We didn't change the default behavior since it would be inconvenient for the (very common) cases where you know the keys cannot be nothing. But we need an API to get Some results for ambiguous cases -- maybe just by passing Some as an argument to the find* functions.

@jariji
Copy link
Contributor

jariji commented Jun 17, 2023

We didn't change the default behavior since it would be inconvenient for the (very common) cases where you know the keys cannot be nothing. But we need an API to get Some results for ambiguous cases -- maybe just by passing Some as an argument to the find* functions.

I think this is too ad hoc. It would work in a scripting language where it's assumed that the person calling findfirst is the end user. In a language with libraries and generic programming, it doesn't work: If a library function uses findfirst, they now need to consider whether their user might be passing them nothings, which generally they won't know, and might not even realize they have to document that it's disallowed.

Calling something(_) is annoying, and a nicer syntax like Try.jl's @? would be better, but the cost of having fragile abstractions throughout the ecosystem is worse: now the user has to put Some around their own values to accommodate unsafe calls by the library.

Cautioning against use of findfirst in generic contexts where nothing could be a key, in favor of

i, _... = findall(f, x)

for an error or

ixs = findall(f, x)
isempty(ixs) ? nothing : Some(first(ixs))

seems like the safest approach.

@jariji
Copy link
Contributor

jariji commented Jun 17, 2023

Maybe this is a more productive idea: can we throw an error if the found index is nothing? It's not perfect because it breaks the use case where the caller knows the value is present, but at least it's a loud breakage. Currently there's silent incorrectness for the case where the index is nothing under naive usage.

@jariji
Copy link
Contributor

jariji commented Jun 26, 2023

Another idea for 1.x: add a default parameter that's returned in the failure case.

This isn't a good solution for 2.0 because it tempts library authors to assume nothing is an okay default even when they don't control the input. But as a nonbreaking change it would be better than the current situation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
search & find The find* family of functions
Projects
None yet
Development

No branches or pull requests

3 participants