Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Remove array type assertion for find to allow other iterables #16022

Closed
ararslan opened this issue Apr 23, 2016 · 13 comments
Closed

Comments

@ararslan
Copy link
Member

Other find-like functions, e.g. findfirst, findlast, findprev, etc., accept a string argument because the type of the iterable doesn't matter, but find requires an AbstractArray. Consider the following code, which is how find is implemented as of this writing:

function find(testf::Function, A::AbstractArray)
    # use a dynamic-length array to store the indexes, then copy to a non-padded
    # array for the return
    tmpI = Array(Int, 0)
    for (i,a) = enumerate(A)
        if testf(a)
            push!(tmpI, i)
        end
    end
    I = Array(Int, length(tmpI))
    copy!(I, tmpI)
    I
end

It seems to me that it would work just as well to remove the type for A, which would make this function more consistent with its find-like friends.

Thoughts?

@nalimilan
Copy link
Member

Makes sense. In that case, I also change find consistency. This might require a bit more work, since it uses similar, which isn't defined for non-AbstractArray. Though that's another case where the use of similar can be questioned: since both the element type and the size are passed, in practice AFAICT we always create an Array, which means we might as well call Array and document it.

@nalimilan
Copy link
Member

Oh, and would you mind preparing a PR (maybe wait for a bit in case somebody disagrees with your proposal)?

@ararslan
Copy link
Member Author

@nalimilan I'm not sure what you mean about changing the find consistency. The method find(A::AbstractArray) uses similar but that's using the definition of find that locates nonzero elements in the input. I suppose the closest analogue to that for a string would be finding non-null bytes, but I don't know how much sense that would really make to have in there. Is that what you meant?

Regarding the PR, sure that'd be no problem, though the change I'm proposing would be as simple as removing ::AbstractArray from the method signature, i.e. find(testf::Function, A).

@nalimilan
Copy link
Member

find(A) doesn't make much sense for strings indeed (and with #16024 it won't even work), but strings aren't the only kind of iterable type which one might want to pass to find. And it's always good to keep the API as consistent and as general as possible. So if you change find(f, A), please also change find(A).

@ararslan ararslan changed the title Feature request: String method for find Feature request: Remove array type assertion for find to allow other iterables Apr 25, 2016
@ararslan
Copy link
Member Author

@nalimilan Out of curiosity, what would you expect find("hello") to do? It would be a fairly trivial modification to make similar work for iterables other than arrays (just collect it for the first argument), but the check against 0 in the loop would could fail in weird ways, not just when int/char comparison is deprecated.

The more I think about this, the more it seems like the AbstractArray assertion really belongs in find(A). Otherwise there are too many odd edge cases with various iterables for find(A). Removing the assertion for find(testf::Function, A) wouldn't run into those kinds of problems though because no comparisons are done, only evaluation of the given function and iteration through the given iterable.

@StefanKarpinski
Copy link
Member

Giving all the indices of non-NUL characters would be a pretty weird behavior for this.

@ararslan
Copy link
Member Author

@StefanKarpinski Yeah, I can't think of a result that would make sense for find("string"). It'd be great to be able to find indices of a string based on a function though. I'm not convinced that removing the array type assertion for both cases of find is the best way to go here. But I don't know. ¯_(ツ)_/¯

@StefanKarpinski
Copy link
Member

I think that having find("string") giving the array of indices of all characters would make sense.

@ararslan
Copy link
Member Author

Pull requested! 😄 I made it so that find will return all indices of a string as @StefanKarpinski suggested. Thank you both so much for your comments, I really appreciate it!

@nalimilan
Copy link
Member

I think that having find("string") giving the array of indices of all characters would make sense.

-1 That's would be inconsistent with the definition of find for general iterables. IMHO the best behavior for find("string") is a MethodError. The only consistent alternative is to return the indices of non-null characters, but that's not really useful nor intuitive.

@ararslan
Copy link
Member Author

ararslan commented Apr 29, 2016

The problem with making string input a MethodError is that it would be inconsistent between strings and string iterators, e.g. Base.UTF8proc.GraphemeIterator, which would still produce an array of all indices.

@nalimilan
Copy link
Member

You're right, though with the idea of making == raise an error for incomparable types (#16024 (comment)), both would fail. But for now better apply the same code to all iterables so that the behavior is consistent.

@Keno Keno closed this as completed May 4, 2016
@Keno
Copy link
Member

Keno commented May 4, 2016

Closed by #16110

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants