Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename haskey to hasindex #25105

Closed
wants to merge 1 commit into from
Closed

Rename haskey to hasindex #25105

wants to merge 1 commit into from

Conversation

andyferris
Copy link
Member

Part of #25104.

Basically, the idea is to replace our "key" terminology with "index" terminology, for consistency and to reduce the number of words used to describe the same thing. After discussion, it seems that "index" makes some sense for arrays and dictionaries, while "key" seems wrong for arrays, so we'd unify these in the direction of "index".

@vtjnash
Copy link
Member

vtjnash commented Dec 15, 2017

Seems fine as a rename, but what does hasindex mean for an array? The doc-string here may need an update. haskey is pretty simple: there exists an element with the given key (which reflexively means you can use it as an index). But what should this mean for other containers (such as arrays, but also other linear operators and strings), which have many, possibly non-enumerable, possibly non-consecutive bounds that can be used as valid indexes?

@yuyichao
Copy link
Contributor

Ref #18843

@nalimilan
Copy link
Member

CI failure is due to missing docstring:

1330.515498  !! No doc found for reference '[`hasindex`](@ref)'. [src/stdlib/collections.md]

@StefanKarpinski
Copy link
Member

Looking at this, I'm not sure I like it as much as I thought I would (unlike the axes change which I really rather like). I think that @vtjnash's questions are on point. Of course, the current situation where we have haskey and getindex are not quite harmonious...

@nalimilan
Copy link
Member

I don't see the problem personally. An array "has" an index if it's a valid index according e.g. to pairs, that's all? It could also allow checking whether an index is valid for a string.

@vtjnash
Copy link
Member

vtjnash commented Dec 15, 2017

The problem is the pairs returns a specific interpretation of an array that is not shared by all arrays (they could also be multidimensional/Cartesian or linear/Integer, or unassigned, for example). Also that we classify indexing failures very differently, illustrating that it’s hard to define what makes a valid index. (haskey/KeyError, isassigned/UndefRefError, String/UnicodeError)

@nalimilan
Copy link
Member

It's hard to define in general, but it's pretty clear for each given type whether an index exists or not. Any index which can be passed to getindex without getting an error is clearly valid.

@StefanKarpinski
Copy link
Member

Yes, is there anything wrong with the definition that x "has index" i if x[i] would return a value?

@vtjnash
Copy link
Member

vtjnash commented Dec 15, 2017

Do you also plan to change keys([]) to return a EachArrayIndexIterator (which filters out UndefVarError) and define isequal(::Int, CartesianIndex) to freely convert between those two representations. Do you intend for a return value of false to mean something also? In that that case, how do you handle linear indexing (or more generally, all cases where the indexes are non-unique)? These problems have been previously avoided by defining it based on the "key" attribute of the contained data and only for containers where each element only has exactly one key, by construction (aka "AbstractDict").

@andyferris
Copy link
Member Author

Great questions, Jameson.

First, it seems haskey(x, i) = i in keys(x) has always been the relevant definition here, but this isn't currently defined for arrays. This PR doesn't address that, it's just saying that we now wish to call the "keys" of a dictionary the "indices".

It's an important insight that arrays have multiple version of keys/indices (there are at least three ways of indexing, one linear indexing array[i], and two multidimensional indexing array[i, j, k] and array[CartesianIndex(i, j, k)]).

Maybe we can just overload haskey(x::AbstractArray, i) to make this work? I agree it's quite disappointing that it's hard to have a container returned from keys which makes sense with in and all forms of getindex, without the yucky hacks you mention.

It's for reasons like this that I always found linear indexing to be something distinct from standard indexing, potentially deserving a different function entirely. (E.g. maybe linear indices are tokens, not keys?)

@vtjnash
Copy link
Member

vtjnash commented Dec 15, 2017

that arrays have multiple version of keys/indices

Writing keys/indices this way skips my second question: what should it do when a valid index is not a valid key? An example of this is Array(uninitialized, 2). The axes here are (1:2,), but the valid indexes (keys?) are Ø.

Maybe we can just overload

Maybe so. But if linear indexing is not the standard, and multidimensional indices are, what does it mean to say in(i, j, k, indices()) and how would you express that in general terms?

And if we need to overload it, what should we define it to generically mean?

@ViralBShah
Copy link
Member

Closing as this was pre-1.0.

@ViralBShah ViralBShah closed this May 15, 2020
@DilumAluthge DilumAluthge deleted the ajf/hasindex branch March 25, 2021 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants