Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP/RFC: add 2-argument get returning Union{Nothing,Some} #34821

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

JeffBezanson
Copy link
Member

This form is a bit more general, and would eventually allow us to eliminate the loathsome secret_table_token.

Obviously incomplete; we should first discuss whether we want this.

@JeffBezanson JeffBezanson added design Design of APIs or of the language itself collections Data structures holding multiple items, e.g. sets labels Feb 19, 2020
@rfourquet
Copy link
Member

I have long wanted this and am happy to see this discussed after the old attempt with nullables in #18211.

As a possible other meaning of two-args get: I'm currently writing a package with a couple implementation of multidicts, and while exploring possible APIs (still early stages of thinking), I currently implement, for a multidict d:

  • d[k] returns an iterable of all the values associated with key k, possibly empty
  • d[k] = (1, 2, 3) erase all old values of k and add new ones from the rhs iterable
  • get(d, k, default) like for dicts, but returns only one of the possible values

This is not the place to discuss such an API, but in this case there is no function which throws when no key is found, so I though that get(d, k) could be like get(d, k, default) except it would throw when k is no present in d. But for a normal dict, it would then be equivalent to d[k]. I think the meaning of get in this PR has my preference, but just wanted to share this alternative.

@andyferris
Copy link
Member

andyferris commented Feb 23, 2020

Reflecting on @rfourquet's comment I do have to say, there is a semantic that kinda works for dictionaries and a possible multi-dict. Instead of Union{Nothing, Some{T}} being returned from get, a container with zero or one elements could be returned, for example AcceleratedArrays.jl has a MaybeVector which can be the result of findall (or filter or whatever) on a container that is known to have unique elements - in the case of get the uniqueness is in the keys.

One neat thing about this as a form "nullable" type is that many operations just fall out. "Optional chaining" can be done via broadcasting. "Nullish coalescing" is somewhat like three-argument get. User's don't need to learn new concepts to deal with optional results (e.g. you can use isempty instead of isnothing - basically the API surface is reduced to just the container interface).

Anyway, I understand we settled on the Union{Nothing, Some{T}} pattern during Julia v0.7 discussions, but @rfourquet's comment reminded me of this. (I also realize this is a bit closer to the original Nullable{T}...)

(@rfourquet Multidict could have getindex require the key to be present, two argument get would be almost the same but give an empty vector instread of error when the key is not present... anyway just a thought)

@andyferris
Copy link
Member

Another strange idea: syntax sugar for two-argument get being dict?[key].

@tkf
Copy link
Member

tkf commented Feb 23, 2020

dict?[key]

This is a neat syntax but I think it'd be incompatible with f?(x) if f?(x) is going to work on Union{T, Missing} instead of Union{Some{T}, Nothing}. https://discourse.julialang.org/t/status-of-question-mark-syntax-for-missing-values/27189/2

Having said that, I cannot help mentioning that there is a nice generalization of this in the case dict?[key] appears on both hand sides. For example,

dict?[key] = something(dict?[key], 0) + 1

can be lowered to

modify!(dict, key) do val
    something(val, 0) + 1
end

where modify! is the API I suggested in #33758 to do lookup/update/insert/delete in one go.

The signature of modify! is modify!((Union{Some,Nothing} -> Union{Some,Nothing}), dict, key) -> Union{Some,Nothing} and I think it's a nice counterpart of get(dict, key) -> Union{Nothing,Some}. In fact, I think pretty much all the getter/setter functions can be written in terms of get and modify!. This is one of the reasons why I think 2-argument get is a nice addition.

jl_value_t *jl_eqtable_get1(jl_array_t *h, jl_value_t *key, int *found)
{
void **bp = jl_table_peek_bp(h, key);
if (found)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (found)

@rfourquet
Copy link
Member

rfourquet commented Mar 18, 2020

Maybe another point to consider: pop! and get are very similar in their 3-argument APIs, the difference being that pop! removes the element in addition to returning it. As pop!(dict, key) already exists and throws an error on inexistant key, that might support having the same for get (or eventually changing pop! to return Union{Nothing,Some} if get is made to do that).

Although I'm not convinced of the need for consistency here, as pop! and get have different usage patterns.

@andyferris
Copy link
Member

We have that behaviour in getindex, which already throws if there is no key. get is meant to be the “optional” variant of getindex. I suppose there could be a default to pop! - but it potentially becomes harder to reason about the mutation.

@tkf
Copy link
Member

tkf commented Mar 18, 2020

@rfourquet For Union{Nothing,Some}-based (potentially) mutating API, I suggest modify! #33758. It's quite easy to "derive" other APIs from it:

maybepop!(dict, key) = modify!(_ -> nothing, dict, key)

This returns a Union{Nothing, Some{valtype(dict)}}.

@andyferris IIUC, @rfourquet is pointing out the similarity of the existing APIs get!(dict, key, default) and pop!(dict, key, default). If get!(dict, key) returns a Union{Nothing,Some}, it is reasonable to assume the same for pop!(dict, key) if you are new to this API. However, pop!(dict, key) already exists and it throws.

I don't think we can do anything about pop!(dict, key) due to backward compatibility.

@jw3126
Copy link
Contributor

jw3126 commented Jul 10, 2020

I think this would break quite some code. For instance, it will break Setfield.jl hard. I would guess we are not the only ones that overloaded Base.get in conflicting ways. I get (no pun intended) that this overload was somewhat daring, though. I am just begging for mercy 😅

@StefanKarpinski
Copy link
Member

The life of a type pirate is fraught with peril!

@jw3126
Copy link
Contributor

jw3126 commented Jul 10, 2020

The life of a type pirate is fraught with peril!

I guess so. Just to clarify Setfield.jl overloads get(::Any, ::Setfield.Lens). So this is as much piracy as Base.*(::MyMatrix, ::AbstractMatrix). It hurts me, but I am ok with sacrificing the get overload for the greater good.

@StefanKarpinski
Copy link
Member

To be clear, I don’t have a strong opinion either way here and we should avoid causing undue breakage. Mostly just a joke.

@tkf
Copy link
Member

tkf commented Jul 10, 2020

If we want dict?[key] and in general container?[i, j, k] to be the Union{Some{T}, Nothing} variant (and not Union{T, Missing}), I think we need something other than 2-arg get for this. This is because we can't have vararg form container?[i, j, k] with get since it already has a specific 3-arg semantics.

I think @andyferris is designing "multi-dimensional" dictionary. So, thinking about multi-dimensional key for "2-arg get" might be important.

@jw3126
Copy link
Contributor

jw3126 commented Jul 11, 2020

Another possible name for this would be probe(collection, key)

@cmcaine
Copy link
Contributor

cmcaine commented Sep 17, 2020

I'm probably being silly, but what's the point of this over haskey(dict, key)?

I'm imagining code like:

if haskey(dict, key) 
    value = dict[key]
    # do something
else
    # do something on failure
end

versus

x = get(dict, key)
if x !== nothing
    value = something(x)
    # do something
else
    # do something on failure
end

There's one more hash going on in the first example, but maybe that could be optimised away?

@andyferris
Copy link
Member

There's one more hash going on in the first example, but maybe that could be optimised away?

Unfortunately, optimizing a repeated hash lookup away seems unlikely to be feasible. There's just too much un-inlined code and Julia doesn't have the pure-functional semantics that might enable the detection of bulk repeated work. (It would also be nice to have an interface that presents the efficient path for all sorts of dictionaries - less round trips to a remote database, for example).

@vtjnash
Copy link
Member

vtjnash commented Nov 12, 2021

In terms of API, perhaps this would be more consistent to call tryget, to go with the pattern set by tryparse and trylock, instead of Base.maybeview (to exhaustively list all functions that currently match that naming pattern)?

@JeffBezanson
Copy link
Member Author

I like tryget 👍

@tkf
Copy link
Member

tkf commented Apr 25, 2022

I'm suggesting to use Union{Ok,Err}-valued functions for this. See #45080

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets design Design of APIs or of the language itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants