Add a parametric Nullable{T} type #8152

johnmyleswhite · 2014-08-27T02:11:08Z

This adds a parametric Nullable{T} type that can represent a value of type T that may be missing. It's got a very minimal interface with the hope that this will encourage you to resolve the uncertainty of whether a value is missing as soon as possible.

Work left to do:

Get test suite working
Add a section to the manual based on the docs available at https://github.com/johnmyleswhite/NullableTypes.jl

I also did a little bit of whitespace removal along the way, which is hopefully forgivable.

JeffBezanson · 2014-08-27T02:42:29Z

👍

Is the error behavior of == part of the "encourage you to resolve the uncertainty" design?

I think we could get by without Null and NotNull. It will be simpler in the long run.

johnmyleswhite · 2014-08-27T03:56:31Z

Yeah, the error behavior was based on my thought that, since we should raise errors when comparing anything with a null value, it's easier to just not try comparing Nullable objects at all than to wait for a run-time error when your equality comparison hits its first null.

Agree that we can just use Nullable. Null and NotNull are an inheritance from a time when this code tried to imitate a standard Option type more closely.

porterjamesj · 2014-08-27T05:21:49Z

Once this is merged it's probably worth being very clear in the manual section precisely what the semantic differences are between Null{T} and Nothing. Having two types to represent absence is sure to be a point of confusion for many.

ivarne · 2014-08-27T05:40:13Z

We will have 3 concepts for nothing: Nothing/nothing, None/Void, and NullableTypes. Looking forward to making http://docs.julialang.org/en/latest/manual/faq/#nothingness-and-missing-values even more complicated.

porterjamesj · 2014-08-27T05:46:02Z

Right, I forgot about None and Void.

ivarne · 2014-08-27T06:44:32Z

I wonder if get is the correct function to overload in this case. Currently get is part of the interface for collections, and Null{T} does not feel like a collection to me. Maybe a new function val or value?

toivoh · 2014-08-27T07:05:43Z

Well, it's a collection with zero or one element. Perhaps it should be iterable and maybe even indexable. Not sure what that implies with regard to get, though.

rfourquet · 2014-08-27T07:16:33Z

I find the name slightly misleading, as a nullable value is immutable and as such cannot be nulled after construction. I quickly re-read the thread on julia-users, and I'm not sure whether the question of supporting both "ontological/statistical missingness" has been decided. But if Nullable only supports statistical missingness, someone will come and request an Optional{T} (a 4th concept of nothing) for ontological missingness. I wouldn't implement Optional differently from Nullable, so why not support both with the same type? Moreover, @johnmyleswhite's view that "I’ve come to really like the interpretation of Option{T} [now Nullable{T}] as a 0-or-1 element container type" is the very concept behind C++'s (ontological) optional.

rfourquet · 2014-08-27T07:23:07Z

With the view of a container with 0-or-1 elements, I would personally find getindex(x::Nullable) = get(x) quite natural, similar to Jameson's Ref{T} type and to C++'s optional dereference operator. And it could allow bypassing the ugly unsafe_get via @inbounds.

nalimilan · 2014-08-27T07:47:28Z

Regarding ==: what's the suggested pattern to compare two Nullable? If that's get(x) == get(y), then I don't see why you wouldn't implement x == y as an equivalent, shorter syntax.

JeffBezanson · 2014-08-27T14:30:33Z

The main advantage of get is that you can specify a default value.
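A minimal sketch of the two forms of get mentioned here, written in current Julia syntax. It assumes a Julia version whose Base does not define Nullable, so it uses a hypothetical stand-in type named SketchNullable; the field names and the ArgumentError are assumptions for illustration, not the code in this PR.

```julia
# Hypothetical stand-in for the Nullable{T} under discussion (not the PR's code).
struct SketchNullable{T}
    isnull::Bool
    value::T
    # Empty constructor: `value` is deliberately left uninitialized.
    SketchNullable{T}() where {T} = new{T}(true)
    # Wrapping constructor: stores a present value.
    SketchNullable{T}(x) where {T} = new{T}(false, convert(T, x))
end
SketchNullable(x) = SketchNullable{typeof(x)}(x)

# One-argument form: throws when no value is present.
Base.get(x::SketchNullable) =
    x.isnull ? throw(ArgumentError("no value present")) : x.value

# Two-argument form: falls back to a caller-supplied default.
Base.get(x::SketchNullable{T}, default) where {T} =
    x.isnull ? convert(T, default) : x.value

present = SketchNullable(42)
absent = SketchNullable{Int}()

get(present, 0)  # the wrapped value, 42
get(absent, 0)   # the default, 0
```

The default form lets a caller resolve missingness in a single expression, without a separate isnull check.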

I don't think an Option type or this Nullable type can or should specify exactly what value-missingness means. It's for any case where there could be a value, but there isn't one right now.

In C# Nullable is a value type, so mutability is not supposed to be implied here. But I think it would be ok to call this Option or Optional instead.

The name None has become very unfortunate since it is so confusingly different from python. Python's None is actually our nothing. Our None should be renamed not to sound like a generic null-ish value. We could use VoidType or EmptyType. Nothing should probably also be renamed NothingType or something like that, so the nothing/Nothing distinction is clearer. (A related change is that the Void alias for ccall should really refer to NothingType and not None.)

StefanKarpinski · 2014-08-27T14:39:33Z

I actually very much like our choices of names for None, Nothing and nothing and have found that people are mostly not confused by them since they are so semantically apt. Perhaps Python should change its naming instead ;-)

kmsquire · 2014-08-27T14:43:08Z

Regular expression matching would be another good target for this once it's merged.

JeffBezanson · 2014-08-27T14:52:03Z

Yes the names are apt, but I have seen people use Nothing instead of nothing several times and I can hardly blame them.
None is guilty of stealing a short, generic word for something that you almost never use, and should almost never use. Its aptness only gets it partially off the hook.

johnmyleswhite · 2014-08-27T15:26:44Z

Responses to various comments:

Should we allow ==? I'm not inclined to encourage people to compare Nullable objects. I'd rather that they test isnull and only compare values if actual values exist. == is kind of nuts no matter which perspective we take since all three possible implementations are weird: (1) == always raises an error, (2) == raises an error if at least one of the inputs is null, (3) == returns a Nullable{Bool} object when you do any comparison so that NULL == NULL => NULL. All seem bad. Universally raising errors seemed least bad.
There's not meant to be any direct commitment to either an epistemological or an ontological view of missingness here. To commit to an epistemological view, you need to implement three-valued logic, which I'd rather not do. Because this code mostly raises errors, it's closer to an ontological view. But I'd rather just offer a building block that lets you implement either.
I don't see a big need to implement functionality like getindex or iteration for Nullable since it won't increase the expressivity of the construct. But I could do it for symmetry if desired.
I personally think Nullable is a better name because it makes it easier to see that this construct is the basis for representing NULL. It is a little strange that you can't "null" a Nullable object, but I think it's at least defensibly strange.
I'm very much in favor of renaming None to EmptyType and Nothing to NothingType. The current names make those types seem much more useful than they are.
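To make the == discussion concrete, here is a rough sketch of option (1) together with the "test isnull first, then compare values" pattern. It is written in current Julia syntax and assumes a Julia version whose Base defines neither Nullable nor isnull. SketchNullable, values_equal, the ifnull keyword and the ArgumentError are all hypothetical names chosen for illustration, not this PR's implementation.

```julia
# Hypothetical stand-in for the Nullable{T} under discussion (not the PR's code).
struct SketchNullable{T}
    isnull::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(true)               # empty
    SketchNullable{T}(x) where {T} = new{T}(false, convert(T, x))  # holds x
end
SketchNullable(x) = SketchNullable{typeof(x)}(x)

isnull(x::SketchNullable) = x.isnull

# Option (1): comparing two wrappers always raises, whatever they contain.
Base.:(==)(a::SketchNullable, b::SketchNullable) =
    throw(ArgumentError("compare the unwrapped values instead"))

# Resolve missingness first, then compare the underlying values.
# `ifnull` is what the caller wants back when either side has no value.
function values_equal(a::SketchNullable, b::SketchNullable; ifnull::Bool=false)
    if isnull(a) || isnull(b)
        return ifnull
    end
    return a.value == b.value
end

values_equal(SketchNullable(1), SketchNullable(1))       # true
values_equal(SketchNullable(1), SketchNullable{Int}())   # false (the default for ifnull)
```

Pushing the decision about nulls into an explicit argument keeps the comparison itself two-valued, so no three-valued logic is needed.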

StefanKarpinski · 2014-08-27T15:40:30Z

I agree with the choice of Nullable as most intuitive to the most people. Option is a weirdly broad term that only suggests the right meaning to a very small set of people who will be using this. None doesn't actually need a name – we can write it as Union(). I don't care for renaming Nothing – yes, people misuse it, but that's going to happen.

johnmyleswhite · 2014-08-27T15:59:51Z

Updated:

Tests run now
Null and NotNull are both replaced with Nullable

johnmyleswhite · 2014-08-27T16:00:46Z

I think the fact that nothing and Nothing differ only in case is the source of misuse, though. Imagine if int were a value of type Int.

JeffBezanson · 2014-08-27T16:00:51Z

+1 to John's responses.

I kind of like the idea of using only Union(). That should help clarify the un-useful nature of the beast. Adding lots of names for things was fun back when there were only 200 of them, but now removing names is a much greater virtue.

johnmyleswhite · 2014-08-27T16:01:20Z

+1 to Union

IainNZ · 2014-08-27T16:03:49Z

None => EmptyType is not bad though, or NoneType, or anything verbose, because you rarely if ever want to type it.

IainNZ · 2014-08-27T16:04:06Z

If we change the meaning of x = [] where will users even see None?

JeffBezanson · 2014-08-27T16:07:02Z

Here's an interesting idea: rename Nothing to Void. In C, void is for things that return but don't return a value, which is what we use nothing for. A ccall with return type Void actually returns nothing in julia. There are various hacks in the system to patch around the fact that None is not actually the correct type to map C's void. I used None at first because I figured Nothing would be a lie --- the C code does not return a julia nothing value. But None is a worse lie. We should just fix this.

Void is much less problematic because it's close to the C usage, and doesn't collide with expectations from other dynamic languages.

johnmyleswhite · 2014-08-27T16:07:49Z

I like the mix of Void and Union() a lot.

StefanKarpinski · 2014-08-27T16:57:50Z

So nothing is the singleton instance of Void? And None is just Union()? Not bad. I worry that the difference between nothing and Void is going to be very confusing though. In C, void is the type with no instances, so it's pretty confusing that it would have an instance in Julia. Ptr{Void} would no longer mean what it means now.

JeffBezanson · 2014-08-27T17:40:59Z

That's a fair point --- in the case of Ptr{Void}, Ptr{None} is actually correct: dereferencing it is an error. We could instead hack in an error for dereferencing Ptr{Nothing}, but then of course the hack has just moved elsewhere.

Interestingly the following C code seems to be legal:

void f() {
    void *p = 0;
    return *p;
}

int main() {
    f();
    return 0;
}

This compiles, runs, and doesn't segfault. So giving nothing and not touching memory when deref'ing a Ptr{Void} is actually not so different from C.

johnmyleswhite · 2014-08-27T18:26:54Z

Updated with a draft of a manual section. I'm really bad with RST, so please make sure I haven't done anything very stupid. In particular, I'm worried about the interaction of doctest with a snippet of code that's supposed to throw errors.

StefanKarpinski · 2014-08-27T19:17:07Z

So, I'm a bit concerned that Nullable(T) where T is a type is ambiguous: did you want Nullable{T}() or Nullable{DataType}(T)?

JeffBezanson · 2014-09-19T15:07:42Z

I also think Nullable{T}() is perfectly analogous to Dict{T,S}(). They both basically make empty containers. They should match; if one is bad then the other is bad too, and we need a different convention for empty containers.
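A small sketch of the constructor convention being argued for, in current Julia syntax. It assumes a Julia version whose Base does not define Nullable, so SketchNullable is a hypothetical stand-in rather than the type in this PR. Type parameters go in braces and values go in parentheses, so passing a type in parentheses simply wraps that type as a value.

```julia
# Hypothetical stand-in for the Nullable{T} under discussion (not the PR's code).
struct SketchNullable{T}
    isnull::Bool
    value::T
    SketchNullable{T}() where {T} = new{T}(true)               # empty
    SketchNullable{T}(x) where {T} = new{T}(false, convert(T, x))  # holds x
end
SketchNullable(x) = SketchNullable{typeof(x)}(x)

# Both of these build an empty, typed container from type parameters alone.
empty_dict = Dict{String,Int}()
empty_int = SketchNullable{Int}()

# A type passed as an argument is treated like any other value:
# this is a non-null SketchNullable{DataType} holding Int.
wrapped_type = SketchNullable(Int)
```

Under this convention there is nothing to disambiguate: the call never has to inspect whether its argument is a type.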

johnmyleswhite · 2014-09-19T15:09:19Z

I think that's a good argument for @eschnett's proposed solution.

quinnj · 2014-09-19T15:34:15Z

@JeffBezanson, note my proposal was Null{T}(x::Nullable{T}) = isnull(x) ? x : Null(T) with Null, not Nullable, meaning you would always get a null Nullable back when using Null, whether called on a null Nullable or non-null Nullable.

If we go with the Null and NotNull constructors, I don't see why Null{T}(x::T) = Nullable{T}() couldn't be had (in addition to the special case for Nullable above).

julia> NullableTypes.Null{T}(x::T) = Nullable{T}()
Null (generic function with 2 methods)

julia> Null(1)
Null(Int64)

julia> Null(Int)
Null(Int64)

JeffBezanson · 2014-09-19T15:45:56Z

You seem determined to introduce some f(x) whose behavior is subtly different based on whether x is a type. I simply don't see the advantage of this. Even if one considers it acceptable, I don't see how one can argue it is the simplest and least confusing option.

The danger here is that Nullable is extremely generic, parametrically polymorphic to the max: it makes equal sense for absolutely any value.

The function oftype of course has a similar sketchiness: oftype(1,1.0) and oftype(Int,1.0) both work. I don't love that either, but we can just barely get away with it because converting 1.0 to typeof(Int) doesn't make sense. However Null(x) easily makes sense for all x, so there is not much reason to sometimes take the type of x and other times not.

quinnj · 2014-09-19T15:59:10Z

No, that makes sense. Particularly the arguments for simplicity and the power of Nullable. I'd vote then to go with the Nullable{T}() and Nullable(x::T) options. Looking forward to kicking the tires on this some more (for the ODBC and SQLite packages).

kmsquire · 2014-09-19T18:02:37Z

I also think Nullable{T}() is perfectly analogous to Dict{T,S}(). They
both basically make empty containers. They should match; if one is bad then
the other is bad too, and we need a different convention for empty
containers.

I had argued for this change before:
#4871 (comment)

I think it would be better to have a consistent convention for
creating typed containers (Arrays, Dicts, Sets, and the various containers
in DataStructures.jl). Currently, Dicts and Sets are special.

JeffBezanson · 2014-09-19T18:16:37Z

See also #3214. I dislike things like Container(T) more and more, since it's totally unclear which are type parameters and which are elements. The plan is for Array to remain the lone exception until #1470 is fixed.

eschnett · 2014-09-19T18:21:37Z

It's probably way too late in the discussion to bikeshed the name of the type... I don't like the name Nullable, as this implies that one can perform a certain action on the respective object. For example, Comparable would imply that == is defined, and Printable would indicate that the type can be output.

Nullable does not indicate such a property; an nval::Nullable{Int} is an immutable object, and there is no operation e.g. null(nval) that would modify nval. Also, the notion of null is tied to C and pointers, which is very different from the implementation here, which is more efficient.

Haskell calls this type Maybe (you have maybe an int, and maybe you have nothing) -- a cool name, but it takes a bit of getting used to. Boost calls it Optional (you may have an int, or you may not) -- this is probably a good name that everybody immediately understands.

I like Optional. To check whether an optional value is present, one could call a function ispresent (instead of isnull).

kmsquire · 2014-09-19T18:27:54Z

+1 for consistency and Nullable{T}() then!

JeffBezanson · 2014-09-19T18:35:51Z

@eschnett I agree with everything you've said in this thread, including that it is too late to bikeshed the name :)

I'm actually not particularly attached to Nullable, and would be ok with Maybe or Optional or perhaps Opt if you're into the whole brevity thing. But as I said above Nullable is a well-established term of art that does not imply mutability. Interestingly, your examples Comparable and Printable also do not involve mutation. Nullable does in fact imply certain non-mutating methods, like isnull and get. Your argument actually supports the position that -able is not tied to mutation; it does not support your stated position.

nalimilan · 2014-09-19T18:43:03Z

@eschnett I think one of the points in favor of the Nullable term is that in SQL missing values are called NULL, and dealing with missing data is one of the big interests of this new type. Nullable is also called that way in C# and Java, though Option and Maybe seem to be equally popular, according to Wikipedia.

eschnett · 2014-09-19T18:48:24Z

The term "Nullable" indicates that there is some kind of operation that the object supports, namely "nulling" it. This does not really indicate that (a) there is a function isnull, or (b) there may not be a value present. That's what I meant when I spoke about modifying -- the term "nulling" sounds as if something could be modified, and that's not the case. I didn't mean to imply that the suffix "-able" indicates mutability, as you agree.

I guess we come from different programming language backgrounds. When I compare https://en.wikipedia.org/wiki/Nullable_type and https://en.wikipedia.org/wiki/Option_type, then I'd place Julia firmly in the latter category...

JeffBezanson · 2014-09-19T19:04:34Z

I read the Nullable type article, and nowhere does it mention an operation of "nulling" a value. The Option type article says "Outside of functional programming, these are known as nullable types." That seems to mean different kinds of programming have different names for the same thing, not that there are different kinds of option types (e.g. mutable vs immutable).

We agree that "X is nullable" does not imply that X supports some mutating operation. Why then would you say that "nullable" implies "nulling", which is a mutating operation? Maybe "nulling" means "constructing a similar value that is null". So I think the term is reasonable, making some allowance for the limitations of human language.

johnmyleswhite · 2014-09-19T19:52:55Z

FWIW, I think the argument about names is not likely to prove fruitful.

First off, the English suffix "-ble" does not precommit the earlier morpheme to any specific interpretation: compare livable, visible, defensible, potable, etc. Some of these involve transitive verbs, but some do not.

Second off, our type isn't identical to an Option or Maybe, since it's not a tagged union, but a distinct parametric type. This is a somewhat minor point, but using a distinct name will help to keep the type theory folks from complaining about the use of terms that they perceive to have highly specialized meanings.

JeffBezanson · 2014-09-19T19:59:44Z

I agree the naming debate is not very fruitful, I was just starting to enjoy it :)

JeffBezanson · 2014-09-19T23:08:50Z

With the Nullable{T}() change I would like to merge this.

johnmyleswhite · 2014-09-20T00:29:55Z

Updated to use Nullable{T}(). Should be good to go now.

johnmyleswhite · 2014-09-20T00:55:20Z

And Travis gives us the green light.

johnmyleswhite · 2014-09-20T15:47:16Z

Bump.

IainNZ · 2014-09-20T16:36:57Z

🍰

johnmyleswhite force-pushed the jmw/nullable branch from 529893b to 4078c04 Compare August 27, 2014 15:57

johnmyleswhite force-pushed the jmw/nullable branch from 4078c04 to f983e0e Compare August 27, 2014 18:26

Add a parametric Nullable{T} type

892e746

johnmyleswhite force-pushed the jmw/nullable branch from 48856b8 to 892e746 Compare September 20, 2014 00:28

johnmyleswhite mentioned this pull request Sep 20, 2014

Rename None to Union() and Nothing to Void? #8423

Closed

JeffBezanson added a commit that referenced this pull request Sep 20, 2014

Merge pull request #8152 from JuliaLang/jmw/nullable

7f47e6b

Add a parametric Nullable{T} type

JeffBezanson merged commit 7f47e6b into master Sep 20, 2014

JeffBezanson deleted the jmw/nullable branch October 25, 2014 17:23

ihnorton mentioned this pull request Dec 15, 2014

RFC: Missing values by Sentinels #9363

Closed

vchuravy mentioned this pull request Dec 25, 2014

[RFC] Map for Nullable #9446

Closed

jiahao mentioned this pull request Dec 28, 2014

Update "Try to avoid nullable fields" advice #9480

Closed

andyferris mentioned this pull request Oct 20, 2016

Implement more operators on Nullable with lifting semantics #19034

Closed
