
type system design iteration #8974

Closed
JeffBezanson opened this issue Nov 11, 2014 · 67 comments
Labels: types and dispatch (Types, subtyping and method dispatch)
Milestone: 0.4

@JeffBezanson
Member

@jakebolewski, @jiahao and I have been thinking through some type system improvements and things are starting to materialize. I believe this will consist of:

  • Use DataType to implement tuple types, which will allow for more efficient tuples, and make tuple types less of a special case in various places. This might also open the door for user-defined covariant types in the future. (DONE)
  • Add UnionAll types around all uses of TypeVars. Each UnionAll will bind one TypeVar, and we will nest them to allow for triangular dispatch. So far we've experimented with just sticking TypeVars wherever, and it has not really worked. We don't have a great syntax for this yet, but for now will probably use @UnionAll T<:Int Array{T} (a rough sketch follows this list). These are mostly equivalent to existential types.
  • Unspecialized parametric types will actually be UnionAll types. For example Complex will be bound to @UnionAll T<:Real _Complex{T} where _Complex is an internal type constructor.
  • Technically, (Int,String) could be a subtype of @UnionAll T (T,T) if T were Union(Int,String), or Any. However dispatch doesn't work this way because it's not very useful. We effectively have a rule that if a method parameter T only appears in covariant position, it ranges over only concrete types. We should apply this rule explicitly by marking TypeVars as appropriate. So far this is the best way I can think of to model this behavior, but suggestions are sought.
  • TypeVars should only be compared by identity, and are not types themselves. We will not have syntax for introducing lone TypeVars; @UnionAll should suffice.
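
For concreteness, a rough sketch of how nested @UnionAll bindings might read; the nesting below, with the inner bound mentioning the outer variable, is illustrative rather than final syntax:

# One bound TypeVar per UnionAll; nesting expresses dependent ("triangular") bounds
@UnionAll T<:Real @UnionAll S<:Array{T} (T, S)

# Unspecialized parametric types would be UnionAlls themselves, conceptually:
# Complex == @UnionAll T<:Real _Complex{T}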

This is being prototyped in examples/juliatypes.jl. I hope this will fix most of the following issues:

method lookup and sorting issues:
#8959 #8915 #7221 #8163

method ambiguity issues:
#8652 #8045 #6383 #6190 #6187 #6048 #5460 #5384 #3025 #1631 #270

misc. type system:
#8920 #8625 #6984 #6721 #3738 #2552 #8470

In particular, type intersection needs to be trustworthy enough that we can remove the ambiguity warning and generate a runtime error instead.

@JeffBezanson JeffBezanson added this to the 0.4 milestone Nov 11, 2014
@Jutho
Contributor

Jutho commented Nov 11, 2014

+100
and another +100 for the 0.4 milestone.
0.4 will be so great

@timholy
Member

timholy commented Nov 11, 2014

This will be one heck of an improvement. I'm not sure I fully appreciate your intentions with ForAll, but I'm already eagerly anticipating it.

@StefanKarpinski
Member

Currently the scoping of typevars is both vague and limited. By explicitly wrapping them in a universal quantifier – i.e. ForAll – you can address both: the scope becomes explicit, and one can therefore do much more sophisticated things.

@toivoh
Contributor

toivoh commented Nov 11, 2014

This sounds great, but can you elaborate on the part

It turns out that we effectively have a rule that if a method parameter T only appears in covariant position, it is made invariant instead.

When does it apply? Which positions are considered covariant and which invariant? (Are there any contravariant positions?) Is that different in any way from making type parameters always invariant?

@jiahao
Member

jiahao commented Nov 11, 2014

@toivoh the idea was to describe how diagonal dispatch currently works.

Consider:

f{T}(x::T, y::T) = x + y          # matches a concrete type like T = Float64 but not an abstract type like T = Real
f{T}(A::Matrix{T}, b::T) = A .+ b # matches both T = Float64 and T = Real in 0.4-dev but only T = Float64 in 0.3

The first method has input signature forall T . (T, T), but unlike ordinary tuples it only allows (Float64, Float64), not (Real, Real), even though the former is a subtype of the latter. Therefore tuples in this context do not behave covariantly, but invariantly.

f(3.0, 4.0) # arguments of type (Float64, Float64)
f(3, 4.0)   # arguments of type (Int, Float64) <: (Real, Real) but does not match

The second has signature forall T . (Matrix{T}, T) and is matched invariantly in 0.3, but covariantly in 0.4.

f(zeros(Real, 3, 3), 4.0) # arguments of type (Matrix{Real}, Float64) <: (Matrix{Real}, Real)

The general rules appear to be:

  • v0.3: match tuples of input arguments invariantly, always
  • v0.4: match tuples of input arguments covariantly, unless the type parameter only appears in a fully covariant context (like (T, T, T, T...)), in which case, match the tuple invariantly.

The v0.4 rule is admittedly strange when written out like that.

@JeffBezanson
Member Author

I don't think it's the whole tuple that's matched invariantly, just the typevar. For example f{T}(x::Real, y::T, z::T) still allows any Real as the first argument.
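
To spell that out with a hypothetical method (assuming the diagonal rule under discussion):

f{T}(x::Real, y::T, z::T) = (x, T)

f(1.0, 2, 3) # ok: the first argument matches Real covariantly; T == Int
f(1, 2, 3.0) # MethodError: y and z would force T == Union(Int,Float64)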

@toivoh
Contributor

toivoh commented Nov 12, 2014

The new covariant behavior of ::T in e.g. f{T}(A::Matrix{T}, b::T) was news to me, very much appreciated! (I do follow most issues though, has this been discussed anywhere?) As I understand it, before this, type parameters were always considered invariant.

I agree that it's useful to be able to match e.g. f{T}(x::T, y::T) invariantly (or equivalently, force T to be concrete there), but I'm not sure that it's not useful to be able to match it covariantly.

I understand the rule from the point of view that when trying to match a signature against an actual argument tuple, each type parameter can always be bound to a type derived from one of the argument types, and then we check if the other constraints are fulfilled. I also see that this rule would minimize breakage while letting ::T be covariant in cases like f{T}(A::Matrix{T}, b::T).

Still, I'm not quite comfortable with the special casing. It makes it easy to unwittingly change a case from covariant to invariant or vice versa, and if you haven't read the manual very carefully, you might not even be aware of the distinction. Could we consider having :: be always covariant even on type parameters, and have another way to get the invariant behavior, like e.g. one of

f{T}(x:::T, y:::T) = x + y
f{T}(x::=T, y::=T) = x + y

This would make it immediately clear to the reader that something different is going on.
The ::: form would be in some kind of analogy with ===, which is stricter than ==.
Or maybe the type parameter T itself could be marked as invariant in some way.

@StefanKarpinski
Member

I have to admit that I'm not entirely comfortable with the special casing either, although I realize we've been doing something similar for a rather long time. Not sure how else to handle it, although I would mention that if you can dispatch on type parameters being concrete, then covariance would be a good choice.

@jiahao
Member

jiahao commented Nov 12, 2014

@JeffBezanson, @jakebolewski and I have had a followup discussion and we feel that ForAll is not the right name, in the sense of universally bounded quantifiers in type theory. Perhaps UnionAll is a better name.

@toivoh
Contributor

toivoh commented Nov 12, 2014

Thinking about this some more, I realize that it is mostly an issue of consistency rather than functionality. Looking at the prototypical case where a type parameter appears only in covariant position,

f{T}(x::T, y::T) = T

the covariant behavior can be achieved anyway with one of

f{S,T}(x::S, y::T) = Union(S,T)
f{S,T}(x::S, y::T) = typejoin(S,T)

depending on how you want the matching of T to behave. That alone might be reason enough to forbid a type parameter from acting covariantly everywhere it is used: it would require specifying a special-case behavior, and it feels better to require the user to choose explicitly instead. So if we do this for consistency, I would probably vote to make f{T}(x::T, y::T) an error and require that the user specify that T behave invariantly, or use one of the patterns based on f{S,T} above.
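
For reference, the two patterns bind different things (a minimal sketch in 0.3-era syntax):

g{S,T}(x::S, y::T) = Union(S, T)    # the exact union of the two argument types
h{S,T}(x::S, y::T) = typejoin(S, T) # their tightest common declared supertype

g(1, 2.0) # Union(Int64,Float64)
h(1, 2.0) # Real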

We would of course still permit type parameters appearing only once covariantly such as in f{T}(x::T) = T, where both the covariant and invariant behavior would be to set T to the type of x.

When it comes to functionality, the only thing it seems that we stand to lose with the rule is the possibility to let e.g. T work in an invariant manner in f{T}(::Array{T}, ::T). Not sure how big that loss is.

@toivoh
Contributor

toivoh commented Nov 12, 2014

One possibility for marking a type parameter as concrete might be to allow something like

f{T::Concrete}(x::T, y::T) = T

I have no idea how much havoc it would wreak in the type system to have all concrete types and no others be instances of Concrete.

It appears that allowing :: relations on type parameters would widen the domain of dispatch even further, allowing dispatch on the types of type parameters and not just on the types of arguments (meta-dispatch?). I can't say if this is a really bad idea or maybe even useful in cases other than T::Concrete. Of course, marking a type parameter as concrete could be done with some new special syntax without opening up this Pandora's box.

One thing to note is that f{T<:Concrete}(x::T, y::T) will not serve to mark T as concrete. I guess it has to be an axiom that if A <: T and B <: T then Union(A, B) <: T, so that would allow T = Union(typeof(x), typeof(y)) above. A union is not a concrete type (right?) in the sense that typeof(x) can never return a union type, so apparently concreteness can not be described by a subtyping relationship. But perhaps by an instance-of relationship.
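
As a small runtime illustration of the concreteness predicate under discussion (this is just Base's existing isleaftype; a dispatch-level T::Concrete would be a new mechanism):

isleaftype(Int)             # true
isleaftype(Real)            # false
isleaftype(Union(Int,Int8)) # false: a union is never the typeof() of a value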

@eschnett
Contributor

It's not just a question of whether T is covariant or invariant, it's also a question of how T is determined at the call site, i.e. how the pattern matching works. In the case of f{T}(::Matrix{T}, ::T), T may be forced to be an abstract type by the first argument. In the case of f{T}(::T, ::T), there is nothing that would immediately force T to be abstract.

Instead of arguing via covariance or invariance, I would describe Julia's current type matching rule as "match the (implicitly concrete) argument types against the types in the function signature, and deduce type parameters from that". Nothing in type matching is currently looking for supertypes (or unions) of two types to make things match.

Having said this, which is obviously not news to you: @JeffBezanson, are you looking for a way to introduce such a behaviour (automatically looking for supertypes or unions)? If so, what about allowing type arithmetic in function signatures? Or maybe it would be better to extend type pattern descriptors?

What would be the advantage of deducing supertypes or unions? Would it be to prevent specialization of routines? If so, couldn't that be expressed differently, with metadata instead of through the type system, or by analyzing the function body to see whether specialization is beneficial?

@JeffBezanson
Member Author

The question is whether a tuple of actual argument types is a subtype of the method signature, so this does reduce to a question of covariance vs. invariance (the argument tuple is covariant).

There wouldn't be any advantage to deducing supertypes or unions; we don't want that. However theory requires them, because (Int,String) is indeed a subtype of forall T . (T,T) if you let T be Union(Int,String). So we need an explanation for why we don't do that. If we did this outside the type system, you wouldn't be able to write down a type for the method that lets you reason about how it can be called, which I think would be unfortunate.

@eschnett
Copy link
Contributor

Thanks for the rationale.

I agree with @toivoh: It would then be necessary to express the notion "T is a concrete type" in the type system, since type matching starts from actual arguments, which are concrete types.

@JeffBezanson
Member Author

Hmm, yes, this may not be invariance exactly, since the intersection of forall T . (T,T) and (Real,Number) should be forall T<:Real . (T,T), while the intersection of forall T . Foo{T,T} and Foo{Real,Number} is empty.

With covariance, the intersection of forall T . (T,T) and (Real,Number) would be (Number,Number).

@toivoh
Contributor

toivoh commented Nov 13, 2014

I'm not sure that I understand the example. For any types R and S, surely their intersection T must satisfy T <: R && T <: S? So I don't see how the intersection of (Real,Number) with anything could be (Number,Number)? With covariance we should have (forall T . (T,T)) == (Any,Any) (right?), so I would expect the intersection with (Real,Number) to be (Real,Number).

I'm not sure how type intersections relate to the problem at hand. As I understand it, what we are discussing is

  • Under what conditions should a given argument tuple match a given function signature? This seems to be equivalent to asking whether Ta <: S, where S is the type signature and Ta is the type of the argument tuple. Since Ta is concrete, this should be equivalent to S and Ta having a nonempty intersection.
  • If there is a match, how should the type parameters in the signature be bound? I'm guessing that we are implicitly aiming to find the most specific T that gives a match, which is an optimization problem with type inequality constraints (though straightforward to solve, it seems). I'm not sure that the answer can be formulated based on a single intersection operation.

I guess that the tuple (Real,Number) in the example was meant to stand for the type of the actual arguments? That's confusing because it's not a concrete type.

@JeffBezanson
Member Author

Yes, I should have said (Real,Number).

Type intersection relates because it has to be consistent with how method calls behave. Things will break if method calls add in extra behaviors that the rest of the type system doesn't know about.

The question is how to conclude !( (Int8,Int16) <: (forall T . (T,T)) ). Currently, we must be implicitly using something like a constraint that T be a concrete type, but I'm not sure how to formalize this.

@StefanKarpinski
Member

One way to think about this – which is part of what makes these rules intuitive – is that you can work backwards very concretely from actual argument types in a straightforward way: if T is a type parameter and you have x::T then you can immediately take T = typeof(x). Similarly if you have a::Array{T} then you can take T = typeof(a).parameters[1]. This is actually rather different from quantification, which is part of the issue here.
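
Spelling that recipe out (a sketch; .parameters indexing as in 0.3/0.4):

x = 3.0
typeof(x)               # Float64: the T deduced for a signature f{T}(x::T)
a = Array(Int, 2)
typeof(a).parameters[1] # Int: the T deduced for a signature f{T}(a::Array{T})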

@JeffBezanson
Member Author

That's true, this is not really quantification. For one thing, subtyping with bounded quantification is undecidable, and I believe we are using a decidable variant. The forall is really a big set union; see jiahao's comment about calling it UnionAll instead. However by itself this doesn't yet solve the method parameter issue.

@toivoh
Contributor

toivoh commented Nov 17, 2014

About the case with a type parameter T that is only in covariant position:
From the matching point of view (choosing which method to dispatch to) it should be sufficient to set all such parameters to their upper bounds. So the only question left about handling such a case would be how to pick T once the method has been decided. One way would be to say that T is set to the most general type that works, e.g. its upper bound. Then we don't need to form a common supertype or union. Or it could be made an error.

@toivoh
Contributor

toivoh commented Nov 17, 2014

I think that we should make the constraint that T be concrete explicit, i.e. let the user write it out. I don't think there is any reasonable interpretation that will make taking T=typeof(x) from x::T fall out as a consistent special case. (The question is of course how much the inconsistency hurts.)

Btw, regarding the possibility of f{T::Concrete}(::T, ::T): this is not the first time people have wanted to be able to specify the type of a type parameter. Consider MyType{N::Int}, for instance; this would also be useful at least for type parameters that are not types.

@eschnett
Contributor

I don't think that the user cares whether a type T is concrete or not. The only case where this would (currently) matter is e.g. in the type of containers, such as f{T::Concrete}(::Vector{T}). I haven't seen people ask for this.

Maybe one should treat matching and compiling as two separate steps. During matching it does not matter whether a type is concrete; as @toivoh says, the most generic matching type signature could be used.

To compile, the types are then specialized. This is a heuristic, and exactly how a routine is specialized does not influence correctness. In principle, the compiler could even specialize on the (inferred) type of local variables, i.e. introduce branches to keep values unboxed.

@JeffBezanson
Member Author

During matching it does not matter whether a type is concrete

Doesn't it, though? We want f{T}(x::T, y::T) to be applicable to (Int,Int) but not to (Int,Int8), so we don't want T==Union(Int,Int8).
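
Concretely, under the diagonal rule:

f{T}(x::T, y::T) = T

f(1, 2)                # Int: T binds to a concrete type
f(1, convert(Int8, 2)) # MethodError: T would have to be Union(Int,Int8)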

@eschnett
Contributor

I stand corrected.

@jiahao
Member

jiahao commented Nov 19, 2014

Ref: discussion about negations in 23b6a1f

@jiahao
Member

jiahao commented Nov 19, 2014

Some thoughts on direct field access (#1974 #2614 #7561):

From a practical perspective, languages differ on how they restrict field access:

  1. Some languages provide explicit language constructs providing privacy, like private and protected in C++ classes.
  2. Others provide no true privacy, but rather have a de facto standard for indicating a preference for privacy. In Python, the naming convention is to start "private" variables with underscores, like _foo.
  3. The Python language also provides name mangling, a mechanism that discourages access to "private" parts by obfuscation, renaming things like __bar to _classname__bar (ref: Python 2 tutorial, Sec. 9.6).

In Julia we have no true privacy, but afaict we don't have explicit naming conventions or mangling mechanisms either.

From a theoretical perspective, the fact that all fields are semantically public has consequences for the type system:

  1. If the fields are allowed to be typed with type parameters, then the need for type safety restricts the allowed subtyping relations. The conventional wisdom is that if the fields can be read from, then the most permissive safe choice is covariance, and if the fields can be written to, the most permissive safe choice is contravariance. For fields that can be both written and read, the only safe choice is neither, i.e. invariance (see the sketch after this list).
  2. Python uses duck typing, where type information doesn't help in defining programs - only the behavior of an object is relevant. Type theorists appear quite fond of structural typing, in which subtyping is determined automatically by internal variables (fields). Structural typing seems like a good idea for databases, but also leads to odd embedding behaviors.
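
A classic illustration of why fields that are both read and written force invariance (hypothetical Box type, 0.3-era syntax):

type Box{T}
    x::T
end

b = Box{Int}(1)
# If covariance were allowed, i.e. Box{Int} <: Box{Real}, this would typecheck:
#   r::Box{Real} = b
#   r.x = 1.5 # would store a Float64 in a field typed Int: unsound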

Julia uses nominal typing, where subtyping relations are explicitly declared. But there are cases where nominal typing is problematic, causing users to resort to duck typing instead. Two examples I have in mind:

  1. Abstract matrixlike objects like QRCompactWY and the discussion in array type hierarchy #987 about whether it ought to be an AbstractArray.
  2. Method ambiguities resulting from loading packages that extend the base library in incompatible ways, as discussed in move ambiguity warnings to run time #6190.

Is it a coincidence that many of the problematic cases encountered involve AbstractArray in some way?

@timholy
Member

timholy commented Apr 30, 2016

When this gets picked up again, one thing worth mentioning: lately, I have been wondering whether a good way to fix many problems with compilation time might be "abstract type inference." The idea would be that when possible, you express the result of type inference on a method as a set of @pure operations on types. That way, in many cases you'd only have to run type inference once and then you look up the answer each time you want to create a specialization.

For example, in running inference on

getindex{T,N}(A::Array{T,N}, i) = arrayref(A, i)

you might like to express the return value as ::T "generically" rather than as whatever specific T was used for A.

In a more complicated case like

function (+){S,T,N}(A::Array{S,N}, B::Array{T,N})
    R = promote_op(+, S, T)
    # check sizes...
    out = Array{R}(size(A))
    for I in eachindex(A)
        out[I] = A[I] + B[I]
    end
    out
end

then you'd probably like R to be annotated ::Type{promote_op(+, S, T)}.

I mention this here because this seems to require the ability to express a type in terms of functions.

EDIT: if this could be pulled off, if nothing else it seems likely to speed the test suite: some of the slowest tests, e.g. subarray and the linalg suite, are slow precisely because they run the same methods with many different input types.

@martinholters
Member

Having read through this thread, I'm still not 100% sure of the desired behavior. I got:

  • f{T}(::T, ::T) should not match (Int,Float64).
  • f{T}(::Array{T}, ::T) should match (Array{Number}, Float64) (with T==Number).

But

  • should f{T}(::Array{T}, ::T, ::T) match (Array{Number}, Int, Float64) (with T==Number)?

@StefanKarpinski
Member

If the second matches, I don't see how the third could not match (although I'm not sold on this behavior).

@timholy
Member

timholy commented May 12, 2016

I definitely think the second shouldn't match. I like the version in #8974 (comment), and it seems Jeff does too.

@martinholters
Member

Ah, yes, sorry, I had missed #8974 (comment), my bad. Makes a lot more sense that way. Carry on.

@datnamer

datnamer commented Jun 24, 2016

So will this improve Union type performance?

@simonbyrne
Contributor

simonbyrne commented Aug 12, 2016

I came across this recently while using a lot of nested types: I'm not sure if it's something that you're planning on handling or not, but it might be a useful test case:

julia> immutable Foo{T} 
       x::T
       end

julia> immutable Bar{T} end

julia> typealias FooBar{T} Foo{Bar{T}}
Foo{Bar{T}}

julia> FooBar{Float64} <: FooBar
false

It's been a minor annoyance, as I have to write f{T}(x::FooBar{T}) = ... instead of f(x::FooBar) = ...

@JeffBezanson
Member Author

Yes, the new subtyping algorithm should fix this.
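
For the record, the expected behavior once the new subtyping lands (a sketch in the post-0.6 where-style syntax, which postdates this thread):

struct Foo{T}; x::T; end
struct Bar{T} end
const FooBar{T} = Foo{Bar{T}}

FooBar{Float64} <: FooBar # true: the alias is a UnionAll over T
f(x::FooBar) = x.x        # matches Foo{Bar{T}} for any T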

@JeffBezanson
Member Author

closed by #18457
