Subtyping #8
They are the only way (other than subclassing) I can see to get variance in the polymorphism, i.e. we can assign new types into the trait object (Rust's name for an "existential type" with a typeclass bound) without breaking invariance. The advantage of trait objects compared to subclasses is that types which can be inserted into a trait object don't have to conform to a pre-existing subclass hierarchy, i.e. it solves some aspect of the Expression Problem¹ but doesn't solve the scenario that unions solve. I am thinking we can view trait objects and unions as complementary in functionality, offering different trade-offs in the multi-dimensional design space.
|
@shelby3 I am thinking along the same lines. As far as I understand it, there is no subtyping. With type classes when we say This applies to trait-bounds (type constraints) and trait-objects. Trait-objects are not really a valid type-system concept; these are more correctly called existential types. In Haskell, if we have an existential data type like this:
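For concreteness, a minimal example of such an existential data type in Haskell (an illustrative sketch; the names `Showable` and `MkShowable` are hypothetical):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- MkShowable packages *some* type a together with its Show dictionary;
-- the concrete a is hidden, i.e. existentially quantified.
data Showable = forall a. Show a => MkShowable a

instance Show Showable where
  show (MkShowable x) = show x

-- A heterogeneous list: each element may be a different concrete type,
-- but all of them satisfy the Show bound.
xs :: [Showable]
xs = [MkShowable (1 :: Int), MkShowable "hi", MkShowable True]

main :: IO ()
main = mapM_ print xs
```

Once a value is wrapped, the only thing we can do with it is what the `Show` bound permits, which is exactly the behaviour of a trait object.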
The
which is where the term 'existential type' comes from. Generally |
Don't we need to be careful about differentiating between subclassing and subtyping? ZenScript will have subtyping because it will offer structural unions. Ceylon has anonymous structural unions, but it doesn't have typeclasses. <truthful hyperbole>Also it has that Jurassic, ossifying, rigor-mortis, anti-pattern subclassing paradigm, which will infect any program written in Ceylon with said virus. Ditto every statically typed language that offers subclassing, including Java and Scala (because programmers won't be disciplined enough not to use the subclassing virus in their design patterns).</truthful hyperbole> 😜 Update:
He replied as quoted above after I wrote the above. Please don't use the word 'subtyping' where you really mean 'subclassing'. @naasking is apparently unaware of what the programmer can't accomplish with unions and intersections in terms of the Expression Problem if the language does not have the delayed binding of implementation to interface that typeclasses provide. |
A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't, for example, parametrize a typeclass trait Thus we must use a data type to express this:

```haskell
data List a = Nil | Cons a (List a)
```

Or add member property names if we don't want to be forced to destructure with pattern matching:

```haskell
data List a = Nil | Cons { head :: a, tail :: (List a) }
```

In ZenScript perhaps:
Or perhaps:
Note afaik, Haskell's But note that even though the above is subtyping, there are no virtual methods on data types, thus no virtual inheritance and no subclassing. Of course any function can take a data type as input, so that is synonymous with a method, except without virtual dispatch.
|
Or, with types
|
As a single type with multiple constructors (tagged union):
I'm still not sure which keywords we should use for `trait`, `impl` and `data`. |
@keean wrote:
That seems incorrect. It doesn't tell me how to construct an instance of
And we can add member property names if we don't want to be forced to employ pattern matching to destructure:
|
@keean wrote:
That seems incorrect. I think it should be instead:
Remember the @shelby3 wrote:
Note I'd prefer to write that:
Or much better:
I think the last is best because it remains very similar to Java, yet we change the meaning of what is being implemented from Thinking about a typeclass as a Q: "What is a pluggable API?" Remember we both decided that clarity trumps brevity (e.g. |
Not quite :-) Some things to discuss. You have given value constructors round brackets; that seems okay to me.
Normally the arguments to `Cons` are positional, like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:
We don't have to stick to that but it's how I was thinking. This has more problems:
So correcting this:
Note this is still Rust syntax that gives special treatment to the first type class parameter, and I am not sure that is best, but let's have a different topic for that when we have agreed this. |
@keean wrote:
Yeah to differentiate them from type constructors, and because value constructors in the Java-like languages use round brackets (aka parenthetical grouping).
You are repeating what I wrote. I even linked to the Haskell record syntax upthread. However the following is not naming the members of
And to stick with the Java-like syntax (and not mixing in Haskell syntax), I would prefer the following which I think will be much more clear to mainstream programmers coming from popular programming languages:
The following is mixing a JavaScript unnamed
|
@keean wrote:
I had a typo and
The type of the type parameter @shelby3 wrote:
Note in the above quoted text, I was referring to a data type @keean wrote:
That is still incorrect. You have a type parameter @shelby3 wrote:
Follow the link in the above quote to see where you had agreed. In fact, you were the one who explained the issue to me. And now it seems you forget what you explained to me. Actually the above is revealing a deeper issue to me about higher-kinds which I had realized when I woke up this morning. I am preparing to write about that. |
The In C, C++ and Rust we would write:
Which we are writing:
Or
I am happy with either, providing the field names are optional, but I wanted to point out that the data statement can be viewed as an extension of struct and object definition. |
@shelby3 wrote
Yes it does; what you wrote can never end in a `Nil`. |
@shelby3 this version is correct in Rust syntax:
Cons needs a second type parameter because The trait When we add |
@shelby3 wrote:
I guess I was wrong: a multi-parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-) |
@keean wrote:
Yup. Your idea was fundamentally flawed. We can't express a generic That is why I had written (before you commented with the erroneous idea) the correct way to define a generic @shelby3 wrote:
@keean wrote:
You weren't wrong the first time. It makes no sense to specialize the |
Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this. |
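For reference, a minimal sketch in the style of the HList paper (illustrative, not the paper's exact code): the list structure is carried in the types, and operations over it are defined by type-class instances on that structure.

```haskell
-- A heterogeneous list as nested pairs; the element types appear
-- in the type of the list itself.
data HNil = HNil
data HCons e l = HCons e l

-- An operation over the list, defined by induction on its structure
-- via type-class instances.
class HLength l where
  hLength :: l -> Int

instance HLength HNil where
  hLength _ = 0

instance HLength l => HLength (HCons e l) where
  hLength (HCons _ t) = 1 + hLength t

example :: HCons Int (HCons String HNil)
example = HCons 3 (HCons "three" HNil)

main :: IO ()
main = print (hLength example)
```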
@keean wrote:
Seems I recall that in the early days of C, it was only possible to use I forget about |
@keean wrote:
I will need to review this, so I can comment meaningfully. Where may I read the most succinct example which shows how I won't have to specialize the typeclass list for every data type I want to put into the list? I presume a lot of HList boilerplate again? Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript. If you are expecting mainstream programmers to use HList, I doubt it. But I need to review the examples before I can comment not just from guessing. The link above is I think probably instructive about this. |
In Haskell this:
We can look at Peano numbers as another example:
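For concreteness, a sketch of type-level Peano numbers encoded with a multi-parameter type class (an illustration, not necessarily the original snippet): addition is expressed as a relation `Add a b c` meaning a + b = c.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}

-- Peano numerals at the type level: Z is zero, S n is successor.
data Z = Z
data S n = S n

-- Type-level addition as a relation; the functional dependency
-- says the first two types determine the third.
class Add a b c | a b -> c where
  add :: a -> b -> c

instance Add Z b b where
  add Z b = b

instance Add a b c => Add (S a) b (S c) where
  add (S a) b = S (add a b)

-- Reflect a type-level numeral back to a runtime Int, for inspection.
class ToInt n where
  toInt :: n -> Int
instance ToInt Z where
  toInt _ = 0
instance ToInt n => ToInt (S n) where
  toInt (S n) = 1 + toInt n

main :: IO ()
main = print (toInt (add (S Z) (S (S Z))))  -- 1 + 2 = 3
```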
Note the main difference between Haskell type classes and Rust traits syntactically is that a Rust trait has a concept of 'self' but Haskell does not. You can liken this to function syntax:
Likewise with type classes, Rust makes the first type parameter special, so the Peano numbers above become:
|
@shelby wrote:
I expect you are taking what should be an orthogonal concept of a generic list and binding it to the data type in the list, and then using some boilerplate scaffolding to simulate genericity? This appears to be the basic theme of HList concepts as far as I can discern thus far (I may be wrong?): to sidestep a weakness in the type system and simulate the type system in code with scaffolding? |
I had started to sniff a problem yesterday. I was starting to realize we probably have an unsolved problem in the design incorporating first-class anonymous structural unions. @keean wrote:
We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell. Thus afaik in the above So afaics, that is not specializing the I presume the same for Rust, but bottom type instead of top. But for ZenScript we are proposing to support heterogeneous lists, so I am now trying to figure out what changes and what the problems and challenges are. I am thinking we will need higher-kinded types, and there may be other problems. I suppose you are saying we can simulate heterogeneous lists with HList concepts, but the point of the first-class union was to eliminate that boilerplate and make everything more extensible, as I had attempted to explain/discuss at the Rust forum:
It is possible you didn't realize how extensively I wanted to use the first-class unions. Perhaps you were thinking we'd be using HList concepts instead? We have a design quagmire now. I am trying to get my mind wrapped around it. I suspect we have a failure in my concept now, but I need to confirm by understanding this design quagmire fully.
|
Even more edits to my prior comment. I am suspecting potential failure of my design concept. 😭 |
@keean wrote:
Haskell doesn't have any inductive types, thus it doesn't have (the type of) Peano numbers. We have to be careful when using Haskell's coinductive call-by-name type system where laziness-by-default makes non-termination a value, as a model for an inductive type system with eager evaluation strategy by default that adds first-class unions. Many aspects appear to change. |
A couple of things. First, I think we should support multi-parameter type classes (with type families, aka associated types) in full. People do not have to use the full power of this, but I don't want a false ceiling on the abstractions we can build.

Haskell has iso-recursive types (not equi-recursive), so it does have a kind of inductive type, thus Peano numbers work in Haskell :-) I can go on to define addition, subtraction, full arithmetic in the type system. Using polymorphic recursion you can even convert a value to a type for faked dependent types, but I don't think we should support this... That is why we are using parametric types, not universally quantified types.

So in our system those Haskell type-level Peano numbers have to be statically determined (effectively known at compile time). They cannot support runtime polymorphism without combining with existential types.

We are discussing different ways to do things; first-class union types give us:
|
Re-summarizing and expounding on what I learned about subtyping from the MLsub paper. |
I thought about all this, and I think I'm getting close to a good system.

When looking at OOP, one of the things that makes it confusing is that a class can be both the implementation of an interface and an encapsulated execution context for its 'methods'. If the concept of 'class' in OOP is replaced by modules, and the function definitions in the modules follow a functional style with immutable input types (optionally an operand) and a result type, there is no need for the 'encapsulation' of OOP's class definition, only the interface definition.

Modules can still define a record type. If runtime polymorphism is achieved via dynamic linking, it means the runtime needs to convert the record type used by a module to the type used by the program, using dictionaries to find an intersection between the two record types and copying the values from one record to another. Having type information in modules at runtime can also be useful for serialization/networking if modules can be used in a distributed environment, and is necessary if runtime polymorphism is done via dynamic module linking.

References to 'dynamic modules' in a program can only be an interface, defined using a typeclass that allows the runtime to check the type of a dynamic module and relink the program with functions that can take an immutable record as input or output a new one as a result, copying the fields in and out of the modules to the types used by the program based on dictionary definitions. If the type of the module being used in the program can be determined statically at compile time, the name of the module can be implicit, like in Haskell typeclasses.
If a module is represented as a variable in a program, loading a new module would involve matching the type of the new module with the type of the interface used by the program, and relinking the program with the address of the right monomorphized function in the new module, which would be the equivalent of runtime polymorphism.

I would still keep the possibility of dynamic typing for things like closures + lambda functions when the types of the variables in the closure are defined close to the function and can't cross module interfaces; the closures would essentially be the only form of global variables with dynamic typing.

Defining modules as interfaces with a functional style allows for composability. They can also be represented as an acyclic graph and used for functional reactive programming / CSP, and also for AI or language processing when the processor needs to build a tree of possible paths to evaluate and select from: it can easily build the path as a graph of module interfaces with more typing information than classic neural networks, and I'm sure it can also be very useful for building parser combinators and language processors. |
@NodixBlockchain wrote:
That wouldn't be a contradiction, though, but I know what you mean...
And what do you do with fields? In my eyes, classes should be structs. Then we have the complementary problem: what do we do with the methods inside a class? Either you define methods in the surrounding module and prepend the input tuple of each method with the formerly implicit class type (this is probably what @keean and @shelby3 are seeking).
Well, modules also encapsulate.
No, modules should be values not types, but it is okay for me to merge structs with records, as they seem to be the same concept.
Sounds like structuralism. When do you do this? If they overlap, or only if they subsume each other?
Yes, what you describe is the use of existentials.
Dynamic typing is using the `Any` type with auto-generated pattern matching when assigning a value of `Any` type to a variable of type <: Any. I also like this in some cases, though I don't know what you mean by "types of variables close to the surrounding function".
If modules are first class, then yes. But what you may mean is that hiding modules between interfaces increases modularity and reduces coupling.
I think all you need here is first-class functions; you can do all that with classes, too. Further, you talk of interfaces, and you are right, because interfaces can really be seen as the types of modules. But I would choose either structs or interfaces, as they are the same concept, unless interfaces also take the role of typeclasses.
Are you talking about runtime reflection (RR)? It isn't needed here, because on the sender's side it is possible to partly serialize modules at compile time (string template) and feed them at runtime with the corresponding values. |
The way I see it, it would not be very far from the way it's done with C++ and virtual function tables. If the linking is done like Visual Studio with an imported function table, it could become very close to the virtual methods of OOP. Type information can be included in the linking information; it doesn't have to be only matching a symbol name with an address. This lack in the standard ABI of ELF or PE is probably what led to requiring the COM layer above the base DLL ABI, to get better dynamic loading with more type information, which also allows cross-language interoperability with transtyping in the runtime to an internal format at interface boundaries.

I'm not sure if it's really type reflection. The way I see it for the moment, the compiler/runtime could have an internal representation of the module, and operations on a module reference would not be just storing a value at an address or simple arithmetic, but the equivalent of setting the virtual function table of an interface. The runtime could have access to type information for the purpose of loading modules like a dynamic loader would, but those would not necessarily be exposed to the programming language, much like the virtual function table of a class in C++.

If all the modules that can be assigned to an interface are known at compile time, the type checking can be done at compile time and the subtyping relationship established at compile time, and the runtime just needs to fill the function table of the interface from the module when it's assigned to it. In that case it could be close to Haskell dictionary passing. But if it can load new modules dynamically, it needs to have type information for the call site and the module; if both are bound to an interface definition, the runtime can find the subtyping relationship between those interface definitions and fill the function pointer table with the module implementation.
I could easily see structural typing for records and nominal typing for interfaces, as function names can give good hints about the semantics of a function. The subtyping relationship would be different for interfaces and records, but if modules can't store global/permanent references to records, and only read arguments and create new records/values, as with functional programming, structural typing would make it easier for the linker to match interface parameters with input from another module, while still keeping name semantics for interface definitions. |
@NodixBlockchain wrote:
Afaics, the interface mechanism is much like (or the same as?) transmitting a vtable alongside a module/object, so I tend to agree here.
You mean the lack in the C++ standard ABI, because you can insert a new section into ELF's section list to encode type information.
Probably, the inspection of the underlying type behind an interface type can be considered as a limited form of reflection.
Yes, dynamic loading, in contrast to dynamic linking, happens at runtime of the program and would require runtime polymorphism/structural matching/runtime reflection to inspect the shape of the included module. I think the most interesting parts of reflection can be covered at compile time; that's the reason why the people of dlang are working on compile-time reflection, e.g. list all classes in the project, get values from strings and so on. Only a small set of tasks would involve runtime reflection, mostly for objects/functions/interfaces/types created at runtime. |
@shelby3,
in the Union and Intersections Issue. Further you state:
Did you mention the following problem with your statements?

```
object module1
  import Trait
  fun()=
    traitObject:Trait= 2:Int
    module2.fun(traitObject)

object module2
  trait Trait
  {
    ...
  }
  fun(traitObject:Trait)= otherFun(traitObject) //error: traitObject doesn't implement Show, but Int does
  fun(finiteExistential:Int|Float)=otherFun(finiteExistential) //works because Int + Float does implement Show
  otherFun(s:S) requires Show[S] = ...
```
Did you further propose that the caller passing a union value should also implicitly pass typeclass instances of all variant types from the caller (module1) to the callee (module2), or should the instances of the callee (module2) be of interest only?

```
object module1
  instance Addable[Int] {(+):T->T->T=...} //module1:Addable
  fun(i:Int)= module2.fun(2)

object module2
  instance Addable[Int] {(+):T->T->T=...} //module2:Addable
  fun(i:Int|Float)= i+i //did we consider module1:Addable or module2:Addable for i:Int here?
```
|
@sighoya I think the implementation mechanism is this: we assign a globally unique type ID (GUID? UUID? GUT? UUT?) to all types. For statically known types this can be erased as usual. For runtime dynamic types we can tag the value with the GUID (instead of a locally unique tag in a sum type). Typeclass dictionaries can be stored in a hash map from GUID tag to set of dictionaries. When a value with dynamic type is imported (into a module scope where the set of required type classes is closed), its tag can be looked up. |
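A sketch of this mechanism in Haskell, using `TypeRep` as a stand-in for the GUID, `Dynamic` for a tagged runtime value, and a hypothetical `Print` dictionary (all names here are illustrative):

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Dynamic (Dynamic, toDyn, fromDynamic, dynTypeRep)
import Data.Typeable (Typeable, TypeRep, typeRep)
import Data.Proxy (Proxy (..))
import qualified Data.Map as Map

-- A runtime dictionary for a hypothetical Print typeclass: given a
-- tagged value whose tag matches, produce a string.
newtype PrintDict = PrintDict { runPrint :: Dynamic -> Maybe String }

-- Register a dictionary entry keyed by the type's unique tag.
entry :: forall a. (Typeable a, Show a) => Proxy a -> (TypeRep, PrintDict)
entry p = (typeRep p, PrintDict (\d -> show <$> (fromDynamic d :: Maybe a)))

dicts :: Map.Map TypeRep PrintDict
dicts = Map.fromList [entry (Proxy :: Proxy Int), entry (Proxy :: Proxy Bool)]

-- Importing a dynamic value: look up its tag, then apply the dictionary.
printDynamic :: Dynamic -> Maybe String
printDynamic d = Map.lookup (dynTypeRep d) dicts >>= \dict -> runPrint dict d

main :: IO ()
main = do
  print (printDynamic (toDyn (42 :: Int)))
  print (printDynamic (toDyn "untagged"))  -- no entry registered for String
```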
can be inferred for:
True, but I would always prefer intersection types over union types and HRT, because of the higher degree of abstraction/freedom. The correct signature is:

```
f[S,T] : ((String -> S) & (Int -> T)) -> (S | T) //for f x= (x "hello", x 2)
```

Note that intersection types represent a specialization of HRTs. And if you forbid function types in general, you shouldn't have any issues with intersection types, I think. |
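Haskell has no intersection types, but a rank-2 encoding of the same example shows the connection (a sketch; here the argument is restricted to the single polymorphic type `forall a. a -> a`, which is stronger than the intersection `(String -> S) & (Int -> T)`):

```haskell
{-# LANGUAGE RankNTypes #-}

-- f applies its argument at two different types, so the argument must
-- be polymorphic (or, in a language with intersections, have type
-- (String -> S) & (Int -> T)).
f :: (forall a. a -> a) -> (String, Int)
f x = (x "hello", x 2)

main :: IO ()
main = print (f id)
```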
@keean wrote:
But what do you do if the dynamic value imported from module1 into module2 has variant data types Int, Float, String, which have different typeclass instances in module1 and in module2 for type classes known in both modules? |
@sighoya Specialisation and scope of instances is a different topic. The type tag identifies the type exactly, so we can recover the type information at runtime; therefore any resolution mechanism we can do statically can also be done dynamically. What I would like to avoid is having the compiler generate code for the runtime. I would rather have an approach where the compiler checks that statements are sufficiently guarded at runtime. So if we default types to 'Unknown', you must test before using any interface. Something like:
Edit: so the question is how could you determine which of two Print interfaces to use? All the type system has to do is to check that there is exactly one implementation of Print in scope inside the 'if' statement. So for example:
so we can use any value based runtime logic to bring only one of the possible typeclasses into implicit scope. |
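A sketch of this guarded-use idea in Haskell, with `Dynamic` standing in for an 'Unknown'-typed value (illustrative only; the surface syntax under discussion would differ):

```haskell
import Data.Dynamic (Dynamic, toDyn, fromDynamic)

-- Before using any interface on an Unknown value, we must test its
-- type tag; inside the matching branch the type (and hence a unique
-- instance) is known.
describe :: Dynamic -> String
describe d = case fromDynamic d :: Maybe Int of
  Just n  -> "an Int: " ++ show n  -- Int's Show instance is in scope here
  Nothing -> "unknown"

main :: IO ()
main = do
  putStrLn (describe (toDyn (5 :: Int)))
  putStrLn (describe (toDyn True))
```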
@sighoya So perhaps to formalise the above: one possibility that would have the desired semantics would be Agda-like implicit variables, where functions can be defined to take additional implicit variables. To allow intermediate functions to pass through typeclasses without modification we would need row-polymorphic implicits. We would also need the ability to filter the implicit environment to reduce the implementations in scope. So if you take Agda implicit variables and combine them with row polymorphism, I think you end up with Algebraic Effects. With effect inference, you would not have to write the effects into the type signature, except where you want to make these things explicit, like in module interfaces. |
True, but I had related it to union types implying the use of subtyping. Maybe another thread is more appropriate.
Hmm, this seems to utilize named instances to disambiguate issues of global coherence. |
@sighoya I think the important point to remember is that typeclasses have dynamic scope, so the instance in scope can be determined anywhere further up the call-stack. This works fine except where the type of a variable is 'Unknown', in which case we must use the type tag or value of the variable to determine the instance to use. Consider the two methods of selection I mentioned above: if we guard on a type like 'Int', then we would use the same instance we would use if the type were statically determined to be 'Int'. If we guard on a typeclass like 'implements Print', then we still require that whatever type is dynamically passed resolves to a single instance. We could summarise this requirement as: we require 'global coherence' of typeclasses in every expression, but we do not require the same resolution in every expression. In Haskell we require global coherence globally, that is, wherever you are in a program the typeclass for a given type must resolve to exactly one instance, and it must be the same instance everywhere. What I am suggesting here is global coherence locally, i.e. there must be exactly one resolution at every point in the program, but the resolution can be different at every point in the program. |
@keean wrote:
So you prefer dynamic resolution for trait objects and all the types which have instances associated with the typeclass of the trait object.
There is a problem with this view:

```
object module1
  instance Trait for Int
  fun()=module2.fun(2:Int)
  gun()=module2.gun(2:Int)

object module2
  instance Trait for Int
  show[S](s:S) requires Show[S] = ...
  fun(t:Trait)=if(t `implements` Show) show(t) else ... //what impl is chosen for Show? The one of module1 or module2?
  gun(t:Trait)=if(t `implements` Show) (i=t:Int;show(i)) else ... //different to fun above?
```

If you take the instance from In my eyes, it is better that only trait bounds and trait objects have dynamic resolution, implying dynamic resolution in This is the reason I'm against dynamic instance resolution for the variant data types in the union type, because the variant data types are visible. |
@sighoya Hmm, I struggle with examples in Scala, because it's a mess and I don't want to pollute my brain :-) Instead let's think about simple runtime semantics. We pass dictionaries into functions (as implicit arguments), which is an implementation of typeclasses. This has 'dynamic scope' because the caller can choose which dictionaries to pass in. We can see this working for all static cases, because if we statically know the types, we know which dictionaries to pass in. For any function we know the complete list of functions it calls, which means we can always implicitly know the set of required functions for any call. We need this to be inferred, so library functions can pass through dictionaries to callbacks etc.

So given a heterogeneous collection, we can calculate at link time the set of dictionaries required by any object put into the collection. This solves one degree of the expression problem, because we can add any datatype providing it implements the correct set of typeclasses. Alternatively we can know at link time the set of types that are put into the heterogeneous collection, in which case we can call any function on the collection that is defined for all types in the collection. This allows us to solve the other side of the expression problem, because we can implement a new function on the collection by providing overloads for all the types in the collection typelist.

Both of these methods work statically, and we can imagine combining them: to solve the expression problem we need to statically know all the types inserted into the list, and all the functions called on the list, which is of course possible to do statically by type inference and typeclass inference. So this is great, the expression problem is solved, except that we must statically know the whole program. Of course extending this to modular compilation is easy, as we just need to know the list of types that are returned by every function, and the typeclass requirements for every argument.

So providing our interface definitions carry sufficient information, there is no problem with modular compilation. The only remaining problem is with runtime module loading. Again we can solve this by implementing the correct checks on dynamic loading. If we are loading a module that applies functions to the values in a heterogeneous collection, then the dynamic loader needs to check that the function's typeclasses are implemented for all possible types that can be in the collection (remember we can determine this statically at link time). If the module inserts values into the collection, then the dynamic loader needs to check that all typeclasses that are called on collection values are implemented for the types inserted.

To think about this another way, the problems with typeclasses come from the open-world assumption. We can avoid this by moving to a closed-world assumption, but the problem is this limits the expression of extensibility. We can recover expressiveness by deferring closing the world until link time, improving our module interface definitions with the extra information outlined above, but this precludes dynamic loading. We can recover dynamic extensibility by having a closed world with extensions: the world is closed before the dynamic module load and it is closed after. This makes it the responsibility of the dynamic loader to prove the world is closed, by checking the interface of the loaded module against the interface/signature type it is being loaded into. |
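The dictionary-passing model described above can be sketched directly in Haskell by writing the dictionaries by hand (illustrative names; this is the standard desugaring of typeclasses, not ZenScript's design):

```haskell
-- A typeclass desugars to a record of functions (the dictionary) ...
data ShowDict a = ShowDict { showD :: a -> String }

-- ... and a constrained function desugars to one taking the dictionary
-- as an extra (implicit) argument. The caller chooses which dictionary
-- to pass, which is the 'dynamic scope' described above.
display :: ShowDict a -> a -> String
display d x = "value: " ++ showD d x

intShow :: ShowDict Int
intShow = ShowDict show

main :: IO ()
main = putStrLn (display intShow (42 :: Int))
```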
@sighoya Okay, I think I understand your example now, and the answer is, we cannot tell which module will be used, because:
What we can say is that we must pass |
@keean wrote:
I forgot that `show` was part of the `Show` interface.
Does it hold for both module2.fun and module2.gun with variant data type integer? |
To put it more clearly: |
This doesn't make sense to me. The dictionary for
The typeclass constraint (trait bound) |
This is the |
Well that's optimised, and not the "real" type:
This is valid, and |
This is a typical question of whether lexical entities have priority over/under dynamic entities. To illustrate the problem again:

```
foo::Int->Int->Int
foo x y = x+y
```

is in your language equivalent to:

```
foo :: (R requires Integer[R],Num[R]) -> (S requires Integer[S],Num[S]) -> (T requires Integer[T],Num[T])
foo x y = x+y
```

So what about this?:

```
foo2::Int->Void
foo2 a= (b::Int=a; fun(b))
```

Does it become:

```
foo2::(T requires Integer[T])->Void
foo2 a = b::(T requires Integer[T])=a; fun(b)
```

or

```
foo2::(T requires Integer[T])->Void
foo2 a = b::Int=a; fun(b)
```
|
@sighoya
This is true even for:
|
@sighoya Thinking about unifying typeclasses and algebraic effects, we might prefer to write:
The reason for this is that effects can be seen as a row-polymorphic monad, so like we write In this way we can also keep the language pure, and encapsulate all state in algebraic effects. https://overreacted.io/algebraic-effects-for-the-rest-of-us/ What is interesting about this article is that it shows how algebraic effects can remove the 'function colour' from sync/async functions in exactly the same way it can from pure/impure functions. |
Union types express a subtyping relationship, but I am unclear as to whether typeclasses (i.e. Rust's traits) do?
If a trait `B` extends another trait `A`, and `B` reuses the implementations of `A`, can we assign a trait object that has a bound `B` to a trait object that has a bound `A`?
Seems the answer based on prior discussion is yes. But that is a subtyping relationship, which means we would need to deal with covariance on type parameters both when they are trait objects and when they are unions. Correct?
Prior discussion:
#6 (comment)
#1 (comment)
#2 (comment)
#1 (comment)