-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Language Discussion #1
Comments
I'd like to chime in that Elements of Programming is one of my favorite tech books, it's got a very prominent space on my book shelf, next to Luca Cardelli's A Theory of Objects and Bird and Wadler's Introduction to Functional Programming (1st Edition). If a language could express the ideas in those three books combined, it might very well be my favorite language. What will your objects look like? Have you studied Self? I highly recommend giving this short paper a read: What will the solution for this problem look like in your language? (I think it's a great litmus test for languages) It shows why just type classes has some problems solved by delegation with differential inheritance (a more powerful superset of subclassing). |
@SimonMeskens I'm not familiar with "A Theory of Objects" but a while back I contributed to this paper https://arxiv.org/pdf/cs/0509027.pdf which shows how object types fit into a formal type system. With objects there is a problem with dispatch in that "a.f(b, c, d)" gives special treatment to the receiving object 'a', and only does dispatch on this. With type-classes we start with a language that has no overloading. If we want to allow overloading we use a type class, and a type class can dispatch on multiple type parameters. I will take a look at the examples in that paper, thanks. |
@SimonMeskens I have implemented an example from the Shark paper in Rust, as it is a similar language (imperative and has traits):
I don't intend to use the same syntax as Rust, but it gives an idea of the structure in a language with type-classes. One key difference with JS as the target is that we have garbage collection, so we don't need to have any lifetimes, which simplifies things (although Rust correctly infers all the lifetimes in this example without any annotation), and also because of this we don't need affine types, and JS always passes by value (objects are values which are references to the object). |
First design questions are how much type inference should there be, and what are the consequences for syntax. Considering the simplest polymorphic function, in a Rust-like style:
Or Haskell style:
The Haskell style has the advantage of working with type inference, and the Rust style has the advantage of automatically introducing scoped type variables. Obviously there will need to be type signatures in type-class definitions, and records / module definitions. |
I think that if the goal of the language is to provide a very simple, but very expressive language, you should consider going for a very simple, but expressive syntax. In that case, I think I like the Haskell syntax more than the Rust syntax. I've never been a fan of how dense the Haskell syntax gets though, so I'd aim to sit somewhere in between. I think Haskell syntax, with words instead of symbols, would work best. If you want, I could give some samples of syntax that might be fun to play around with. I think I'm going to get out Elements of Programming and look up a few good examples and try to create a syntax that is made explicitly to express those problems. Thanks for providing the example in Rust. I think there's a possibility to introduce some concepts to make it more powerful and concise. I assume you'd prefer me to open an issue with a proposal of any specific features? Btw, in the paper I linked, multiple dispatch is more or less equal to type classes, so, as the paper shows, introducing a concept of delegation improves upon the code semantics. I've been playing around with the idea of delegation in Haskell and I think I know where to go with it. I'll move that talk to the other issue then. |
@keean the type signatures of all exported interfaces must be explicitly declared and not inferred, else modularity (separate compilation) is impossible which is critically needed if targeting JavaScript and its dynamic runtime module importing. This also makes the interaction of certain higher-order first-class type features decidable when moving away from the origin on the Lambda Cube, as well enables to implement typeclasses in more than one way (which Haskell can't do due to its global inference). It also enables decidability with first-class anonymous (structural) unions (disjunctions), which is a mandatory feature of the language else I won't contribute. TypeScript, N4JS, and Ceylon (which emits JavaScript) have this feature. Scala 3 (Dotty) also has it (and there is a Scala.JS compiler for Scala 2). Ceylon also doesn't support typeclasses, but Scala 2 (and presumably Scala 3) does. Inferred structural unions are useful for many paradigms and is necessary to support the JavaScript Declaration of type signatures within functions (also methods) should be optional and inferred if not specified. Declaration of non-exported function type signatures could perhaps also be optional and inferred. |
@keean to add to your opening comment of this issue thread, TypeScript is intentionally unsound typing and PureScript lacks first-class anonymous structural unions. N4JS appears to be sound typing, but also lacks typeclasses. |
@SimonMeskens I am interested to see what the code looks like with delegates. I provided the sample in Rust because I can syntax check and run the example, there is always the problem of unchecked errors when posting hypothetical code samples. |
@shelby3 I agree all exported interfaces must have type signatures. For higher-order features I quite like first-class-polymorphism. I think a reasonable assumption is that everything inferred is monomorphic. If non-exported function signatures can be inferred, then a syntax with optional type annotations would seem better. |
@shelby3 regarding first class unions, I think one of the key features is supporting the kind of programming you want to do. I do want to suggest an idea though, and see what the consequences are. The idea is to have no concrete types at all, so all types are expressed by type variables constrained by type classes. Consider the identity function using a cross between Haskell and TypeScript notation:
'A' can be any type, so we could constrain it to be a type-class using a Prolog like notation:
This is already the union of all types that implement Integer, but we can go further and have a type-level or operator:
This has a nice synergy with JavaScript and duck-typing as we are saying that we don't care what type is passed, as long as it provides the correct interface. We can type check this nominally in TraitScript, but it reflects the underlying JavaScript that allows any object to be passed as long as it provides the right methods and properties. The question is, does that give all the functionality that you get with union-types? |
@shelby3 I find myself re-thinking in the morning and finding I don't agree with what I posted yesterday. An or operator for traits does not make any sense, as it is saying the object may have either method A or method B. What we want to know inside a function is if we can safely call method A for example. I don't think there is any need for an 'or' operator, or first class unions outside of dynamic types (aka trait-objects). I think I need to revisit the original expression problem with type-classes. Starting with a collection of trait-objects (existential types):
Requires all objects inserted into 'c' to implement the Integer trait. Now lets say we want to print this as well, we somehow need to inject the 'Show(a)' trait into 'c'. This is actually an 'and' operation, but it occurs in an 'open' way, somewhere else in the source code, something like:
This requires anything calling 'show' has to be able to prove all elements in collection 'c' provide show. This to me suggests that what is needed is some kind of constraint inference. We can easily infer the set of constraints on each element of 'c' as being (Show a, Negate a) or something like that. The question is whether 'c' is heterogeneous or homogeneous, there are two possible type signatures:
In the former case 'show' and 'negate' would need to each have the constraints on '[a]' in their type signatures directly (although they can be inferred). One option would be to allow:
To automatically accumulate the inferred type classes. But what to do on module boundaries? |
@keean I explained to N4JS's @yoshiba why Joose and Object Algebras do not solve Wadler's Expression Problem and are not as orthogonally extensible as typeclasses. |
@keean wrote:
Also remember we discussed in May, that typeclasses (not virtualized typeclasses, i.e. Rust's trait objects) can be monomorphic at the function application use-site because the data type is not lost (as in subclassing where it is assigned to a supertype), thus avoiding the virtual method jump table lookup which otherwise stalls the CPU pipeline reducing performance by as much as to 5 - 6 times slower. |
@shelby3 I have created a second issue here #2 to discuss union types. I am fully aware of monomorphisation, which is when the type is statically determined. There is also runtime polymorphism (which it what haskell uses existential types for, and Rust uses Trait objects. I prefer the existential approach for the type system, as it is actually sound as a logic). |
Also note the mult-method dispatch version of the Sharks code makes some other changes, introducing new types for HealthyShark and DyingShark, I will update the Rust code to reflect this way of implementing it. |
@keean wrote:
ASM.js passes by value unboxed. For functions operating only on numbers, afaics we could support this seamlessly. Emulating other types (e.g. strings) in ArrayBuffers would require more complexity to compile and interopt, and might require more C-like syntax. We could keep in mind these sort of optimizations and whether we want and could reasonably provide this without forcing the programmer to resort to a FFI. No need to discuss this ASM.js tangent right now. |
I think you can write delegation as a monadic structure, which might be the real ticket. I also look forward to your new implementation. I think it's a cool example as it covers such a broad range of language capabilities. |
@SimonMeskens wrote:
The huge salient difference is that multi-methods is dynamically dispatching to the implementation of the subclasses of Wheres, the Rust type-class example shows that type-classes can be resolved monomorphically at compile-time, so there is no jump table lookup. Edit: another huge difference is that type-classes don't bind the multiple interfaces in the class inheritance and instead keep interfaces and classes orthogonal.
@keean wrote:
Changing the interface based on the state of data type, conflates the instance of the data type with the interface. This makes orthogonal extension with for example collections of instances inefficient and boilerplate prone. I don't think this is a paradigm that should be used. P.S. if we do our job exceedingly well (as in demonstrate that type-classes and sound typing kick ass on subclassing), there is a remote possibility we could be designing the future replacement for EMCAScript. So performance should be high on our list of priorities, because performance is very difficult to achieve in a dynamically typed language and virtual inheritance of subclassing messes up performance. Google is looking for inspiration for their SoundScript concept of a soundly typed JS-like language in the browser. The guy leading that effort (Andreas Rossberg) is familiar with Scala, ML, etc.. Andreas commented that N4JS is "very interesting". But I don't know if fast and incremental compilation will be achieved in the language we are contemplating. We can think about this in our design decisions if we want to. |
SoundScript is described here: http://www.2ality.com/2015/02/soundscript.html which sounds like JavaScript plus enforced classical objects and inheritance, and simple function typing. There's nothing about generics or code generation. |
SoundScript's goal is to actually be somewhat compatible with TypeScript, I assume Google intends to make it so we don't have to compile TypeScript for Chrome anymore. |
@keean wrote:
Until they figure out that virtual inheritance stalls the CPU pipeline and type-classes don't. I think we have a very tiny (almost nil) chance to influence the future. Any way, we should design for performance assuming one day our language will have its own VM and not just compiled to JS, if that doesn't incur any huge trade-offs in our design constraints. Also we can think about optimizing some parts with ASM.js. I am not sure how this thought process will play out. I am trying to keep my mind open and see where the design constraints lead us. I agree the priority is on having a simple, clean language without proliferation of duplicated paradigms. Edit: battery life (i.e. performance) on mobile is important. I could see our language potentially becoming very popular for mobile apps.
In the link I provided, they mentioned TypeScript as a possible starting point, but aiming to make it sound. I think they will (or maybe already have) discovered that retaining interoperability with TypeScript code while also making TypeScript sound is probably not practical. Of course I could be wrong about this. |
From now on, in all discussion where I mean to use type-classes, I will use the word @keean wrote:
So I assume by "first-class-polymorphism", you mean we can declare a (function, method, or class) type parameter constrain to a trait (or union of traits) bound, and then I can call functions and methods without knowing the actual type of the data which implements the trait(s)? I presume by "first-class-polymorphism", you don't mean higher-ranked types.
Unpacking this, I presume you mean that inference will never subsume to a subclassing subtype (with virtual inheritance) nor to a virtualized trait object (which also incurs dynamic dispatch and stalls the CPU pipeline)? And this is why we need anonymous structural union types, e.g.:
The inferred type is I am presuming we will choose an "everything is an expression" grammar. Note that the following expression (not the return type of the containing function) has a type of
Agreed. Thus we need the suffixed |
@SimonMeskens wrote:
I think I'd prefer to stay as close to JavaScript as possible in expression syntax while everything becomes an expression (optionally a statement as well). The point is to produce a language that is very easy for everyone to learn. However, I hate noisy code (unless necessary to provide reasonable cognitive load), so I am open to adopting some function level grammar ideas from Haskell, such as:
I prefer consistency of coding style because we are in the open source reality now, where code should be readable the same to everyone. So I would prefer to enforce a chosen style. I realize we might not agree on all of these. Let's discuss. But syntax may not be the most important discussion to have now and first. |
I agree with all of those list items 👍 |
@SimonMeskens cool we finally agree on something 😍 |
Yeah, sorry if I acted like an ass on the TypeScript place, not sure why I get defensive there, I'm a pretty nice guy usually. I also don't have personal beef with you, life is far too short for that 😄 |
I agree with most of that, with some restrictions.
Couple of neat examples. Pattern matching in arguments:
Guards on arguments:
|
@keean I was re-reading your idea to unify the syntax for Realizing that typeclasses are just an interface that is either monomorphized or an implicitly passed object instance of the interface, caused me to see that there is nothing that special about typeclasses nor an instance of a class object which implements an interface. Sum types are a bit weird for an interface or typeclass though. It means the type has to be tagged and the code that accesses the interface has to pattern match on each of the sum cases. But at least it unifies and perhaps there is even some utility to sum type interfaces. I don't see how they unify with Thus I don't like I like And for the typeclass instances of |
It's not so much a sum type, as a record type. A record is like a JavaScript object, it's a collection of labelled properties. You can also think of a record as a "labelled product type". You are correct, a type-class is really just a mechanism for resolving interfaces implicitly. Optional implicit parameters (which Scala has) provide a way of unifying runtime interfaces (passed by value) and monomorphisable interfaces (passed implicitly by type). If we allow polymorphic-recursion then type-classes can be non-monomorphisable, and we have to pass the correct instance at runtime, depending on the type at runtime - this is something I would rather not do, and would suggest parsing as an argument instead. Regarding modules, they are no different than records/objects, except the can in some languages extend them with additional features. Our records would already be parametrically polymorphic, which allows them to be instantiated with type parameters. If our language supports no other module features, then modules and records are already unified. You can see this in JavaScript where function closures returning an object give you simple modules. In this case as module implementations can be passed as first class values and we avoid the need for a separate module type system like ML has (and we don't want subclassing of modules). The only other module features we probably want are associated types (or type families), but we want type-classes to have these too, so you could say a record is a module with no associated types. There is one thing to be careful of here, and that is passing modules by value can lead to typed depending on values (dependent types). I think it is a bad idea to have a little bit of type dependency, other languages explore dependent types, but here they add unexpected complexity. However I think we can use union types here, to avoid dependent types and also avoid weird restrictions on passing modules that will confuse people. If we consider a module with an associated type, say some container that contains only Ints or a different container that contains only Strings, when we pass a container like this and access the values they will be of type (Int /\ String). This is different from the case with parametric polymorphism because the type in the parametric case is in the signature and we can monomorphise the called function into two versions, one which handles strings and one which handles ints, and know which one to call at compile time. So why not use parametric polymorphism everywhere? Well type parameters are 'input' types and therefore need to be specified or inferred. Associated types are 'output' type parameters, and simply having a specific value of a module determines those types. Further if the output types of a module are functionally dependent on the input types, we can monomorphise the output types wherever we can monomorphise the input types. |
I know what a record type is, you don't need to waste the verbiage (perhaps you remember me writing numerous times in our discussions about record types). You didn't understand my point. I meant what I wrote. I meant a sum type. I presume you remember what a sum type is. If we unify |
I like that summary (in the context of what I had written). We should make sure we frame that and use it in our documentation. |
I am not grokking this (I don't know the meaning of the term polymorphic-recursion, although I can guess), especially don't know what you mean in this context by "parsing as an argument" so I have insufficient foundation from which I can easily deduce the meaning of the rest. I do remember we had discussed higher-ranked typing and the issues of functions which require typeclasses being the argument of another function. Is that what you are referring to? I remember you wanted to force us to create a new record instance for each case and have to do boilerplate to integrate two instances which are otherwise equivalent other than their differing record name. I remember that was a complexity aspect that started to make me want to not attempt to add typeclasses at the first iteration of the language. Too much complexity needs to be carefully considered first. |
You are saying the local data and/or function(s) within a function, can be returned wrapped in the publicly inaccessibly closure of the returned public function, which serves as the means for creating private access restriction on the said local data and/or function(s). The said returned function is a constructor of an instance of the module. But afaics by not mentioning a key distinction, you are implicitly conflating passing around type signatures with passing around values, i.e. instances of the data and/or function(s) in those type signatures. I think what you mean is that the public interface of the module is declared outside said constructor function. And the returned object of the constructor function conforms to the public interface type signature of the module. But then what enforces that only that constructor function is allowed to construct instances?
You are saying we can pass around constructor functions as first-class values, so the access restrictions of the module system is a unified feature of first-class functions. But that seems to be incomplete as stated above. |
I'm not grasping what the huge distinction or benefit of associated types are? I can conceptually (not actually in Rust) write this example as follows employing type parameters of the type constructor instead of associated type members: // Without using associated types
fn difference<C>(container: &C) -> i32 where
C: Contains<_, _> { ... } And this one: fn distance<N, G: Graph<N, _>>(graph: &G, start: &N, end: &N) -> u32 { ... } And remember my proposal is except where required for preventing ambiguity, those will be written as follows because all upper-case would be reserved for type parameters (a proposal which it seemed you were not entirely enthused or perhaps begrudgingly accepted): fn difference(container: &C) -> i32 where C: Contains<_, _> { ... } fn distance(graph: &G, start: &N, end: &N) -> u32 where G: Graph<N, _> { ... } One claimed benefit is not having to repeat yourself and that the use-site provides local reasoning (instead of positional w.r.t. to the definition site) of the name of the type variable:
instead of:
Whether we express existentials (i.e. the type exists but “I don’t care” what the type is) with Note there are some issues of decidability of type safety that I don't completely grok yet: https://wiki.haskell.org/GHC/Type_families#Decidability Note regarding the EPFL paper, afaik the type-safety was somehow proved for a later iteration of the DOT calculus. |
Scala Tour says:
Do you know of an example? The only justification I've found is to avoid an exponential blowup of Apparently the converse examples are variance annotations (on type parameters for subclassing) and path-dependent type-checking soundness/decidability, but our language won't have those features any way. Also this as converse example:
|
Paul Graham wrote:
Strings are unboxed, store-by-value Might be useful to unify the interface of The following is the point I've been trying to make to you:
|
I had some conceptual lapses above. We are defining the signature of an optionally parametric polymorphic type which defines a sum and/or product type constructor. The type parameters the operands of the sum and/or product type may or may not be bound to the parent type. We can instantiate that constructor supplying some or all of the operands and/or type parameters. Our instance can also define another said constructor. Repeat recursively. Instantiations contain data but they are not an object that can be accessed until all operands and bound type parameters have been provided. A typeclass is any such parent type which has bound type parameters. A typeclass instance implements a typeclass where all operands and bound type parameters have been provided. Thus a typeclass instance is really an object that has an interface with bound type parameters. Thus we are describing a mixin type and Scala's linearization isn't necessary because there is no subclassing relationship, only composition. Our only subtyping relationship is the subset relationship created by conjunctions and disjunctions. Thus I propose Note I don't think we should have a block indented shorthand syntax for sum types. For those, I think we should declare a base type, then implement that base type for each of the types in the sum. This makes the syntax much easier to express with the indenting instead of brace-delimited.
|
Another thing to consider is that interfaces are types, and their implementations are values of that type. Remember an object can have many interfaces, so we need a keyword to indicate that an object implements a trait:
|
That is what I wrote:
Not only objects, even interfaces can implement other interfaces. I propose the keyword is
But when the typeclass instances are implicitly provided, we don't need the So we could consider instead without an
I remember you had I think proposed
|
That syntax is only necessary when we need block indenting. Otherwise we write that as a constructor call, e.g. Thus I think I prefer for the block indenting to use |
This is not right, a trait is a type, it never implements anything. It's more like it extends the type. Also objects are independent if implementations, so you would declare the object and then declare the traits it implements. These are not interfaces bound to the object like Java interfaces. So i would write the above:
Note: because X is a trait, not a type we may want to use different brackets for the parameters to make it clear it is not a type. |
I think it is important to remember that we want to keep implementation of interface (i.e. typeclass) orthogonal to instantiation of data, otherwise we end up with the early binding of subclassing. For example, in subclassing both a rectangle and a square would be subclasses of an interface which provides width and height. Yet a square only has one dimension, not two. Any interfaces which need to operate on the dimensions of rectange and square need to be customized for each, which is what typeclasses do. We will still have subtyping such as when an interface extends another interface then we can subsume to the supertype (which is why we need read-only references and for supersuming then write-only references). Yet references will not have the type of an interface if we don't provide typeclass objects, so then the only case of subtyping will be the subset relationships created by conjunctions and disjunctions of data types. However, data types could have partial implementation, e.g. if one of a plurality of type parameters is specified and perhaps some instance data hard-coded for that case. Also typeclasses could be partially implemented (aka mixins). |
I'm not sure about this. Type-classes are not types, so no subtyping is involved (they are constraints on types). If we allow typeclasses to be passed as a first class value, then we do have a form of subtyping, but is can be expressed using row polymorphism. In effect we use some functions from an interface inside a function the argument passed to the function would be expressed as:
You would not often type the above signature because it can be inferred, but what is says is that an interface is a record which at least includes Implementations of an interface always have to be complete (things must comply with the specification if they say that they do), but users do not have to use all of an interface, so you can pass extended interfaces in place of unextended ones. |
I prefer not to get a discussion about the merits and drawbacks of row polymorphism at this time (not a priority issue for me right now). We did mention/discuss row polymorphism in the past here, here, here, here, here, here, here, here, and here (this last link contains a post from @keean that links to a research paper claiming improvements for Ocaml's polymorphic variants). The answers on the following are probably relevant: http://stackoverflow.com/questions/15237598/why-doesnt-ocaml-support-record-subtyping More: Quildreen Motta wrote:
|
I wrote on Quora:
Greg Kemnitz wrote:
|
I think it was in this thread where we might have originally talking about the importance of static typing. There are numerous comments on Quora from programmers who think unit tests are counter-productive (thus implicitly agreeing that static typing is necessary for large scale development). Some even argue that Test Driven Development (TDD) is counter-productive… Ryan Cook wrote:
Tomas Novella wrote:
Tamer Aly wrote:
Greg Kemnitz wrote:
Hovik Melikyan wrote:
Imtiaz Mohammad wrote:
Vivek Nagarajan wrote:
Nikhil Nair wrote:
|
This is an important neat idea to make diff algorithms aware of higher-level syntax: https://begriffs.com/posts/2014-04-08-pilgrimage-report-structural-merging.html I think I had also suggested this idea in the past. |
The idea is to create a simple, compact, symmetrical, typed language for generic programming that compiles to JavaScript. The initial reference is "Elements of Programming" by Alexander Stepanov and Paul McJones.
The core concept is to have parametric types, type-classes and type-families, along with maybe higher-ranked-types, higher-kinded-types, to enable type-directed generic programming.
Inspiration is from C, C++ Concepts (type safe templates), Haskell, Rust, Eff, and others.
This is different from TypeScript, because it is based on type-classes, not classical inheritance, different from PureScript because it allows imperitivity and state.
This issue is for the initial design of the language, with other issues being forked as necessary for specific topics like syntax, semantics, type system etc.
The text was updated successfully, but these errors were encountered: