-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Orthogonal concepts #14
Comments
Yes, I think that is correct. We basically split objects into three orthogonal concepts, data are effectively object properties, typeclass are inferfaces, an modules control data hiding (privite data etc). Note: Elsewhere I proposed that we can combine the three into a single syntax and generic concept, but lets leave that to one side for now. |
Did you see my edits pointing out that 'module' has two meanings? Should we use another keyword than |
Extension and functors are something ML supports on modules, at the moment I don't think we should. I like the ability to have multiple modules in one file, but effectively modules are able to be separately compiled and linked later. In the simplest form they are like namespaces where you cannot access their private data from outside. A module has to export things to be visible outside, and import the external functions and data in order for them to be visible inside. |
Note I added the idea for a monadic effect system. Well I guess it isn't orthogonal in some sense as its implementation relies I suppose on the other 2 or 3 concepts, but the systemic model for handling of impurity is an orthogonal concept. |
@shelby3 wrote:
I want to raise this conceptualization of solving the Expression Problem to a matrix, so that we can how this plays out on more levels. BackgroundDynamic polymorphism is instituted whenever we bind the interface to an object and refer to the type of that object as the interface; Because whether it be at data type object instantiation for subclassing or typeclass object instantiation (aka Rust trait objects), we then must incur dynamic dispatch because when holding a reference of said interface type at compile-time, there is no static (compile-time) knowledge of the types of data that can be referenced with the said interface type (rather the runtime dynamic dispatch handles the dynamic polymorphism). In subclassing the interface is a supertype and in typeclass objects, the interface is the typeclass bound (constraint) on data types that can be assigned to a reference with said type. So the lesson that once we bind an interface at compile-time before the function call site, then we forsake static polymorphism and the opportunity for mono-morphisizing the function. More importantly, we lose the axis of the Expression Problem to add new interfaces (operations) to the data types referenced via an interface type. We can add new data types but can't add new operations. Subclassing binds the interface very early at data type instantiation. Typeclass objects bind later at typeclass object instantiation. And typeclass bounds bind even later at the function call site. So ideally, we wanted to always delay to typeclass bounds, but this conflicts with polymorphism of adding data types orthogonally when dealing with heterogeneous union types. Thus I pointed out that by retaining the union of data types instead of prematurely subsuming them to the typeclass bound of a typeclass object, then we could delay the binding to the function call site. Thus we can continue to add new orthogonal operations to the union type, unlike for typeclass objects. The trade-off is that adding a new type to a union type requires invariant collections (such as a Matrix ConceptualizationSo what we can see is there is a tension with trade-offs between the competing choices of carrying around the specific data types and subsuming to the interface. So both are needed for different situations. Recently @skaller pointed out another axis of this tension. That is in the case where don't want to discard the data types because we want to leave them open to new operations, but we want to dictate that a specific implementation is used for a specific typeclass where ever it might be needed. Apparently OCaml's functors accomplish this by being first-class and one can pass these along with a module and they will morph the module with the specific constraints desired, while leaving the rest of the module unspecialized, and thus through the chaining of functors, one can get the same effective result as early binding for one typeclass but remaining open to extension for other typeclasses. So really what we need is to being able to attaching a typeclass object to a data type instance and pass them around as a pair every where a data type is expected. So in other words, dynamic patching of the dictionary for the dynamic dispatch, wherein if that data type is then applied to a trait object at a call site or trait object assignment, then the prior binding is not overwritten and any additional dynamic dispatch bindings are augmented. So then we no longer view data types and typeclass objects separately. A data type always has a dynamic dictionary attached to it, yet it is augmented at runtime, not with compile-time knowledge. The compiler can only guarantee the required typeclass operations will be available at runtime, but it can't guarantee which implementation was selected, because at runtime when it writes the implementation to the dynamic dictionary of the typeclass object, it can't know if a preexisting one will already exist. So we have lifted dynamic polymorphism to a 2nd order. We have dynamism of dynamic dispatch. The matrix is we have two axes of data type and interface, and a third axis of implementations that cojoin the two. This third axis can't be tracked at compile-time without dependent typing, but the compiler only needs to insure that the required interfaces are present in order to insure soundness. So this works. What this enables is for example @skaller's example of wanting to set the ordering of comparison operations, while leaving the data type open to extension with other typeclasses. @keean's proposed solution also achieves it, but it lacks atomicity of retaining this with the data type. For example, @keean's solution is not generic in type of algorithm that can be applied ex post facto and thus it fails another axis of the tension of solving the Expression Problem. What we will see generally is that higher order dynamism is required to solve higher order tensions in the Expression Problem. |
This is not quite right. We can still mono-morphise even with compile-time interface binding - in fact for the traditional requirement this is a requirement. The point is we can statically know both the interface and the exact type passed, and therefore chose and inline the exact function. Rust does this in all cases exept when dynamic polymorphism in involved. So we can inline and monomorphise the functions when the interface is statically bound and the type is statically know. If either are false (so dynamic binding to the interface, or dynamic typing) we cannot monomorphise in languages like Rust. |
This is exactly what adding functions to an objects prototype does in JavaScript. |
@keean wrote:
No it is not the same. The JavaScript @keean I have decided that it is nearly impossible to try to explain these issues like this. I will need to make detailed code examples and write a white paper. You don't seem to understand me. |
So you want the implementation used to depend on the value of the object not just its type. This is monkey-patching, adding a method directly to the object itself. |
@shelby3 wrote:
It is right. I just haven't been able to communicate my points to you. |
You have written it badly then because
This is exactly the opposite of what is true. We must bind the interface statically in order to be able to monomorphise and we must statically know the type at the call site as well. |
@keean wrote:
If you had understood me, you would understand that I already explained that we don't know the exact type passed, that was entirely the point of binding early erasing the data types from further compile-time knowledge and you only have the trait bound remaining. That you didn't even get that point from what I wrote seems really bizarre. |
We always know the exact type passed, that is the whole point of parametric polymorphism. The only time we do not is if there is what Rust calls a trait-object or what Haskell calls an existential-type involved. |
@keean wrote:
Bingo. Re-read my long comment again. |
So in a traditional system like Rust, if there is a trait object we can never monomorphise, it doesn't matter how early or late the type-class is bound. |
@keean wrote:
The word 'before' does not mean 'at'. It means 'before'. I am referring to subsuming to the trait bound before the call site, such as a trait object. But then I generalize it to partial trait objects and 2nd order dynamism. |
@keean wrote:
In a system where I have generalized it to partial trait bounds and 2nd order dynamism then we still can't monomorphise but we gain the ability to add operations ex post facto. Your reading comprehension my long comment was really not adequate. Was it really that badly explained or is it your mind is ossified and not prepared for a generalization? |
But if you have dynamic polymorphism you cannot mono-morphise. It doesnt matter whether the interface is bound before or at the call site. |
@keean wrote:
Precisely. Early binding of interface requires dynamic polymorphism in order to open to adding new data types to the interface bound, e.g. for collections. Also, you are only considering one of the tradeoffs. You forgot the other one which my long comments states and I have reiterated:
|
But that is not what this sentence says:
This says we cannot mono-morphise if we bind the interface at compile time, when there is dynamic polymorphism. But we cannot mono-morphise at all with dynamic polymorphism, so the logic is wrong. |
@keean wrote:
Correct.
Correct.
What is wrong? You just didn't apparently understand that the goal is not to achieve monomorphism but rather to solve the Expression Problem and keep the extension open to new trait bounds even though we've partially bound to some typeclasses: @shelby3 wrote:
|
The point is that we always pass a data type along with a dictionary, but that dictionary can be incomplete and we don't track at compile-time what may already be in that dictionary. When we are ready to bind some trait bounds at a call site, then we write some code which will add these implementations to the dictionary at runtime (insuring they will be available as required for compile-time soundness), but will not overwrite any that already existed in the dictionary. The point of this is that our caller can set preferential implementations (such as which direction of ordering for |
I think we aren't actually attaching the dictionaries to the data objects. Rather the data objects are tagged ( |
This is the same as asking if the implementations exist from the call-site. In order to add them, they must first exist in the source code, so the compiler can statically check they exist. We can go further, we can ask, from the call site, get a list of all types for which there is an implementation of this function. Now we could put this in the type signature so that the linker knows statically what types it is safe to send to that function. |
Wow you really aren't getting it.
Read again:
|
There is still something odd about the way your description comes across to me, which might be my misunderstanding... If I have a function that calls methods on something:
We can say with 100% certainty whatever type whether dynamic or static polymorphism that is passed to |
@keean wrote:
Agreed. And the mistake in your thinking is you are conflating not knowing if an preexisting implementation will not be replaced at runtime, with knowing at compile-time that all required implementations will be placed into the dictionary as required. But I am repeating myself. I understand the concept is a bit confusing at first for someone not expecting it. I do tend to come of out no where with mind twisting new things. |
This is not as restrictive as you think. You can access runtime polymorphic values using interfaces that are common to all the possible types in the value. |
If you state that a Type is also a value of themself or a value of another Type, then the memory layout is fixed for each compiler, I don't like the fact that compiler can choose their own layouts just like in C++.
A value (instance) of a meta type is a type or even the same meta type.
The meta type "ConcreteType" which contains all the types not containing them self, i.e.
You refering to trait objects, right? They are already runtime polymorphic and not compile time polymorphic though the compiler knows how to operate on them because all implementing types provide the methods known at compile time required by the trait. |
Which is useless, because you cannot perform any operations on a type you do not know the memory layout of. The only operation valid on any type is the identity function. |
This is not true when you have user defined (nominal) types. For each new type the user defined you need a unique tag-id, which we can imagine is an integer. So we can start with Int=1, float=2. When the user does "struct x {Int, float}" we need a unique tag-id to distinguish between that and "struct y {Int, float}". You have to remember that in object systems the object class is a tag-id. The alternative is structural typing, but then you cannot have object-classes, and you would end up with something more like JavaScripts prototype based type system with duck-typing. The other thing to consider about runtime typing and reflection is that it is very slow, look at all the "Go" programs, and look at how much slower the code is when they have to resort to runtime reflection. |
Or you infer the available operations over runtime reflection. |
Only if you know the type. Consider I give to 127 bytes of data. What are you going to do with it? |
A general problem of runtime polymorphism
Different ids is okay, but you can store all types for instance as string into memory or you hash over it for comparison
Probably nothing, but you can read a type as string and infer its components or operation. Edited |
Please refresh |
Only if the type data is stored in memory, and without static types how do we know that this is type data and not an integer or a string? |
We don't. If you expect that your current io input string should be a type you interpret it like that else you throw an exception. |
So even with dynamic types, you need static typing. So let's look at the other way around. Do you need dynamic types? If you allow sum types, where we effectively create a value indexed type list, so then we can use a case statement to split the different types into different execution paths, and we can prove coverage statically, so something like:
So the restriction is we need to know which types we expect to see. Otherwise we can use an existential type with an interface, then we can cope with any unknown type that implements the interface. These two options cover all the practical cases where we can actually do anything useful with the value because we can only write code to process the types we know. |
This alludes to the great misconeption in the world that dynamically typed languages are uni typed, because dynamically typed languages are statically typed but their type checking is partial (optimistic) as they always work with a union like type Any.
No, because you can emulate it over values. |
So whilst it seems you agree with me that we can do everything we want with static types, I don't really get why you keep going on about strings? A type is a graph, you can see this because anything with braces '(' and ')' is fundamentally a binary tree, and if we include recursive types, we end up with types being regular_trees not strings! |
Because you can simulate types created at runtime with strings representing the types, i.e. respresenting the type definition. * Note:* You didn't integrate types created at runtime in your type system as this would mutate your type set. |
The 'any' type is bad in any case, you don't really want it. Note this is different from an unconstrained type variable. As I said you can only perform the identity operation on the 'any' type. Perhaps I should turn this around? What do you think you gain by having runtime types? Can you show me a short motivating example for the sort of programming your are adovacating? |
The any type is nothing different than the union type specified over all possible types in your program.
For instance in game programming. List<Enemy> enemies;
if(random condition)
{
Sub1Enemy se1 = RuntimeReflection.createInstance(SubEnemy1,random args)
enemies.add(se1)
}
...
if(random condition)
{
SubnEnemy sen = RuntimeReflection.createInstance(SubEnemyn,random args)
enemies.add(sen)
}
return enemies We assume the subtypes of Enemy all exists, but the arguments for instantiation and the decision which subtypes are instantiated is randomized. You get the list of enemies in a function and filter the desired subtype of enemies over this list and mutate specific properties of this type of enemy. fun(List<Enemy> l)
{
for(enemy : l)
{
match(enemy)
{
case enemy isa SubEnemyi:
//e.g. access a specific SubEnemyi field
}
}
} Of course you can do this in Haskell via sum types but these are runtime types, too, though they are labeled. Edit: As an example for this I could imagine that create a new subtype of enemy and instantiate it over random choices resulting to random number of fields. |
Or you can define an interface for enemy, and then you could runtime load unknown enemies as binary modules, where you do not know the internal data layout of the state store for the enemy, which I think is a better solution of the above problem. |
Refresh please |
What's the difference and in which part is this better? Edit: What me interests more is what is an interface, a typeclass or something else? |
I guess you instantiate your required interface with different modules. Then your interface type is a (partially) dynamic type. And how do you access fields that were not defined in the interface without runtime reflection? |
You access through the defined API which is the same for all implementations of the interface. What I meant was the 'module' can have its own private data which could be any type. |
Sure all implementations implement the methods of the interface and all those methods can be accessed but what is when I try to access data which is not listed in the interface? |
How do you know there is data there, how do you know how to interpret the data. Really what can you do with data you don't understand, you may find out it's an 'int' but is it a price, a distance, a frequency? |
Runtime Reflection which scans structurally over this value.
For the same reason when you try to search keys overs json structures which don't exists but you tried it however because you have expectations over it. You have to imagine that you have partial knowledge. You know you want a structure with a give field name and a given type or a part thereof but you don't know the complete structure and you don't interseted in the complete structure. |
But to what end? What can you do if you don't know what it is? This is like downcasting (towards the subtype). You have an API for shapes, and each shape has a 'draw' method. Great, you can now draw your shapes anywhere. You start poking around in their private data and get 32.6, is that the radius of a circle, the size of a square, the height of a triangle? Just the number is meaningless, and you cannot do anything with it. Really you should not be allowed to see anything that is not part of the published API, otherwise all sorts of things can go wrong, for example the next release of the module could store it's data in a completely different layout. You end up with very fragile code that is going to break all the time and need a lot of maintenance and rewriting. |
Afaics, if you compile your loadable module you should include the corresponding layout in it which you don't know in the project which will import the module. |
I am trying to nail down in my mind, the fundamental orthogonal semantic concepts of our proposed programming language.
data
typeclass
module
data
with compile-time (statically) bound operations and access control, enabling more sophisticated types.Thus I am wondering if
module
is the most apt name for the concept? I think of a module as a unit of code that is imported separately. We still need to importdata
andtypeclass
, so is this done by files? Isn't each file then a module? If so, what is the apt name for the concept ofmodule
above? I am thinking the above concept formodule
is really aclass
(without the subclassing inheritance anti-pattern). Can'tmodules
extend other modules?Edit: note also OCaml functors and modules.
Edit#2: on typeclasses, I replied to @keean:
Edit#3: @shelby3 wrote:
The text was updated successfully, but these errors were encountered: