Support child class fields that reference parent class param/type fields #8018

bradcray · 2017-12-13T19:34:57Z

It seems that a hole in the current initializer design is caused when child class fields refer to type/param fields in a parent class. The catch-22 is that the parent class type/param fields are not set up until the super.init() call is made, yet the child class fields are intended to be initialized before making that call. For example:

class Parent {
  param rank: int;
  ...
}

class Child: Parent {
  var bounds: rank*int;

  proc init(param rank: int) {
    // compiler's attempt to initialize `bounds` will occur here, which requires knowing `rank`
    // yet rank is not known until the following call is made
    super.init(rank);
  }
}

This order of operations is a little artificial because in actuality, the compiler is going to resolve all the generic fields at compile-time before the program even starts running. This suggests to me one of two possible broad approaches (which, admittedly, have come up in discussions with users about initializers):

1. Since the compiler has already computed the value of rank before any execution-time code runs, have it do the work necessary to permit references to the generic field in phase 1 of the child's initializer. I.e., don't change the language, make the implementation smarter.
2. Introduce an explicit phase 0 in addition to the existing phase 1 and phase 2 initialization phases, where phase 0 ensures that all the generic fields are calculated. Again, this would happen at compile time, but might be made manifest in the code explicitly by having separators between phases 0 and 1 as we do today between 1 and 2. Though I don't actually like the details of the following proposal, imagine something like:

class Child {
  proc init(param rank: int) {
    // phase 0 is here and would establish any generic `type`/`param` fields in `Child`
    super.typeInit(rank);  // here ends phase 0: set up the parent's generic fields
    // phase 1 is here and would establish any normal fields in `Child`
    super.init();  // here ends phase 1: set up the parent's normal fields
    // phase 2 is here: do whatever you want
  }
}

Or maybe in the above, the call to super.typeInit() should come before the Child class's local generic initialization (so that a child's generic fields could depend on a parent's values?)

bradcray · 2017-12-13T19:48:19Z

I'd be curious to learn what Swift does here.

lydia-duncan · 2017-12-13T20:00:22Z

We should specify that the problem is with field type declarations. Field declared initial values can be sidestepped with the appropriate value that would be used.

lydia-duncan · 2017-12-13T20:02:42Z

I've also debated whether this implies that we should have the call to Phase 1 of the parent type be separable from Phase 2 of the parent type, and from the division between the phases itself. When we discussed initializers, the alternate proposed syntax would permit this easily:

  proc init() {
    super.init();
    ... // Phase 1 code
  } finalize {
    ... // Phase 2 code
    super.init(); // or an alternate syntax for the Phase 2 call
  }

This way, either strategy (parent fields before child fields, parent fields after child fields) would be equally supported without making it dependent on compile time versus execution time fields, meaning that writing the program will be more comprehensible (though with the negative aspects mentioned in our discussion of the syntax choice).

bradcray · 2017-12-13T20:56:05Z

> I'd be curious to learn what Swift does here.

Oh, I think the answer to this question is "angle brackets." :'(
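
As a point of comparison, here is a minimal C++ sketch of what "angle brackets" buy in this situation. It is an analogy only, not Chapel: the names Parent, Child, and bounds mirror the example at the top of the issue, `rank_of` is a hypothetical helper, and a `std::array` stands in for the `rank*int` tuple. Because the rank is part of the type itself, the child's field type is settled before any constructor runs.

```cpp
#include <array>
#include <cstddef>

// The rank is a template parameter, so it is fixed when the type is named,
// before any constructor body executes.
template <std::size_t Rank>
struct Parent {
  static constexpr std::size_t rank = Rank;
};

template <std::size_t Rank>
struct Child : Parent<Rank> {
  // The field type depends only on the template parameter, not on anything
  // the parent's constructor does at run time.
  std::array<int, Rank> bounds{};  // value-initialized to zeros
};

// Hypothetical helper: report how many bounds a Child holds.
template <std::size_t Rank>
constexpr std::size_t rank_of(const Child<Rank>& c) {
  return c.bounds.size();
}

// The parent's compile-time value is visible through the child type.
static_assert(Child<3>::rank == 3, "rank is known at compile time");
```

The cost is the one discussed in this thread: the compile-time parameters move out of the initializer and into the type's declaration syntax.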

bradcray · 2017-12-13T20:59:09Z

> We should specify that the problem is with field type declarations. Field declared initial values can be sidestepped with the appropriate value that would be used.

I think it's field initializers as well, isn't it? I.e., replacing the bounds declaration above with:

  var bounds = rank*2;

yields a similar error (as it seems it should/would).

lydia-duncan · 2017-12-13T21:06:17Z

You get the same error (as it should), but you can explicitly give the field a different value, e.g.

proc init(param rank) {
  bounds = rank*2;
  super.init(rank);
}

Which isn't possible for the declared type.

lydia-duncan · 2017-12-13T21:06:42Z

But I have debated about making omitted initialization "do the right thing" as well, if that's what you're wondering

bradcray · 2017-12-13T21:09:25Z

> but you can explicitly give the field a different value

Well sure, but you can't not give it a value and have it work as it should. I.e., I don't think the problem is strictly about field type declarations.

lydia-duncan · 2017-12-13T21:36:04Z

> Well sure, but you can't not give it a value and have it work as it should.

I guess I view the main problem as "this gives you an error message and you can't do anything to fix it in your initializer", which only applies when the problem is with the declared type. But it sounds like the main problem for you is "I can't use an inherited field in the declaration of a child field without getting an error message if I rely on its omitted initialization". Which is fair (and worth discussing), but was an intentional design decision rather than something we forgot about.

noakesmichael · 2017-12-13T21:48:47Z

There are elements of this conversation that remind me of a recent “error” that Nick filed against initializers (#7938). He was implicitly thinking that a type/param field with a default initializer could be referenced sooner than the compiler currently thinks it can. However his example did not involve inheritance. It’s clearly a challenging issue but I have always wondered about a) The decision to treat type/param fields “the same” as value fields within the body of initializers. The former are compile-time properties that serve to define a concrete type while the latter are fields that are bound instance-by-instance at execution time. b) The manner in which the “initializer” for a generic type is serving two masters i.e. providing “code” that is evaluated at compile time to generate a concrete type, and as the “template” for generating the required initializer for the resulting concrete type.

bradcray · 2017-12-13T22:37:04Z

> but was an intentional design decision

I think it's a design decision that needs to be revisited then.

noakesmichael · 2017-12-13T22:44:32Z

I am not quite sure it’s helpful to say “intentional design decision”. Some design decisions were made “with intent” but there were situations in which options were debated without the benefit of a clear understanding of the possible consequences for typical applications. Hence the importance of getting a workable implementation and then working through user-facing applications plus the module code that supports the implementation.

lydia-duncan · 2017-12-13T22:51:28Z

Alright, then the question we will discuss is "should an inherited field be accessible in Phase 1 of the child?"

I'd like to know what we think should be on the table for this re-evaluation.

- Param and type fields only?
- The type of any inherited field? (i.e. should var y: inheritedField.type; be a valid declaration?)
- The value of var and const fields?

Personally, I think we should strive for a unified policy on all the cases above - I think it is worthwhile to preserve the unity of the previous approach, as it will simplify the mental model for the user. But that's just my opinion.

bradcray · 2017-12-14T01:30:54Z

> There are elements of this conversation that remind me of a
> recent “error” that Nick filed against initializers (#7938).

I agree that there are some similarities, but I also think that different conclusions could be reached for them, potentially. One interpretation of @nspark's issue under the phase 0 notion proposed above might be "none of the class fields are known until phase 0 has run, so the reference to idxType in the formal argument list doesn't make sense because it's a use-before-def" (assuming that "the end of phase 0" is "somewhere within the body of the initializers"). I think a challenge for a unified approach which tried to extend phase 0 to be complete prior to evaluating the formal arguments is that it seems you could write unstable programs in which the definition of phase 0's assignments depended on the arguments and their types and vice-versa.

> the question we will discuss is "should an inherited field be accessible in Phase 1 of the child?"

The primary question I'm interested in seeing discussed by opening this issue is "Should an inherited field that has already been computed at compile-time (and is guaranteed to be by the language) be available in phase 1 of a child initializer at execution time? And if so, how does that impact the current initializer design?" (where I strongly believe that the answer to the first question needs to be "yes" — otherwise there's not much value in making it a compile-time field or inheriting from a generic class).

noakesmichael · 2017-12-14T02:31:39Z

>> There are elements of this conversation that remind me of a recent “error” that Nick filed against initializers (#7938).
>
> I agree that there are some similarities, but I also think that different conclusions could be reached for them, potentially. One interpretation of @nspark's issue under the phase 0 notion proposed above might be "none of the class fields are known until phase 0 has run, so the reference to idxType in the formal argument list doesn't make sense because it's a use-before-def" (assuming that "the end of phase 0" is "somewhere within the body of the initializers"). I think a challenge for a unified approach which tried to extend phase 0 to be complete prior to evaluating the formal arguments is that it seems you could write unstable programs in which the definition of phase 0's assignments depended on the arguments and their types and vice-versa.

I agree that there are differences and potentially more challenges for Nick’s case. And I agree that one doesn’t need to unify these two cases. However 1) The cognitive slip seems similar to me 2) One might argue that Nick’s initializer does not modify idxType within its body and “so” the compiler “knows” that this particular initializer is only applicable for those concrete instantiations where idxType is certain to be int. This would remain true even if there are other initializers that do assign idxType and hence generate additional concrete types. This doesn’t feel entirely different from trying to develop a robust definition/implementation for phase0.

>> the question we will discuss is "should an inherited field be accessible in Phase 1 of the child?"
>
> The primary question I'm interested in seeing discussed by opening this issue is "Should an inherited field that has already been computed at compile-time (and is guaranteed to be by the language) be available in phase 1 of a child initializer at execution time? And if so, how does that impact the current initializer design?" (where I strongly believe that the answer to the first question needs to be "yes" — otherwise there's not much value in making it a compile-time field or inheriting from a generic class).

I continue to think this challenge is at least partly driven by the effort to use a single block of code to describe both the steps to determine the concrete type at compile-time and to initialize execution-time instances of a particular concrete type. In the context of observing that the current implementation may reduce the leverage for inheriting from a generic class in some cases, it is tempting to consider alternative implementations. I had wondered, in some ill-defined way, about this need when we first talked about this approach to generic types. One challenge might be to develop a better implementation that provides more power/flexibility using flows that can be described to users, that behaves in a reasonably predictable manner, and that emits reasonable error messages when the user has generated a case that the compiler cannot handle/understand.

bradcray · 2017-12-14T03:07:15Z

> I agree that there are differences and potentially more
> challenges for Nick’s case. And I agree that one doesn’t
> need to unify these two cases. However
>
> 1) The cognitive slip seems similar to me

I agree-ish.

> 2) One might argue that Nick’s initializer does not modify idxType
> within its body and “so” the compiler “knows” that this particular
> initializer is only applicable for those concrete instantiations
> where idxType is certain to be int.
>
> This would remain true even if there are other initializers that
> do assign idxType and hence generate additional concrete
> types.

A case that made (makes) me think that extending phase 0's evaluation to include the formal argument list would present instability challenges is the following which seems halting-problem-esque:

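class C {
  type idxType;
  var x: idxType;
  // the formal's type depends on idxType, which the body below sets
  // based on the formal's type: this is the intended circularity
  proc init(x: idxType) {
    if (x.type == real) {
      this.idxType = int;
      this.x = x: int;
    } else if (x.type == int) {
      this.idxType = real;
      this.x = x: real;
    }
  }
}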

> This doesn’t feel entirely different from trying to develop a
> robust definition/implementation for phase0.

I agree. I was really just trying to say that I am skeptical that phase 0 could extend as far as the formal argument list due to cases like the above (but could easily be missing something).

> I continue to think this challenge is at least partly driven by the
> effort to use a single block of code to describe both the steps
> to determine the concrete type at compile-time and to initialize
> execution-time instances of a particular concrete type.

I think that may be true, but don't want our solution to be "angle brackets!"
I am comfortable with more of a phase 0 (compile-time initialization)
delineation within a block of code or within a special block of code.
For me, Lydia's example falls into that category as well.
I'd also be comfortable with an approach in which the compiler simply
treated compile-time code differently without requiring the user to
demarcate it. These two approaches are what I was trying to illustrate
at the outset (I'm open to others as well, but don't have any in hand
personally).

noakesmichael · 2017-12-14T04:20:37Z

>> I agree that there are differences and potentially more challenges for Nick’s case. And I agree that one doesn’t need to unify these two cases. However
>>
>> 1. The cognitive slip seems similar to me
>
> I agree-ish.

Also-ish for me.

>> 2. One might argue that Nick’s initializer does not modify idxType within its body and “so” the compiler “knows” that this particular initializer is only applicable for those concrete instantiations where idxType is certain to be int. This would remain true even if there are other initializers that do assign idxType and hence generate additional concrete types.
>
> A case that made (makes) me think that extending phase 0's evaluation to include the formal argument list would present instability challenges is the following which seems halting-problem-esque:
>
> class C {
>   type idxType;
>   var x: idxType;
>   proc init(x: idxType) {
>     if (x.type == real) {
>       this.idxType = int;
>       this.x = x: int;
>     } else if (x.type == int) {
>       this.idxType = real;
>       this.x = x: real;
>     }
>   }
> }

To a human it’s easy to argue that this is different from the Nick case because this one does include code that writes this.idxType in the body and also relies on that type in the formals list. So this demonstrates an example of “so how can the developer be confident about how smart the compiler is?”

>> This doesn’t feel entirely different from trying to develop a robust definition/implementation for phase0.
>
> I agree. I was really just trying to say that I am skeptical that phase 0 could extend as far as the formal argument list due to cases like the above (but could easily be missing something).

It’s easy to agree with you, esp. if “phase 0” includes additional syntax for demarcation etc. Maybe not so much if it’s about re-defining how the compiler processes param/type fields when walking the current forms of initializers. The case above “obviously” has a circularity that the compiler can try to talk about and the Nick case clearly doesn’t. But I imagine we could invent steadily harder cases that get harder and harder for the user and/or the compiler to have a good understanding of. And I think you generally advocate not relying on a “sufficiently smart compiler”.

>> I continue to think this challenge is at least partly driven by the effort to use a single block of code to describe both the steps to determine the concrete type at compile-time and to initialize execution-time instances of a particular concrete type.
>
> I think that may be true, but don't want our solution to be "angle brackets!"

“Angle brackets” could cover a lot of ground, from the literal C++/Swift sort of thinking to some other way of separating the processing of param/type fields at compile time from that of other fields. But I think trying approaches that avoid some version of demarcation tends to push towards “additional compiler evaluation” of the current broad syntactic rules. But I remain keen to think about some options and I think Lydia does too, modulo the implementation challenges.

> I am comfortable with more of a phase 0 (compile-time initialization) delineation within a block of code or within a special block of code. For me, Lydia's example falls into that category as well. I'd also be comfortable with an approach in which the compiler simply treated compile-time code differently without requiring the user to demarcate it. These two approaches are what I was trying to illustrate at the outset (I'm open to others as well, but don't have any in hand personally).

bradcray · 2017-12-14T15:33:39Z

> the following which seems halting-problem-esque:
> ...

Waking up this morning, I'm wondering if that example would not have been legal due to its assignment of fields within conditionals? In which case, maybe I can't create an unstable example...

I also spent more time musing about Lydia's example and am trying to recall what the advantages were for putting super.init() at the transition point between phase 1 and phase 2 rather than at the head of the initializer? One answer would be "to distinguish between the phases", but it seems like there were others as well?

lydia-duncan · 2017-12-14T16:08:15Z

I agree that any Phase 0 would have to be separate from the handling of the argument list.

lydia-duncan · 2017-12-14T16:11:29Z

@bradcray - one of the arguments in support of the super.init()-as-transition-point strategy (as opposed to the two-code-blocks strategy) was that it allowed local variables to be shared across the phases. I think in practice we haven't felt the need to use that functionality as much, though, so maybe it isn't a good argument for sticking with the current strategy.

lydia-duncan · 2017-12-14T16:18:58Z

Additionally, it is trivially easy to enforce that the same super.init() or this.init() call is made to impact both bodies, and we could surround the call with a compile-time conditional without having to duplicate it.

Also, the alternate syntax could allow the finalize block to be dropped but not necessarily the Phase 1 block (at least, it wouldn't be as visually easy), making it more supportive of Phase 1 as the default. I do have a long list of initializers that would probably benefit from Phase 1 as the default, though, so that also might not be as strong a reason to keep with the current strategy.

A lot of these decisions on initializers seem to be intertwined or linked.

bradcray · 2017-12-15T07:00:20Z

Here's my current initializer counterproposal w.r.t. this challenge after stewing on it for a few days, and taking into account the experiences of the past year, but I'd be curious what flaws others see in it (besides the fact that it's different than what we have today. My hope is to come up with something that addresses current perceived flaws and which is, for the most part, something that we could develop a script to mechanically translate to from today's code):

In English:

- the first line of a [class] initializer may be a super.init(...) call that invokes a parent class initializer; if it is not, an implicit super.init() will be inserted by the compiler, and the parent class must be callable with a 0-argument initializer (Object will be). Records have no inheritance, so they have no need of this call (explicitly or implicitly).
- for a class, after the above call, the class ID of the object is set to the parent class. (Thus in an object hierarchy Grandparent::Parent::Child, the object will start as an Object, then become a Grandparent, a Parent, and a Child as the initializers progress. More on that below.)
- the subsequent lines correspond to phase 1 today, and must share similar rules to phase 1 except that they may refer to parent class fields since super.init() has already completed. We might also consider permitting class initializers to call methods and/or functions, keeping in mind that the type of this would be the parent type and would resolve/dispatch only on that basis.
- the end of phase 1 is demarcated by ...something... but not using curly brackets. It could either be a built-in placeholder function call (e.g., finalize() or initDone() or whatever good name we can come up with) or a standalone keyword-based statement. The goal of avoiding curly brackets is the same as in today's world: to avoid introducing new lexical scopes (or the implication of them). Also to keep the body of the initializer monolithic.
- this transition in phase also reflects when the class ID is changed from the parent class to the child class such that any subsequent actions would have full access to the fields. For records, it represents the point at which methods/functions can be called on the record at all (as today).
- in the absence of this demarcation, I think I'd say that the body of the initializer is "phase 1 by default" based on experience living in a phase 2 by default world this year, and the fact that the superclasses would already be initialized at the outset. But I haven't thought about this point for very long yet.
- phase 2 is much like today.

In code, and supporting the notion of transitioning the class type at the demarcation point:

class Parent {
  param rank: int;
  ...
}

class Child: Parent {
  var bounds: rank*int;

  proc init(param rank: int) {
    super.init(rank);
    this.foo();  // calls as though `this` was a Parent
    bounds = ...;
    bar(this);   // calls as though `this` was a Parent
    fieldsComplete();  // this is the transition call... the cue the compiler uses like it does super.init() today
    this.baz();  // calls as though `this` was a Child
  }
}

nspark · 2017-12-15T18:02:53Z

I thought I'd chime in with a few thoughts/observations, some of which may wander out of the main thread of this discussion.

1. If I understand @bradcray's (counter-)proposal and the current state of initializers correctly, the parent-child initialization order changes in Brad's new scheme. I think it goes:
   - current initializers: Child.init() ➡️ Child: Phase 1 ➡️ Parent: Phase 1 ➡️ Parent: Phase 2 ➡️ Child: Phase 2
   - Brad's proposal: Child.init() ➡️ Parent: Phase 1 ➡️ Parent: Phase 2 ➡️ Child: Phase 1 ➡️ Child: Phase 2
2. I'd prefer the Phase 1 ➡️ Phase 2 transition to happen without a magic function name; e.g., fieldsComplete() in Brad's proposal. I think a method on this seems more appropriate; e.g., this.fini(). Such a method should be permissible to define and overload or even to call outside of the initializer. It just happens to serve an additional, special purpose within the initializer.
3. I think Phase 1 by default makes sense. If they're not explicitly added, the compiler can add calls to super.init() and this.fini() at the beginning and end of the init() function body.
4. One argument I'd make for angle brackets would be to make the scope of rank clear in the member variable declaration of Brad's example:

class Child<param rank: int>: Parent<rank> {
  var bounds: rank*int;
  ...
}

Alternatively, accessing rank from Parent could be required to be explicit:

class Child: Parent {
  var bounds: super.rank*int;
  ...
}

All in all, I think Brad's proposal makes sense with my strong preference for including (2) from above.

bradcray · 2017-12-15T21:36:44Z

Replying to some of Nick's points:

> If I understand @bradcray's (counter-)proposal and the current state of initializers correctly

I think you got it.

> I think a method on this seems more appropriate

I'm open to that as well. At present, I don't have any real opinion as to what the separator should be.

> ...If they're not explicitly added, the compiler can add calls ...

I should've called out that the separator could be added implicitly in the phase 1 by default world.

> One argument I'd make for angle brackets

I'm still strongly opposed to angle brackets... :D

mppf · 2017-12-15T23:20:25Z

I'm coming a little bit late to this party, but one thing does occur to me.
If we want the program Brad described at the outset to work, 2 strategies in particular appeal to me:

1. Initialize parent fields before child fields (so that 'rank' being param doesn't impact the fact that it's initialized in the parent and visible in the child initializer).
2. Initialize compile-time known things in advance, in a separate trip up and down the inheritance hierarchy.

I think that Brad's proposal is some combination of these, but that other variants of it are possible. For example, if all that was really necessary was to get parent fields initializing before child ones, perhaps there is an even simpler proposal?

One interesting thing is that in D, the boundary between Phase 1 and Phase 2 happens once all the fields are initialized. So there might not even need to be an explicit boundary. We might want one anyway, of course.

I worry about trying to handle all the "compile-time" things at once, since we have runtime types for arrays. In particular, there needs to be a runtime path for data to flow as the runtime representation of the type (i.e. the array bounds).

class Parent {
  var A; // array type
}

class Child: Parent {
  var OtherArray: A.type;

  proc init(n: int) {
    // compiler attempts to initialize OtherArray here

    var parentArray: [1..n] int;
    super.init(parentArray); // "type" ie bounds of OtherArray established here
  }
}

I believe that we drew inspiration for initializing child fields before parent ones from Swift. I'm not sure that's the right choice for us, though, given the way types can have runtime components (or the way our language allows it to look like types/params work at runtime, including with runtime variables of generic type).

bradcray · 2017-12-15T23:33:55Z

> I believe that we drew inspiration for initializing child fields before parent ones from Swift. I'm not sure that's the right choice for us, though

Swift also arguably has an initial run up the hierarchy by way of its angle brackets that we don't have in the current implementation, right?

> One interesting thing is that in D, the boundary between Phase 1 and Phase 2 happens once all the fields are initialized.

I believe we've kicked around proposals in which the boundary was implicit, but decided to start with an explicit one first in order to simplify the compiler's job, to make the distinction highly visible to users, and because it seemed easier to relax the requirement over time than to add it in. I continue to find this "implicit transition" intriguing but continue to feel more secure keeping it explicit for now.

mppf · 2017-12-15T23:36:12Z

Initializing child fields first has the advantage that a parent initializer can always call a virtual method in its Phase 2. That's not true in Brad's proposed alternative, for example. (Or in C++, as far as I know).

I think any plan for changing the child-to-parent order should have a strategy for dealing with virtual method calls. I think Brad's proposal basically disallows these by adjusting the object type during initialization.

One "food for thought" idea: could we allow child fields to be initialized before super.init(), even if they didn't have to be? Maybe there is a way to develop this idea into something that allows developers to opt-in to Swift like functionality in this area.

mppf · 2017-12-15T23:37:00Z

> Swift also arguably has an initial run up the hierarchy by way of its angle brackets that we don't have in the current implementation, right?

Yes, I agree about that.

edit: I think it'd be interesting to find a solution that had parent-before-child for type/param fields and establishing the type of fully generic fields in Phase 0, and then child-before-parent for other fields in Phase 1. I'm not sure yet if Brad's proposal does this or doesn't.

bradcray · 2017-12-16T01:29:39Z

> Initializing child fields first has the advantage that a parent initializer can always call a virtual method in its Phase 2. That's not true in Brad's proposed alternative, for example. (Or in C++, as far as I know).

This was intended to be supported by my proposal — this was the bit about setting the class ID to the respective class ID of each class as that stage completes its phase 1 initialization (so for awhile the class will be an Object, then a Grandparent, then a Parent before finally being a Child). The dynamic dispatches can be written and can occur, but will only resolve to ancestors in the tree that have completed phase 1, never to that of the final child class (at least, until its own phase 2 is reached). That seems natural, safe, and like it should "just work" to me, which is part of what I liked about the proposal. Do you think it misses something?
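
For comparison, the class-ID progression described here resembles what C++ does with virtual calls made during construction, the behavior alluded to in the quoted remark. The following is a minimal C++ sketch, not Chapel; the member names `who` and `name_seen_in_ctor` are hypothetical and chosen only for illustration.

```cpp
#include <string>

struct Parent {
  std::string name_seen_in_ctor;

  Parent() {
    // While Parent's constructor runs, the object's dynamic type is Parent,
    // so this virtual call resolves to Parent::who(), never to an override.
    name_seen_in_ctor = who();
  }
  virtual ~Parent() = default;

  virtual std::string who() const { return "Parent"; }
};

struct Child : Parent {
  // Only reachable through virtual dispatch once construction has moved
  // past the Parent stage.
  std::string who() const override { return "Child"; }
};
```

Constructing a `Child` leaves `name_seen_in_ctor` set to "Parent", while calling `who()` on the finished object returns "Child". The dispatch is safe at every stage, but a parent can never reach a child's override from its own constructor.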

mppf · 2017-12-16T01:48:32Z

What if we went all the way up & down the inheritance hierarchy initializing types and then did Phase 1 and Phase 2 as we do now?

E.g.

class Parent {
  param rank: int;

  proc init(param rank: int) {
    this.typeinit(rank);
    // phase 1
    super.init();
    // phase 2
    writeln(getNameDuringInit());
  }

  proc getNameDuringInit() {
    return "Parent";
  }

}

class Child: Parent {
  var bounds: rank*int;

  proc getNameDuringInit() {
    return "Child";
  }

  proc init(param rank: int) {
    // super.typeinit must be 1st statement in initializer
    // super.typeinit establishes all type/params all the way up the hierarchy
    // parent types/params are established before child types/params
    super.typeinit(rank);
    // after super.typeinit we are in Phase 1 as today
    bounds = ...;
    super.init();
    // phase 2 statements
    this.baz();  // calls as though `this` was a Child
  }
}

Main open question about this: is super.typeinit always required?

edit: this isn't really different from what Brad proposed in the issue description...

mppf · 2017-12-16T13:00:55Z

Here is an example of a program that benefits from Swift-style Phase 2 that can call child methods (and it's arguably a variant of the above example, I just wanted to post it separately to be clear).

class Parent {
  var name:string;

  proc getNameDuringInit() {
    return "Parent";
  }

  proc init() {
    super.init();
    name = getNameDuringInit();
  }
}

class Child: Parent {
  proc getNameDuringInit() {
    return "Child";
  }

  proc init() {
    super.init();
  }
}

var p = new Parent();
writeln(p.name); // outputs "Parent"
var c = new Child();
writeln(c.name); // outputs "Child"

To do this in Parent-Child field initialization order requires code creating the objects to do something like this:

var c = new Child();
c.setup(); // finish setting up the c object
           // possibly calling virtually-dispatched methods in the process
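
For comparison only, Python's own object model happens to behave like the Swift-style Phase 2 described above: a method call made inside the parent's `__init__` already dispatches to the child's override. The sketch below shows that behavior, followed by the two-step `setup()` pattern that is needed when initialization cannot dispatch to child methods. The `StagedParent`/`StagedChild` classes and the `get_name` method are hypothetical names made up for this illustration.

```python
class Parent:
    def __init__(self):
        # Python resolves this call against the object's final class, so a
        # Child instance reaches Child.get_name_during_init here.
        self.name = self.get_name_during_init()

    def get_name_during_init(self):
        return "Parent"


class Child(Parent):
    def __init__(self):
        super().__init__()

    def get_name_during_init(self):
        return "Child"


class StagedParent:
    """Variant for a model where init cannot dispatch to child methods."""

    def __init__(self):
        self.name = ""  # placeholder until setup() runs

    def setup(self):
        # Called by the creator after construction, once the object is whole.
        self.name = self.get_name()

    def get_name(self):
        return "Parent"


class StagedChild(StagedParent):
    def get_name(self):
        return "Child"


p, c = Parent(), Child()
staged = StagedChild()
staged.setup()  # the extra step every creation site has to remember
```

The staged variant works, but it pushes part of initialization onto every piece of code that creates the object, which is the cost this comment is pointing at.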

mppf · 2017-12-16T14:07:16Z

I don't think the typeinit strategy I described solves the problem well enough in the presence of runtime types. The idea was to treat types differently, but a runtime type can depend on regular variables.

For example:

class Parent {
  var n: int;
  var A:[1..n] int;
}
class Child : Parent {
}

Now n must be initialized before the (runtime) type of A is known. That implies that we need to do one of the following things:

1. Stop having runtime types.
2. Treat runtime types differently from types for initialization purposes (i.e. something like typeinit could establish the compile-time portion of the types but not the run-time portions). This might include disallowing the above pattern.
3. Change the field initialization order.

(1) It seems to me that runtime types are an appealing part of the language especially as a way to enable generic programming with arrays.

(2) Treating runtime types differently seems fraught to me. I think the language is generally designed in a manner that tries to avoid making runtime types require different code from compile-time-only types. Of course there are plenty of bugs in this area...

(3) I think changing the field initialization order might work. What would we need to change it to?

Suppose that each class type up the inheritance hierarchy has a Phase 1 that initializes its fields and a Phase 2 that completes initialization with the ability to call methods on the full object. Then we could use the following order, which would support even the above pattern.

Parent Phase 1
Child Phase 1
Parent Phase 2
Child Phase 2

I believe somebody proposed this when we were designing the current initializers. I don't think we had these examples in mind when we made the current choice.
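
The proposed ordering can be modeled with a small driver that runs every class's Phase 1 from the root down and then every class's Phase 2 in the same direction. This is a Python sketch under assumptions: `construct`, `phase1`, `phase2`, `describe`, and `trace` are hypothetical names standing in for what a compiler would arrange, and the list `A` stands in for an array whose runtime type depends on `n`.

```python
class Parent:
    def phase1(self, n):
        self.n = n
        self.A = [0] * self.n          # the "runtime type" of A depends on n
        self.trace.append("Parent Phase 1")

    def phase2(self):
        self.trace.append("Parent Phase 2 sees " + self.describe())

    def describe(self):
        return "Parent"


class Child(Parent):
    def phase1(self, n):
        self.B = [""] * len(self.A)    # safe: Parent's fields already exist
        self.trace.append("Child Phase 1")

    def phase2(self):
        self.trace.append("Child Phase 2 sees " + self.describe())

    def describe(self):
        return "Child"


def construct(cls, n):
    """Hypothetical driver: all Phase 1 steps root-first, then all Phase 2 steps."""
    obj = object.__new__(cls)
    obj.trace = []
    chain = [c for c in reversed(cls.__mro__) if c is not object]
    for c in chain:
        if "phase1" in vars(c):        # run only the step this class defines
            vars(c)["phase1"](obj, n)
    for c in chain:
        if "phase2" in vars(c):
            vars(c)["phase2"](obj)
    return obj


obj = construct(Child, 3)
for line in obj.trace:
    print(line)
```

Because every Phase 1 has finished before any Phase 2 starts, the parent's Phase 2 can call `describe()` and reach the child's override on a fully initialized object.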

What implications would this new order have for the syntax? I can think of three strategies:

Strategy One : weird control flow

class Parent {
  param rank: int;
  var x:int;
  proc init(param rank: int) {
    // we could insist on a super.init() call at this point

    // Phase 1 begins
    // method calls not available (or operate only with object type)
    this.rank = rank; // matches the super.init(rankArg) call in Child
    x = 1;

    yield to subclass init; // Or something to mark the Phase 1 - Phase 2 transition point

    // Phase 2 begins
    // method calls now available on fully initialized object (with Child type)
  }
}

class Child: Parent {
  var bounds: rank*int;

  proc init(param rankArg: int) {
    super.init(rankArg);

    // Phase 1 begins
    // method calls not available (or operate only with Parent type)
    bounds = ...;

    yield to subclass init; // Or something to mark the Phase 1 - Phase 2 transition point

    // Phase 2 begins
    // method calls now available on fully initialized object (with Child type)
  }
}

The main drawback I see of Strategy One is that the control flow is pretty weird (and not so apparent from the code written in the initializer). What we get out of this weird control flow is that Phase 2 of the initializers can use temporaries from Phase 1 or arguments from the initializer.

What control flow am I talking about? This:
Parent Phase 1
Child Phase 1
Parent Phase 2
Child Phase 2

Strategy Two : Two-block

Lydia already pointed out that the two-block variant of initializer syntax might be better for this kind of thing:

I've also debated whether this implies that we should have the call to Phase 1 of the parent type be separable from Phase 2 of the parent type, and from the division between the phases itself. When we discussed initializers, the alternate proposed syntax would permit this easily:

  proc init() {
    super.init();
    ... // Phase 1 code
  } finalize {
    ... // Phase 2 code
    // [Lydia had a super.init() here that I've removed]
  }

This way, either strategy (parent fields before child fields, parent fields after child fields) would be equally supported without making it dependent on compile time versus execution time fields, meaning that writing the program will be more comprehensible (though with the negative aspects mentioned in our discussion of the syntax choice).

For my running example, it would look like this:

class Parent {
  param rank: int;
  var x:int;
  proc init(param rank: int) {
    // we could insist on a super.init() call at this point

    // Phase 1 begins
    // method calls not available (or operate only with object type)
    this.rank = rank; // matches the super.init(rankArg) call in Child
    x = 1;
  } finalize {
    // Phase 2 begins
    // method calls now available on fully initialized object (with Child type)
  }
}

class Child: Parent {
  var bounds: rank*int;

  proc init(param rankArg: int) {
    super.init(rankArg);

    // Phase 1 begins
    // method calls not available (or operate only with Parent type)
    bounds = ...;
  } finalize {
    // Phase 2 begins
    // method calls now available on fully initialized object (with Child type)
  }
}

The two-block syntax would remove the ability to have temporaries across Phase 1 and Phase 2, but it would enable arguments across Phase 1 and Phase 2.

This would implement the ordering I described above like this:
Parent Phase 1 aka init
Child Phase 1 aka init
Parent Phase 2 aka finalize
Child Phase 2 aka finalize

Strategy Three : separate finalize method

Actually, the compiler currently supports a separate zero-arguments proc initialize() as part of the old-style constructors. Note though that in the current compiler, if both proc initialize() and the constructor are provided, proc initialize() runs before the constructor.

Brad has occasionally argued that he's found this initialize method useful. One way we could (arguably) keep it is to decide that it is the way to implement Phase 2.

So, the idea is that proc init would only ever implement Phase 1 and that Phase 2 would be implemented in a proc finalize.

class Parent {
  param rank: int;
  var x:int;
  proc init(param rank: int) {
    // we could insist on a super.init() call at this point
    // this function is implementing Phase 1
    // method calls not available (or operate only with object type)
    this.rank = rank; // matches the super.init(rankArg) call in Child
    x = 1;
    // Phase 2 is not available in any 'proc init'
  }

  proc finalize() {
    super.finalize(); // May want to insist this is present somewhere in finalize()
    // this function is implementing Phase 2
    // method calls now available on fully initialized object (with Child type)
  }
}

class Child: Parent {
  var bounds: rank*int;

  proc init(param rankArg: int) {
    super.init(rankArg);

    // Phase 1 begins
    // method calls not available (or operate only with Parent type)
    bounds = ...;
    // Phase 2 is not available in any 'proc init'
  }

  proc finalize() {
    // this function is implementing Phase 2
    // method calls now available on fully initialized object (with Child type)
    super.finalize(); // May want to insist this is present somewhere in finalize()
  }
}

This version has the advantage that Chapel programmers need less special knowledge about initializers. The main details to know are that the compiler adds calls to proc init and proc finalize to implement object construction. But the bodies of these functions themselves don't have any special control flow rules.

It has the disadvantage that any information that proc finalize needs from the proc init arguments has to be encoded into the class instance itself somehow - the arguments are no longer available. But - that's arguably also an advantage, in that it's obvious how to write code that is run no matter which proc init was called - you put that in proc finalize.

Interestingly, depending on where the super.finalize() call appears in Child.finalize, it can implement either of these orders:

Parent init (Phase 1)
Child init (Phase 1)
Parent finalize (Phase 2)
Child finalize (Phase 2)

Parent init (Phase 1)
Child init (Phase 1)
Child finalize (Phase 2)
Parent finalize (Phase 2)

(But of course we could decide to insist that super.finalize be always at the start or end of a proc finalize).
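
The effect of where `super.finalize()` is placed can be sketched in Python. This is an illustration under assumptions: `construct` is a hypothetical stand-in for the compiler-inserted calls to init and finalize, and the two child classes differ only in where they call the parent's `finalize`.

```python
class Parent:
    def __init__(self, log):
        self.log = log
        self.log.append("Parent init")

    def finalize(self):
        self.log.append("Parent finalize")


class ChildSuperFirst(Parent):
    def __init__(self, log):
        super().__init__(log)
        self.log.append("Child init")

    def finalize(self):
        super().finalize()                  # parent's Phase 2 runs before the child's
        self.log.append("Child finalize")


class ChildSuperLast(Parent):
    def __init__(self, log):
        super().__init__(log)
        self.log.append("Child init")

    def finalize(self):
        self.log.append("Child finalize")
        super().finalize()                  # parent's Phase 2 runs after the child's


def construct(cls):
    """Hypothetical driver: run init, then finalize, and return the call order."""
    log = []
    obj = cls(log)
    obj.finalize()
    return log


first = construct(ChildSuperFirst)
last = construct(ChildSuperLast)
```

Here `first` records the Parent-then-Child finalize order and `last` records the reverse, matching the two sequences listed above.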

bradcray · 2017-12-19T04:51:22Z

I haven't caught up with Michael's weekend musings other than to understand that by asking about virtual dispatch he was asking for a parent's initializer to be able to call into a child's method (whereas the dynamic dispatch I was talking about would only permit calls into ancestor methods within an initializer, or within one's own methods in phase 2, never a child's). I wanted to point out that having something like the current initialize() hook (or postInit() as I've been thinking of it in the post-constructor world) would provide that support while also permitting users to continue leveraging the default initializer as advocated for elsewhere.

cassella · 2017-12-19T14:48:35Z

The first two points of @bradcray's counterproposal put me in mind of
C++. If you move the optional super.init() call earlier, before
the opening { of the initializer, even more so. (I presume the
proposal would allow for a sibling this.init() call there
instead?) And the effective type of the object changing as it
progresses is also C++esque.

Before the { is also where they put field initialization,
particularly for const and ref fields. Though they don't have the
ability to use loops and local variables to compute those values.

C++'s approach to the object's type changing through its construction
is that in the body of a class C constructor, the object is a
C. It's on the writer of the constructor to call only methods
that can cope with the object in whatever partially-constructed state
it may be in.

Unrelated to anything thus far, I was wondering if there'd be any
mileage in giving explicit initialization its own syntax, e.g.
x := 27? Then in cases where initialization vs. assignment is
important, the programmer can specify them explicitly without having
to manage phase1 or phase2. And in cases where it's not important,
the programmer doesn't need to think about it.

(I'm imagining/hoping that the compiler would be free to substitute an
initialization if the first use of a field is written as an
assignment. Subject to consideration of side effects, etc. Then the
:= syntax might only be a way to assert that initialization would
be happening anyway.)

lydia-duncan · 2017-12-19T21:40:15Z

I suspect Brad's position on shifting to only allowing the initialization of parent fields first is more a simplification than a strong objection to allowing both strategies, but just in case I wanted to reiterate that I think allowing both orders and letting the user pick is the right call. I worry that switching from one strategy to the other will just lead to churn again later down the road, and see supporting both as a way to avoid that potential for churn (and as friendlier to the user).

I'm concerned that altering virtual/dynamic dispatch throughout the course of the initializer would be confusing for users, potentially dangerous, and difficult to implement correctly. I feel similarly worried about allowing method calls during Phase 1, though I recognize that we have some code that seems to desire it and that the virtual/dynamic dispatch proposal is an attempt to allow it. I'm not sure I have a good alternate proposal (and there is a part of me that wonders if the desire for this feature is the old constructors implementation hurting our forward progress rather than in keeping with our stated goal of following a more principled approach).

I am intrigued by Michael's Option 3 proposal, but would likely need to muse/discuss it more before feeling confident in choosing between it and Option 2. Michael asked a very good driving question in conversation today, which seemed important to capture: "Do we tend to want the same Phase 2 for all initializers on a type, or do we tend to want a different Phase 2 per initializer?"

I think I'm otherwise in agreement with what has been discussed so far.

lydia-duncan · 2017-12-19T21:40:20Z

@cassella - apologies, but I would prefer to keep this thread on initializers and inheritance, rather than initializers in general. We did consider an alternate syntax for initialization in our original discussions, but chose to forgo it. If you feel strongly that this should be revisited now, would you mind opening another issue?

cassella · 2017-12-20T03:40:35Z

Sorry about that. I don't feel strongly about it. I can delete my comments here if it would help keep this issue focused.

bradcray · 2017-12-20T06:42:59Z

I now have caught up on this thread and was excited to see that Michael's response also referred to using an initialize() replacement to capture doing calls from parent object creation to child method calls. I also like that it obviates the need for a fieldsComplete() marker as in my most recent proposal (if I'm understanding it correctly and we're not losing anything). As far as I can tell, any code that I would've put after fieldsComplete() in that proposal could now come at the start of Michael's finalize() routine.

I'm not crazy about the name finalize() but I think I like the concept.

Responding to some of Lydia's comments:

I suspect Brad's position on shifting to only allowing the initialization of parent fields first is more a simplification than a strong objection to allowing both strategies, but just in case I wanted to reiterate that I think allowing both orders and letting the user pick is the right call.

I think calling it a simplification is correct, but I might object to supporting both strategies in the name of simplicity. I feel pretty confident that the module code I've converted (and am working on converting) does not need the Swift-style "init child fields first" approach and also find it counterintuitive (since I think of child classes as specializing parent classes, it seems only natural that they would establish their unique aspects second). So I think switching between "whose fields are initialized first?" might be overkill. That said, one way to get it might be to permit the super.init() call to appear anywhere within an init() routine in Michael's proposal (whereas mine required it to be the first line). Since, in Michael's proposal, init() no longer has a phase 1 and phase 2, this means it could be placed wherever without needing to separate the phases. That said, is the ability to initialize a child's fields before a parent's considered a strength of Swift's, or was it just a tactic used in order to get the phase 1 vs. 2 semantics and dynamic dispatch from parent initializers to children?

I worry that switching from one strategy to the other will just lead to churn again later down the road, and see supporting both as a way to avoid that potential for churn (and as friendlier to the user).

I don't for the reason I alluded to above: It seems hard for me to imagine a case in which an initializer author would need to require that a child's fields were initialized before a parent's.

I'm concerned that altering virtual/dynamic dispatch throughout the course of the initializer would be confusing for users, potentially dangerous, and difficult to implement correctly.

I definitely don't agree with the latter two. The implementation seems trivial (during the phase 1 to phase 2 transition for a class, set its CID to that initializer's class). I don't think it's dangerous (the object is a valid instance of that class at that point). I concede that it may be confusing, but frankly, I don't think it would be all that confusing or surprising to someone who was already creating a class hierarchy with inherited initializers...

Michael asked a very good driving question in conversation today, which seemed important to capture: "Do we tend to want the same Phase 2 for all initializers on a type, or do we tend to want a different Phase 2 per initializer?"

That's an interesting question... I think I have been writing different phase 2 code for different initializers on a given type, but I think it's typically been due to restrictions as to what can be expressed in phase 1 (or maybe philosophical thoughts about what I think should be in phase 1?). If this turns out to be a problem (which seems... likely), I think my preference would be to go with my previous proposal plus a postInit / finalize concept like I alluded to last night / Michael did in his option 3 for the sake of getting child-class dispatch plus the ability to leverage the compiler-provided initializer.

I should also add that I'm curious whether anyone has a more compelling / realistically-oriented example of a parent class wanting to call a child method during initialization than the simple one above?

mppf · 2017-12-20T16:15:19Z

Re this question:

"Do we tend to want the same Phase 2 for all initializers on a type, or do we tend to want a different Phase 2 per initializer?"

I don't know the answer to the question, but we could work with either answer in Strategy Two or in an adjusted Strategy Three.

If the answer is "tend to want the same", Strategy Three (the separate finalize() method) naturally does that. Strategy Two can do it as well but you'd write a new method e.g. setup() and call it from each finalize block.

If the answer is "tend to want different", Strategy Two does it naturally with the finalize block per initializer, but Strategy Three can be adapted to handle it as well. If we wanted Strategy Three to support such an idea, we might make a rule about which finalize(...) method is called in the event there are several with different signatures - e.g. we try to resolve first one with the same arguments that init had, and if that didn't work, try the no-arguments version. (Or we could even consider always insisting that a finalize(...) be available with the same argument signature as the init).
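
The lookup rule floated here (try a `finalize` that takes the same arguments as `init`, otherwise fall back to the zero-argument one) can be approximated in Python. This is only a sketch under assumptions: Python has no overloading, so each class below defines a single `finalize`, and the hypothetical `construct` driver uses `inspect.signature(...).bind(...)` to test whether the init arguments fit it.

```python
import inspect


def construct(cls, *args):
    """Hypothetical driver standing in for the compiler-inserted calls."""
    obj = cls(*args)
    finalize = obj.finalize
    try:
        # First choice: a finalize that accepts the same arguments as init.
        inspect.signature(finalize).bind(*args)
    except TypeError:
        finalize()            # fallback: the zero-argument version
    else:
        finalize(*args)
    return obj


class Scaled:
    def __init__(self, factor):
        self.factor = factor
        self.note = ""

    def finalize(self, factor):
        self.note = "finalized with factor {}".format(factor)


class Plain:
    def __init__(self, factor):
        self.factor = factor
        self.note = ""

    def finalize(self):
        self.note = "finalized without arguments"


scaled = construct(Scaled, 3)
plain = construct(Plain, 3)
```

With this rule a type that wants one shared Phase 2 writes only the zero-argument `finalize`, while a type that wants per-initializer behavior supplies a `finalize` matching the initializer's arguments.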

In any event I don't think we need 3 phases.

mppf · 2017-12-21T17:27:51Z

Just a note, C# calls what we'd call a deinitializer a finalizer...

bradcray · 2018-01-11T01:01:52Z

Earlier in this issue, Lydia asked whether a child initializer should be able to refer to parent const/var fields in phase 1 of its initializer and I essentially said "I don't care about that here/now." But having stewed on it a bit longer, I think I actually do. Here's a motivating example:

class C {
  var D = {1..n};
  var A: [D] real;
}

class C2 : C {
  var B: [D] string;
}

It feels very natural to me that a subclass should be able to declare arrays over a parent class's domain (or a domain in terms of the parent class's integer field or ...) and yet I believe that, at present, C2's phase 1 initializer wouldn't be able to establish fields like 'B' because C had not been initialized yet.
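
The dependency described here can be reproduced in a few lines of Python. This is a sketch under assumptions: `range(1, n + 1)` stands in for the domain `{1..n}`, plain lists stand in for arrays, and `C2ParentFirst` and `C2ChildFirst` are hypothetical variants of `C2` that differ only in when the parent initializer runs.

```python
class C:
    def __init__(self, n):
        self.D = range(1, n + 1)            # stands in for the domain {1..n}
        self.A = [0.0 for _ in self.D]


class C2ParentFirst(C):
    def __init__(self, n):
        super().__init__(n)                 # parent fields exist from here on
        self.B = ["" for _ in self.D]


class C2ChildFirst(C):
    def __init__(self, n):
        self.B = ["" for _ in self.D]       # D has not been set yet
        super().__init__(n)


ok = C2ParentFirst(4)

try:
    C2ChildFirst(4)
    child_first_error = None
except AttributeError as err:
    child_first_error = err                 # the child ran before D existed
```

The child-first variant fails at run time in this model because `D` does not exist yet, which mirrors why a child's phase 1 cannot size `B` from the parent's domain under the child-first order.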

The proposal that Mike and I are working on ought to address this I believe. Just wanted to correct myself and note that it isn't simply the type/param/compile-time fields that seem likely to run afoul of our current "initialize child phase 1" first approach... other dependent types/values seem like they might as well.

bradcray · 2018-01-11T01:16:18Z

Completely unrelated to my previous note: The Collections modules (DistributedBag, DistributedQueue) also run into the original challenge of having subclasses who want to refer to a parent class's type field in their field declarations. Their hierarchy is also much simpler than domain maps, and so would be a better case to study for any new proposal.

mppf · 2018-01-11T16:39:37Z

@bradcray - it's possible to construct an example like your latest C2/C that uses a runtime type instead, which makes it even clearer to me that we need a solution to the ordering issue that works for both type and value fields. Let me know if you'd like me to create a full example along these lines.

bradcray · 2018-01-18T20:48:12Z

Let me know if you'd like me to create a full example along these lines.

I feel like we've got plenty of examples to motivate an "initialize parent first" approach, but if you want to supply an additional case, I'll throw it into the mix. I'm not guessing what you're alluding to (and would've thought that C2.B above did have a runtime type).

mppf · 2018-01-19T14:17:55Z

@bradcray - here's the example I'm thinking of:

class ComputationState {
   var n:int;
   var StateArray:[1..n] real;
}
class ExtendedComputationState : ComputationState {
  var ExtendedStateArray:StateArray.type;
}

Of course, we could construct examples where the runtime type is stored in a type field, too:

class ComputationState {
   type StateArrayType;
   var StateArray:StateArrayType;
}
class ExtendedComputationState : ComputationState {
  var ExtendedStateArray:StateArrayType;
}

Here not only are there variables with runtime types, but the runtime type of the parent class field is used to initialize the child class field. I think this is strong evidence that we need to initialize parent fields first (at least as long as such patterns with runtime types are possible).

bradcray · 2018-01-23T23:18:30Z

Closing this issue, as I believe it has now been superseded by #8283.

bradcray added area: Language type: Design area: Compiler labels Dec 13, 2017

bradcray mentioned this issue Jan 5, 2018

Should copy initializers be generated by default, even when other explicit initializers are present? #8065

Closed

bradcray mentioned this issue Jan 23, 2018

Agree on core of "new initializers" approach #8283

Closed

2 tasks

bradcray closed this as completed Jan 23, 2018
