How do methods only from the type definition point interact with shadowing? #19352

mppf · 2022-03-04T14:46:45Z

For https://github.com/Cray/chapel-private/issues/2690

This issue is asking a question about how the rule to find visible functions from the method's definition point interacts with shadowing. #16415 summarizes the current design which was implemented in PR #16208. PR #16976 implemented a related feature where use SomeModule.someType or import SomeModule.someType will bring in methods defined on someType from that module.

For the most part, the current behavior seems to be similar to the type's definition point being a last place to look if nothing else is found. In other words, methods available locally or via a use statement will always shadow methods available only from the type's definition point. I do not view the current situation as an obvious combination of other ideas but it is something we could keep and then document.

The remainder of this issue discusses a few specific examples. In these examples, we will assume this is the start of the program:

module Library {
  record rec {
    proc method() { writeln("Library's rec.method()"); }
  }
}


module LibraryPlus {
  import Library;
  import Library.rec;

  proc rec.method() { writeln("LibraryPlus's rec.method()"); }
}

Case 1: Can Import of a Type Cause Shadowing?

First, even after PR #19306, the import SomeModule.someType does not seem to shadow at all. Arguably it should, but maybe an ambiguity error is bettor for hijacking scenarios.

// Case 1
module Program {
  import Library;
  proc main() {
    import LibraryPlus.rec; // does this make LibraryPlus.rec.method "closer" ?
    var r = new Library.rec();
    r.method(); // currently: ambiguity error
  }
}

Case 2: Should Type's Definition Point be Shadowed?

How should methods brought in from the type's definition point interact with a method defined locally? Today the local method shadows the ones from the definition point.

// Case 2a
module Program {
  import Library;
  import Library.rec; // Should this bring in the methods as if defined here?

  proc rec.method() { writeln("Program's rec.method()"); }

  proc main() {
    var r = new Library.rec();
    r.method(); // currently outputs: Program's rec.method()
  }
}

// Case 2b
module Program {
  import Library; // Should this bring in the methods
                  // on types defined in Library as if defined here?

  proc (Library.rec).method() { writeln("Program's rec.method()"); }

  proc main() {
    var r = new Library.rec();
    r.method(); // currently outputs: Program's rec.method()
  }
}

Case 3: Should a Use Statement Cause Shadowing?

Should a method available via a use statement always shadow methods available only from the type's definition point? Arguably, it would be better for this to be an ambiguity error, to prevent a hijacking scenario.

module Program {
  public import Library;
  public use LibraryPlus; // should LibraryPlus.rec.method now shadow Library.rec.method?

  proc main() { 
    var r = new Library.rec();
    r.method(); // currently:
                //  without PR #19306, ambiguity error
                //  with PR #19306, outputs LibraryPlus's rec.method()
  }
}

Interaction with Adding a Method / Breaking vs Non-Breaking and Hijacking

If Library didn't have method but Program did have one, then when Library adds one it will be shadowed by the Program version (non-breaking). That seems OK.

However, when Program depends on both Library and LibraryPlus and these add method in one order or another, I don't see how it is possible to keep that method addition as a non-breaking change to Program.

If Library didn't have method but LibraryPlus added it, then when Library adds one it will currently be an ambiguity error (Case 1) (breaking) or will be shadowed by the LibraryPlus version (Case 3) (non-breaking). If we changed it to be shadowed by the Library version, it would be worse (breaking / hijacking).

If Library has method but LibraryPlus didn't, and then LibraryPlus adds one, it will currently be an ambiguity error (Case 1) (breaking) or it will will be shadowed by the LibraryPlus version (Case 3) (breaking / hijacking).

So, arguably, it would be better for Case 1 and Case 3 to be ambiguity errors, since if we prefer one or the other for shadowing, that introduces the possibility of surprising behavior changes when a method is added. At least the ambiguity errors would point out that there is a potential problem.

However, making these into ambiguity errors seems to make the way use/import find methods for a type very different from how they would find other symbols. I think the result would be counter-intuitive.

The text was updated successfully, but these errors were encountered:

mppf · 2022-03-04T17:10:53Z

Relevant spec sections are https://chapel-lang.org/docs/latest/language/spec/methods.html#method-calls and https://chapel-lang.org/docs/language/spec/procedures.html#determining-visible-functions . The determining visible functions section should be updated to point to the other (or have a sentence). Neither say anything about shadowing.

lydia-duncan · 2022-03-04T22:03:59Z

For the most part, the current behavior seems to be similar to the type's definition point being a last place to look if nothing else is found.

I'm not sure I agree with that, but it might be that I did something which caused it. I think if a type defines a primary or secondary method, that method should always be found (discussions of privacy aside). That doesn't say anything about its precedence relative to other symbols.

Personally, I think things provided by the definition point of the type should take precedence when a clear equivalence is known. Doing otherwise would enable hijacking and go against the intentions of the creator of the type. The one place that is somewhat questionable to me is references within a method on the type that could correspond to a method name or could correspond to a function name, e.g.

proc symbolB() { ... }

proc Foo.methodA(...) {
  symbolB(); // Foo has a method named symbolB, which should be chosen?
  // The user should be able to write something to make a choice, though, e.g. `Module.symbolB` or `this.symbolB`
}

First, even after PR #19306, the import SomeModule.someType does not seem to shadow at all. Arguably it should

I don't think it should - tertiary methods should be defined to be complementary to the primary and secondary methods, enabling them to substitute primary and secondary methods leads to hijacking.

I think we should consider shadowing with regards to the case of methods vs functions choices.

How should methods brought in from the type's definition point interact with a method defined locally? Today the local method shadows the ones from the definition point.

This does seem right to me, though. I think the writer generally knows what they are doing, so if they want to replace that method for their own use, that seems reasonable, even though it may go against what the author of the module with the type's definition wants.

However, when Program depends on both Library and LibraryPlus and these add method in one order or another, I don't see how it is possible to keep that method addition as a non-breaking change to Program.

I think that's okay. Program has taken on the risk by choosing to extend Library. If their functionality gets incorporated back into the main library, they probably want to know about it so they don't have to maintain their own copy any more.

If Library didn't have method but LibraryPlus added it, then when Library adds one it will currently be an ambiguity error (Case 1) (breaking) or will be shadowed by the LibraryPlus version (Case 3) (non-breaking). If we changed it to be shadowed by the Library version, it would be worse (breaking / hijacking).

But Library is the owner of the type. If they want to change it, that should take precedent over extensions. Future users of Library will start from the new contents as their base point - if the maintainer of LibraryPlus wants to provide something different, they should move to distinguishing it more and users should know they have to evaluate which version is a better fit for their program.

bradcray · 2022-03-05T00:44:50Z

In reviewing #19306 and particularly test/classes/diten/breakList.chpl, I found myself surprised that a tertiary method would be considered "closer" than a primary method (or vice-versa, though it didn't come up there), regardless of the use/import chains that got us there. As stated there, I think the reason is that I consider methods to be primarily associated with the this type on which they're invoked rather than by any chain of use/imports. Rather, I think of use/import as providing the set of primary/secondary/tertiary methods that are visible, but then I tend to think of them as all being equally visible for a given myExpr.method() call. And conversely, I think of the scoping in use/import as primarily involving standalone routines and variables.

On #19306, Michael suggested that we rely on this for compatibility functions, but that doesn't feel super-familiar to me in the method context, more in the standalone case. And even if we did rely on it for compatibility functions, it seems we could label the compatibility case as "last resort" so that it wouldn't trigger the ambiguity if the type provided the method directly?

I realize this is "new" and not something we've discussed before (that I can recall). I think it reflects (a) new viewpoints on my part after the recent primary/secondary/tertiary methods visibility decisions (i.e., the decision that they are visible through their types, not names) and (b) being surprised by that breakList.chpl case's behavior on #19306.

mppf · 2022-03-07T15:08:43Z

Responding to @bradcray -

Michael suggested that we rely on this for compatibility functions, but that doesn't feel super-familiar to me in the method context, more in the standalone case

Right I was not trying to say anything specific about methods specifically - just to say that we seem to have a pattern with "use statement within a function to control shadowing" (for non-methods).

I found myself surprised that a tertiary method would be considered "closer" than a primary method (or vice-versa, though it didn't come up there),

I can get behind an ambiguity error for Case 1 and Case 3 (for the philosophical reasons you describe above and also because of the hijacking scenarios I outlined). It sounds like your viewpoint is that Case 2 should also be an ambiguity error. Am I understanding correctly?

--- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

bradcray · 2022-03-09T00:32:57Z

Finally found my way back here...

It sounds like your viewpoint is that Case 2 should also be an ambiguity error. Am I understanding correctly?

Case 2 is interesting. I don't currently feel strongly about it. If we decided that local scope methods shadowed other methods brought in by use/import, I'd be OK with that I think. OTOH, if we made it an ambiguity, we could change our minds later without that being a breaking change.

mppf · 2022-03-17T20:53:08Z

In looking at an implementation of this in #19306 (comment) I am observing that simply making isMoreVisible return "neither is more visible" for methods means that the overloads sets error occurs in more cases. One thing is that the overloads sets error is described in terms of shadowing in https://chapel-lang.org/docs/technotes/overloadSets.html . An obvious approach is to have an idea that one method can be "more visible" than another method but just don't make use of this when disambiguating.

This is effectively Case 2 in chapel-lang#19352. --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

effectively Case 2 in chapel-lang#19352. --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

mppf · 2022-03-17T21:17:03Z

Besides the above about overload sets, the only other thing I learned by implementing the strategy of completely ignoring method visibility is that we needed to change some operator methods into non-method operators to keep tests working as intended. (Because non-method operators can be "more visible").

mppf · 2022-03-17T21:19:59Z

(The above comment is interesting because operators can be both method-based and not method-based, so we have to define how one of each interacts for the purposes of shadowing).

--- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

This is effectively Case 2 in chapel-lang#19352. --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

effectively Case 2 in chapel-lang#19352. --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

bradcray · 2022-03-17T22:52:50Z

the overloads sets error occurs in more cases.

I think you and I have expressed before that we've been surprised that we didn't trigger overload sets more often since enabling them. Are these new cases ones that feel like "Oh, yeah, of course that should be an overload set error, so this is better", or more like cases that feel unfortunate or like a step backwards in some way?

we needed to change some operator methods into non-method operators to keep tests working as intended

Are these tests that were specifically written to see how the shadowing behaves for operators, or are they cases that resemble real user codes and would want/need to make this sort of change to continue working?

mppf · 2022-03-17T23:15:42Z

the overloads sets error occurs in more cases.

I think you and I have expressed before that we've been surprised that we didn't trigger overload sets more often since enabling them. Are these new cases ones that feel like "Oh, yeah, of course that should be an overload set error, so this is better", or more like cases that feel unfortunate or like a step backwards in some way?

I'm not really sure. https://github.com/chapel-lang/chapel/blob/main/test/functions/operatorOverloads/operatorMethods/allOps/ioOperator.chpl is an example that would give an overload set error if the overload sets don't have an idea of a "closer" method. But that specific case could be written as a non-method, too.

we needed to change some operator methods into non-method operators to keep tests working as intended

Are these tests that were specifically written to see how the shadowing behaves for operators, or are they cases that resemble real user codes and would want/need to make this sort of change to continue working?

Usually they are tests of operators. An example is df1fe36.

mppf · 2022-03-25T20:00:48Z

Re the breakList test (see #19352 (comment) ) and the adjustment discussed above to make disambiguation ignore method shadowing but overload set checking be aware of "closer" (#19352 (comment)) -- I am finding that the breakList tests still prints out I Got you!.

Why?

First, disambiguation prefers the new overload proc LinkedList.append(x: int) because it is more specific than proc LinkedList.append(x: real) (since we are working with a new LinkedList(real);).

Second, overload checking correctly considers the new overload "more visible" - this pattern is summed up here:

  use LinkedLists;
  proc bar() {
    use LibGL; // defines a tertiary method LinkedList.append
    var li = new LinkedList(real);
    li.append(3); // uses LibGL.LinkedList.append because the 'use' had closer scope
  }

As a result there is no overload sets error.

I am currently thinking that

disambiguation not considering "more visible" for methods and
adjusting the breakList's .good file to include I got you and no overload set error

is OK for PR #19306.

However, we have to make some decision - if "more visible" is ignored during disambiguation for methods, should it also be ignored for overload set checking? (and if it is, how can we fix functions/operatorOverloads/operatorMethods/allOps/ioOperator.chpl ? is it OK to use a non-method operator or different types in that test?)

lydia-duncan · 2022-03-25T22:24:41Z

First, disambiguation prefers the new overload proc LinkedList.append(x: int) because it is more specific than proc LinkedList.append(x: real) (since we are working with a new LinkedList(real);).

Is this a typo? It seems like x: real would be more specific than x: int since the LinkedList is over real, but it looks like the overload is actually LinkedList.append(x: eltType) (which I think counts as generic, is that what you mean?)

lydia-duncan · 2022-03-25T22:28:49Z

I don't think this changes my position as stated in #19352 (comment) or the one I gave when talking about operators:

Michael: If there are both operator methods and operator functions, is there a preference between the two?
Me: I lean towards they should conflict rather than have a preference, since they are sort of intended to be interchangeable (except that operator methods are treated like primary and second methods, so can be found in places where a standalone would not be accessible)

mppf · 2022-03-28T14:06:59Z

First, disambiguation prefers the new overload proc LinkedList.append(x: int) because it is more specific than proc LinkedList.append(x: real) (since we are working with a new LinkedList(real);).

Is this a typo? It seems like x: real would be more specific than x: int since the LinkedList is over real, but it looks like the overload is actually LinkedList.append(x: eltType) (which I think counts as generic, is that what you mean?)

It's not a typo. The LinkedList has LinkedList.append(x: eltType) where eltType is real. The x argument in this function is not generic. (You can think of it as proc append(this: LinkedList, x: this.eltType) if that helps at all; the type of x is concrete but it is computed based on other arguments. This function is generic because the this receiver is generic; but that fact is irrelevant for this particular test).

So, the LinkedList library is providing LinkedList.append(x: real), effectively, since eltType is real in this case.

Meanwhile,

  use LinkedLists;
  proc bar() {
    use LibGL; // defines a tertiary method proc LinkedList.append(x: int)
    var li = new LinkedList(real);
    li.append(3); // uses LibGL.LinkedList.append because the 'use' had closer scope
  }

the LibGL tertiary method declares append with a formal x of type int
the call to append passes 3 which is an int
the usual LinkedList.append requires a conversion from 3:int to real so is less specific that passing 3 to an int formal.

mppf · 2022-03-28T14:15:00Z

@lydia-duncan

I don't think this changes my position as stated in #19352 (comment) or the one I gave when talking about operators:
...

Are you saying that, from the OP, Case 1 and Case 3 should be ambiguity errors but Case 2 should be allowed (and prefer the tertiary method)? Or something else?

From the comment you linked -

Personally, I think things provided by the definition point of the type should take precedence when a clear equivalence is known.

I do not know what this means. Are you trying to make a new proposal? Something other than "ambiguity" in Case 1 and Case 3 and the current behavior in Case 2? If you are making a new proposal, could you write an example to show what it is you are proposing?

--- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

This is effectively Case 2 in chapel-lang#19352. --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

effectively Case 2 in chapel-lang#19352. --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

More to do with deciding about overload set checking and method visibility, but this behavior makes sense with what is implemented now. See chapel-lang#19352 (comment) --- Signed-off-by: Michael Ferguson <mppf@users.noreply.github.com>

e-kayrakli · 2022-03-28T18:34:49Z

Offline Michael and I have discussed a bit about Case 2, which I think should be prohibited by issuing an ambiguity error.

A specific case where I have exploited we don't do that before was to redefine array accessors using tertiary methods to log different kinds of array accesses. I did that in the context of a unit test.

However, I think that's too much power for the user to have. And I am not convinced that this is a useful feature. Can this be used to change the behavior of an existing source by adding an innocent-looking module, for example? I think that can be the case, and that's a scary prospect.

The following gives an ambiguity error for example: (The compiler says "ambiguous call" which is good enough. But it puts the blame on the call site, listing two candidates. I think the blame should be on the definition of the secondary method.)

record R {
  proc foo() {
    writeln("primary");
  }
}

proc R.foo() {
  writeln("secondary");
}

var r = new R();

r.foo();

where a secondary method shadows a primary. I think this behavior is correct and similarly, we shouldn't allow tertiary methods to shadow primary/secondary ones.

lydia-duncan · 2022-03-28T18:38:07Z

It's not a typo. The LinkedList has LinkedList.append(x: eltType) where eltType is real. The x argument in this function is not generic. (You can think of it as proc append(this: LinkedList, x: this.eltType) if that helps at all; the type of x is concrete but it is computed based on other arguments. This function is generic because the this receiver is generic; but that fact is irrelevant for this particular test).

Ah, of course, thank you! Yeah, that makes sense.

Are you saying that, from the OP, Case 1 and Case 3 should be ambiguity errors but Case 2 should be allowed (and prefer the tertiary method)?

I'm torn - I think the writers of a tertiary method should be informed when that method is not usable because there is a primary/secondary method. However, I don't want a side module to break interactions with the main module where the type is defined because that seems like hijacking. At least, though, in the case of an ambiguity the user of both will know that they need to be careful what they bring in.

mppf · 2022-03-28T18:57:10Z

Well, at least making it an ambiguity should enable future language changes that make it compile and run, right? Meaning, making it an ambiguity is the design we should prefer if we have some uncertainty here?