Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear behavior when use statement brings in a module with the same name #11262

Closed
lydia-duncan opened this issue Oct 1, 2018 · 20 comments · Fixed by #13997
Closed

Unclear behavior when use statement brings in a module with the same name #11262

lydia-duncan opened this issue Oct 1, 2018 · 20 comments · Fixed by #13997

Comments

@lydia-duncan
Copy link
Member

Summary of Problem

When a nested module shared a name with a top-level module (or a module that is already in scope) and the nested module is brought into scope via a use statement, the contents of the nested module cannot be accessed via the module prefix, because the top-level module is searched instead. This seems confusing, so I think one of three things should happen here:

  1. The error message (or at least a warning) should reference the module prefix instead of the symbol we are trying to find within it.
  2. The error message (or at least a warning) should be on the use statement which brings in the module, saying that the module's symbols will not accessible via the module prefix due to a conflict with the module which is already in scope.
  3. Alternatively, the nested module should also be searched for the symbol instead of only the previously accessible module.

I've included a test where the module is defined within a module sharing the name at the top level and the use statement explicitly mentioned the module with the duplicate name. This situation will likely also occur when

  1. Both modules are brought into scope via a use statement
  2. Either of the two modules is brought into scope by an unlimited use statement (i.e. one without an except or only list)

Steps to Reproduce

Source Code:

module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}

use M only M;
M.whatev();

Compile command:
chpl innerModuleSharesName-topLevel.chpl

Associated Future Test(s):
test/visibility/only/innerModuleSharesName-topLevel.chpl #11226

Configuration Information

  • Output of chpl --version: chapel 1.19.0 (pre-release)
@bradcray
Copy link
Member

bradcray commented Oct 1, 2018

I don't think we should take approach 3 of having M.whatev() also result in searching the nested module. We should resolve the M to a single module first and only look in that module or fail to resolve M at all.

If we did stick with resolving to the top-level module (as we do today), I could imagine adding module line numbers to the error message might help, but am not sure this is likely to come up often enough to be worth the effort.

But I think it might be more consistent in your example if we were to issue an error like the ones we do for the following programs:

Program 1: Two uses result in the same symbol being available twice:

module M {
  var a: int = 42;

  module M {
    var a: int = 23;
  }
}

use M;
use M.M;
writeln(a);
testit.chpl:2: error: symbol a is multiply defined
testit.chpl:5: note: also defined here

Program two: Two modules at the same scope have the same name

Or this one:

module M {
}

module M {
}
testit1b.chpl:1: error: 'M' has multiple definitions, redefined at:
  testit1b.chpl:4

I'm not sure whether these errors would come about as a result of the use M only M line or the subsequent attempt to resolve M (I'd guess the latter).

@lydia-duncan
Copy link
Member Author

Yeah, I would guess the latter as well.

@bradcray

This comment has been minimized.

@lydia-duncan

This comment has been minimized.

@BryantLam
Copy link

BryantLam commented Aug 30, 2019

Would it be crazy to change the behavior of use to only allow unqualified access? This proposed use behavior would only be useful the same time that import is implemented because it breaks idioms like use M only / use M except * since they become no-ops.

@lydia-duncan
Copy link
Member Author

I wouldn't call it crazy, but it might be annoying to have to type both an import and a use of a module if you want both behaviors. Note that Haskell will always allow both qualified and unqualified access when unqualified access is allowed (to rephrase, its import statement always enables at least qualified access)

@BryantLam
Copy link

BryantLam commented Aug 30, 2019

but it might be annoying to have to type both an import and a use of a module if you want both behaviors.

Certainly, though I would say that the intent becomes much more clear because you want two behaviors, you need to type two statements. I don't think it'd be common that users would want both behaviors. Anyway, that suggestion is for another issue but it would cleanly disambiguate this issue and #13925.

@mppf
Copy link
Member

mppf commented Aug 30, 2019

Does it work with

module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}

use M only;
M.M.whatev();

?

@BryantLam
Copy link

BryantLam commented Aug 30, 2019

My (to be another issue) proposal idea is that use statements currently do unqualified + qualified access, but I would remove the qualified-access behavior. The code would have to be rewritten as:

use M only; // This would be a no-op; Make it an error.
            // Or treat as special case (but why not just `import` then).
use M;
M.whatev();
M.M.whatev(); // Illegal. outer `M` not available via `use`. Use `import`.

which is why this behavior wouldn't be useful unless import statements (for qualified access) were also provided.

@mppf
Copy link
Member

mppf commented Aug 30, 2019

@BryantLam - I think you are saying the same as what I proposed here: #13119 (comment) - is that right? AFAIK this doesn't have its own issue yet, but comes up in #13831.

@BryantLam
Copy link

BryantLam commented Aug 30, 2019

Current scope as well, but yes, the principle sounds the same. Whoops. Misread what you meant by "other modules". It is the same. 👍

@bradcray
Copy link
Member

bradcray commented Sep 4, 2019

I thought PR #13930 would improve the situation for this test, and it does change it (arguably slightly for the better), but it still doesn't work as you'd expected.

After some more thinking and talking with Lydia, I think the current behavior of the compiler is correct for this code. Showing the code again:

module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}

use M only M;
M.whatev();

What threw me is that the use statement is immaterial here due to the fact that the M.whatev(); line is in the same local scope as the declaration of module M. Recall (as I failed to) that local variables always win over those made visible by a use statement (in order to prevent a use from hijacking a seemingly obvious reference to a local variable), and so far we haven't prevented code from referring directly to in-use submodules as issue #13536 discusses. So in resolving M.whatev();, the compiler resolves M to the topmost M module, not finding a whatev() symbol within it, and generating the expected error:

testit.chpl:11: error: Symbol 'whatev' undeclared in module 'M'

The reason I was expecting a different behavior is that I was mentally thinking of the program more like this:

module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}

{
  use M only M;
  M.whatev();
}

(that is, the use is in some other code not directly connected to the scope where module M was declared). In this case, the M in M.whatev() would refer to M.M (symbols made available by a use statement win out over the name of the module being used itself), the call resolves, and "whee" is printed out.

So I think that this future could be retired given the current definition and implementation of the language. That said, the issue itself could remain open where perhaps it should be requesting an error message that refers not just to module M but also to where M was declared as suggested above:

If we did stick with resolving to the top-level module (as we do today), I could imagine adding module line numbers to the error message might help, but am not sure this is likely to come up often enough to be worth the effort.

But, as mentioned there, I'm not sure the effort is worth it for this specific case, and that when we encounter users who write code like this, we should have their friends and mentors slap them rather than complaining to us. (So I'd probably just close this issue as well).

@bradcray
Copy link
Member

bradcray commented Sep 4, 2019

Michael asked about this case:

module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}

use M only;
M.M.whatev();

which works, but for similar reasons: the use doesn't really have any impact on the execution. Putting the use into its own scope still works (as expected though). I'm working on a PR that gathers together a bunch of the variations being discussed here to serve as indicators of current behavior and to be alerted if/when things change.

bradcray pushed a commit to bradcray/chapel that referenced this issue Sep 4, 2019
I believe innerModuleSharesName-topLevel.chpl is working as expected
so should be retired, which I do here.  This PR also adds a number
of variations on it that came up in discussions around issue chapel-lang#11262.
@bradcray
Copy link
Member

bradcray commented Sep 4, 2019

PR #13997 adds the aforementioned variants (and retires the future if nobody disagrees).

bradcray added a commit that referenced this issue Sep 5, 2019
Capture module visibility use case puzzle codes and retire the original

[reviewed by @lydia-duncan]

I believe innerModuleSharesName-topLevel.chpl is working as expected
so should be retired, which I do here. This PR also adds a number
of variations on it that came up in discussions around issue #11262.

Resolves #11262.
@BryantLam
Copy link

That said, the issue itself could remain open where perhaps it should be requesting an error message

I understand your explanation. What are the implications for having the compiler treat the original code as a multiple symbol error on M instead of allowing M to be shadowed?

@bradcray
Copy link
Member

bradcray commented Sep 5, 2019

What are the implications for having the compiler treat the original code as a multiple symbol error on M instead of allowing M to be shadowed?

Are you saying that the conflicting multiple symbols would be due to the first module M declaration conflicting with the M made available by use M only M? (as opposed to saying that the inner module M conflicts with the other, or maybe something else?) It seems like that would be pretty impactful in the sense that anytime you used a module that happened to have symbols that shared names with your local ones (perhaps ones you were ignorant of like Math.e), you'd get an error rather than having your local ones "win" over ones that you either weren't interested in or knew wouldn't be a problem because your local ones were more local.

I think part of the reason I have trouble worrying too much about this issue's specific case is that it seems pretty heavily related to the fact that the author of M created a sub-module with the same name. I.e., if we switched to different module names, I don't think there's any question that the right thing is happening:

module M {
  module N {
    proc whatev() {
      writeln("whee");
    }
  }
}

use M only N;
M.whatev();  // obviously an error; M doesn't define `whatev()`
N.whatev();  // obviously OK; N is visible and does
whatev();     // obviously an error; N hasn't been `use`d

As a result, it seems like we've spent a lot of effort discussing how to make poorly written code less confusing rather than focusing on problems that might come up in more realistic situations.

Or do you think that there are realistic patterns that are similar to the original example which suggest the current behavior is wrong? (for use as currently defined and ignoring possible variations that we're discussing on other issues).

@BryantLam

This comment has been minimized.

@BryantLam
Copy link

BryantLam commented Sep 5, 2019

I think the practical choice of using shadowing for symbol conflicts introduced via use statements is an unfortunate confluence of design decisions around:

  1. use statements having unqualified access
  2. not having import statements for only qualified access
  3. use statements being transitive until private use was added and having lived in the previous world for so long
  4. symbols of different kinds, especially module symbols, not being treated the same as each other
  5. no symbol renaming

I'd like my mental model to think of use statements as bringing a symbol into the current scope knowing that it cannot conflict. Shadowing does break that model when it comes to symbol conflicts, e.g., Math.e. #13118. Demonstrates (3).

For example, here are some current behaviors:

// 1.chpl
module M {
  proc fn() { writeln("M"); }
}

proc fn() { writeln("g"); }

{
  use M;
  fn();
}
9: error: ambiguous call 'fn()'
2: note: candidates are: fn()
5: note:                 fn()

Demonstrates (4). Strictly to your interpretation, why is this ambiguous? Shouldn't it only see M.fn?

// 2.chpl
module M {
  module N {
    proc fn() { writeln("M"); }
  }
}

module N {
  proc fn() { writeln("g"); }
}

{
  use N only;
  use M;
  N.fn();
}
7: error: symbol N is multiply defined
2: note: also defined here

This error makes sense because different N symbols are being brought into scope.

// 3.chpl
module M {
  var x = 100;
}

var x = 42;

{
  use M;
  writeln(x);
}

This prints "100" and is consistent with the shadowing rules.

// 4.chpl
module M {
  module N {
    var x = 100;
  }
}

module N {
  var x = 42;
}

{
  use N only; // It doesn't matter if first or second.
  use M;
  writeln(N.x); // Always prints 42.
}

This prints "42". Why does it not error as I would expect on the multiple-defined N?

(chapel/master Sep 2, 2019)

While some changes have hit master, if you run these examples on 1.19.0 in TIO, you get different answers for two of them.

Though I recognize without import statements, the shadowing behavior cannot be readily changed.

It's also reasonable to say that the two cases above that don't behave right are just bugs, but I think it points to the fact that shadowing is really hard to debug.

Edit: To clarify, codes 1 and 3 should shadow whereas codes 2 and 4 should not. The difference is the use statements in scope. Or maybe I'm just confused. :)

@bradcray
Copy link
Member

bradcray commented Sep 5, 2019

symbols of different kinds, especially module symbols, not being treated the same as each other

When first reading this, I was going to say that I thought Chapel had traditionally treated symbols very similarly to one another, including module symbols—and that #13930 represented one of the recent major ways in which we started treating module symbols more differently. Your example points out a case I was brushing past in that thought, though, which is that functions are treated differently in order to support overloading (i.e., it's legal to have multiple functions with the same name and to select between them based on the arguments provided; but it's not legal to have multiple variables, modules, etc. of the same name because those symbol kinds don't support overloading).

Looking at the specific example you provided:

// 1.chpl
module M {
  proc fn() { writeln("M"); }
}

proc fn() { writeln("g"); }

{
  use M;
  fn();
}

I don't feel confident that the current behavior is best / ideal if users wanted to push back against them. For example, we could potentially amend the current rules to permit functions to shadow one another if their signatures match (by some definition of match... these ones obviously do, but decisions would need to be made about argument names, default arguments, intents and the like. Ideally these would be similar to the rules used for override methods. I'm not enough of a language expert to know whether this is established practice in other languages or whether we'd be carving out new ground here). There's been a discussion related to this on the Arkouda project recently, though I don't feel confident that I should link to it here.

I'd like my mental model to think of use statements as bringing a symbol into the current scope knowing that it cannot conflict.

The traditional definition has been more like: "use statements make symbols available as though they were in a scope just outside of the current lexical scope but closer in that its parent scope (and the module name that was used is in a scope just above that)." To illustrate that, this code:

module M {
  var x = ...;
}
module N {
  var y = ...;
  {
    use M;
    var z = ...;
  }
}

would cause N to act something like this:

module N {
  var y = ...;
  {
    ...module M... is available from here
    {
       ...var [M.]x... is available from here
      {
        var z = ...;
      }
    }
  }
}

I think the historical rationale for this interpretation has been due to uses having the characteristics of (a) making everything rather than nothing available by default and (b) in a no-need-to-qualify manner (these choices reflecting that those of us designing the language had lived our lives in more of a "programming in the small" rather than "in the large" style). So this interpretation of use avoided the ugly case of having a use accidentally shadow something that visually was clearly in the local scope.

Cases 2 and 4 behave the same on master now (e.g., chpl version 1.20.0 pre-release (faa418f)) and resolve to the "M" / "42" cases). The reason for this is due to the interpretation of use just above. The use M and use N only; statements drop their contents into the scope just outside of the use (putting M.N and [nothing] into the scope respectively); and they drop their own names into the scope just outside of that (putting top-level M and N there). Sketching that out:

use M;
use N only;
...N.x...
...N.fn()...

results in:

{ // scope making module symbols themselves available
  ...module M...
  ...module N...
  { // scope making module contents available
    ...module [M.]N...
    ...<nothing from N>...
    {  // scope in which the `use` statements appeared
      ...N.x...
      ...N.fn()...
    }
  }
}

So then when the compiler resolves N.whatever, it sees module M.N as being closer to scope than top-level N.

Interestingly, I tried dumping both sets of symbols into the same scope as part of #13930, but this resulted in problems when a module and its symbols shared the same name, which is what made me ask how important it was in #13925. It seemed that this was pretty important/standard practice in other languages which is why I reverted to the traditional interpretation above.

With respect to this point:

  1. no symbol renaming

Chapel does have symbol renaming now for everything other than the module whose contents the use is making available, right? Or am I misunderstanding what you mean here?

Specifically, note that if a case like 2 and 4 came up in real life, symbol renaming could be used to help avoid the confusion:

module M {
  module N {
    var x = 100;
  }
}

module N {
  var x = 42;
}

{
  use N only;                          
  use M only N as N2;
  writeln(N.x); // Always prints 42.                                            
  writeln(N2.x);
}

Of course, it would be better if use could also rename the used module itself as you proposed in #10799 (and which I regret not getting implemented in this release cycle) so that both N symbols could be renamed rather than just one of them.

And of course, for programming in the large cases for which use is arguably inappropriately permissive, the expectation is that the import statement design will be much better suited due to their restrictive-by-default nature.

bradcray added a commit to bradcray/chapel that referenced this issue Sep 6, 2019
Bryant pointed out that the behavior of these tests has varied
over the past few releases/months which motivated me to capture
them to make sure further variation doesn't happen unexpectedly
(I believe the current behavior is correct, or at least, as
currently intended).
@bradcray
Copy link
Member

bradcray commented Sep 6, 2019

(Since these patterns are interesting, have recently varied in behavior as you've noted, and are behaving as I believe we intend them to according to present rules, I captured them in PR #14018 to prevent further changes from occurring unintentionally).

bradcray added a commit that referenced this issue Sep 6, 2019
Capture some more sample programs from issue #11262

[trivial, not reviewed]

Bryant pointed out that the behavior of these tests has varied
over the past few releases/months which motivated me to capture
them to make sure further variation doesn't happen unexpectedly
(I believe the current behavior is correct, or at least, as
currently intended).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants