Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should modules be able to define symbols that share their name? #13925

Closed
bradcray opened this issue Aug 29, 2019 · 11 comments · Fixed by #13930
Closed

Should modules be able to define symbols that share their name? #13925

bradcray opened this issue Aug 29, 2019 · 11 comments · Fixed by #13930

Comments

@bradcray
Copy link
Member

This issue asks whether it should be illegal to declare a module-level symbol with the same identifier as the module itself, such as:

module M {
  var M...;  // or proc M() ... or any other declaration of M
  var N...;
}

Doing so introduces interesting implementation challenges as I work on the change proposed in issue #13523 to not permit references to top-level modules unless they have been used. These challenges may all be surmountable with effort... but that effort is definitely slowing me down and making me wonder whether this pattern is sufficiently valuable to be worth the effort (whereas creating an error for it would be much simpler and let me rewrite code that relies on this pattern rather than trying to keep it working).

The pattern is also a bit weird in that, for example, if the only way for me to start referring to top-level module M as proposed in #13523 is via use M; then we have to decide whether an unqualified reference to M refers to the module or the variable (or procedure or whatever). My guess there is that it should be the module. That said, it also seems a little asymmetrical that I can refer to variable N directly but not its sibling variable M without saying M.M.

All that said, I'm completely willing to keep this behavior working if people think it's valuable (or simply weird / arbitrary to disallow it).

@bradcray
Copy link
Member Author

@mppf, @lydia-duncan, @ben-albrecht, @BryantLam: Any thoughts here? (singling you out based on your participation on issue #13523).

@bradcray
Copy link
Member Author

Here's another interesting case (abstracted from UnitTest.chpl):

module M {
  use Sub;

  ...Sub...  // Check this out

  module Sub {
    class Sub {
    }
  }
}

Should the line with the comment refer to module Sub (which was used in that scope, so available) or class Sub (which was made available via the use). Or should it just be illegal (too confusing to be worth putting effort into supporting).

(Note that this is also a potential motivating case for supporting a public use of a private module as discussed on issue #13528... I.e., the author probably didn't want the user to be able to refer to module Sub?).

@ben-albrecht
Copy link
Member

The pattern is also a bit weird in that, for example, if the only way for me to start referring to top-level module M as proposed in #13523 is via use M; then we have to decide whether an unqualified reference to M refers to the module or the variable (or procedure or whatever). My guess there is that it should be the module. That said, it also seems a little asymmetrical that I can refer to variable N directly but not its sibling variable M without saying M.M.

IIRC, there were some other pitfalls I encountered using this pattern, which has led me to stop using it. I'll try to reproduce those and mention them here when time permits.

All that said, I'm completely willing to keep this behavior working if people think it's valuable (or simply weird / arbitrary to disallow it).

I have personally found this behavior valuable in other languages, but tend to avoid using it in Chapel due to the pitfalls it introduces. If we don't expect the pitfalls around using shared module/symbol names to improve, then it would make sense to forbid it altogether.

Additionally, it may make sense to be more restrictive now, given the aim for Chapel 2.0 release candidate, and maybe relax this rule later if we can make it safer to use.

Relatedly, I haven't found a great naming scheme to deal with this. It is usually suggested to make the module name plural, but that isn't always applicable. For example, if the name has an odd plural form (e.g. Octopus -> Octopi, or DNA -> DNAs), or if the module only contains a single symbol (like a class definition) such that the plural name doesn't make sense.

@ben-albrecht
Copy link
Member

Should the line with the comment refer to module Sub (which was used in that scope, so available) or class Sub (which was made available via the use).

Not sure if this is the right interpretation, but if I think about use Sub as from Sub import *, then I'd expect Sub to refer to the class Sub. If use Sub only had been used, then I would expect Sub to refer to the module Sub and Sub.Sub to refer to the class.

@bradcray
Copy link
Member Author

Issue #11262 was an issue I meant to link to above that relates to this question.

@bradcray
Copy link
Member Author

Thanks for the notes @ben-albrecht: Knowing that you use and like this pattern in other languages is good incentive to push through the challenges. If others have contrary views, I'm still open to those, but this is good incentive to not throw in the towel on these cases tonight.

Your description of what you'd expect to happen in the various use Sub cases is also helpful (and I assume based on Python). Just to make sure I'm not missing something, this means that if one had the Sub-within-Sub pattern in Python and used from Sub import *, they would not have a way to express a fully qualified Sub.Sub, is that right?

@ben-albrecht
Copy link
Member

Just to make sure I'm not missing something, this means that if one had the Sub-within-Sub pattern in Python and used from Sub import *, they would not have a way to express a fully qualified Sub.Sub, is that right?

That's correct.

@BryantLam
Copy link

BryantLam commented Aug 30, 2019

The pattern is also a bit weird in that, for example, if the only way for me to start referring to top-level module M as proposed in #13523 is via use M; then we have to decide whether an unqualified reference to M refers to the module or the variable (or procedure or whatever). My guess there is that it should be the module. That said, it also seems a little asymmetrical that I can refer to variable N directly but not its sibling variable M without saying M.M.

The symmetry argument makes the most sense to me because it is difficult to use the M module symbol in any meaningful context without the dot operator as a disambiguation, ... though now that I say that, I can see an issue with nesting since the dot operator is overloaded for both path resolution and referring to fields. As a user, I would expect a naked Sub in all contexts except use or import statements to be class Sub.

Here's an example from Rust. It does the "right thing" because it's pretty difficult to use the module symbol in an ambiguous way; field access is via dot operator but module scope traversal is done with the :: operator.

pub mod M {
    pub mod Sub {
        pub static Sub: i32 = 42;
    }
}

fn main() {
    use M::Sub::*; // all items in `mod Sub`, but not `mod Sub` itself
    use M::Sub;    // only `mod Sub`, but none of its items

    let x = Sub;
    let y = Sub::Sub;
    println!("{} {}", x, y); // compiles and prints "42 42"
}

In most expressions, I think Rust and Chapel have enough similarities where the symbol is disambiguated as not-a-module.

IMO, I hope to never have to use in the future because unqualified accesses and qualified accesses from the same statement make it difficult to read and/or trace for reasons like this.

@bradcray
Copy link
Member Author

I've now got this working on my branch (at least as well as it was before), so there's no need to not support this pattern assuming people agree with Ben's interpretations.

@mppf
Copy link
Member

mppf commented Aug 30, 2019

module M {
  var M...;  // or proc M() ... or any other declaration of M
  var N...;
}

We talked about the contents of module M having an implicit import M to make M available as a regular symbol inside the module. I think we can just imagine that this is in an outer scope compared to the body of M:

module M {
  import M; // notional, not a real thing a user would type
  {
    var M:int;
   }
}

This interpretation would allow the pattern above and also has clear meaning.

To generalize this rule, we might expect that local symbols can override module symbols always (i.e. use and import are always considered weaker bindings of the name). I suspect something like this is what is/was actually implemented.

@lydia-duncan
Copy link
Member

I think everything I would have said has been said already :)

bradcray added a commit that referenced this issue Aug 30, 2019
Require 'use' of top level modules rather than lexical scoping

[reviewed by @mppf and in part by @lydia-duncan]

This PR implements an idea proposed on issue #13523 in which top-level modules are not visible to user code unless a `use` of that top-level module is within lexical scope.  Traditionally, top-level modules have been considered to be placed into the global / program / top-level scope, and therefore their bodies have been able to refer directly to one another through normal lexical scoping.  In essence, this new world makes that program scope "dimly lit" such that you can't see into it to refer to the module unless a `use` statement is present to light your way.

As discussed on issue #13523, the net effect is a less surprising language because expressions like `M.foo()` don't happen to resolve or not based on whether a file defining module M was named on the compile line, happened to be in a file with another module, or was found in the search path when another module `use`d it.  Now if you don't "see" a `use` of a top-level M when you look up through your lexical scope, you can't refer to it.

At present, this new rule is only implemented for top-level modules (I believe), though issue #13536 asks whether it should be applied to submodules as well.

The main outcome of this change is that many modules and tests needed to be updated because they were relying on the old world rules.  The net result is cleaner code.  In the modules directory, I strived to take the approach of using local `use` statements when possible and there were just a few references to the module and module-scope `private use` statements when it was not possible or there were many references to the module.  For modules that were using fully qualified references, I tended to use `only ;` to make the full qualification still be meaningful; in others, I didn't.

The heart of the change is a simple conditional in the lookupNameLocally() routine that essentially pretends it doesn't see a top-level module name unless we're resolving uses:

```chapel
   // don't consider top-level modules to be visible unless this is a use
   if (toModuleSymbol(sym) == NULL || this != rootScope || isUse) {
     retval = sym;
   }

```

Probably the most impactful change to the compiler is that I began threading the `isUse` boolean of the above code through a number of the routines used for lookup to determine whether resolving their symbols could "see" the top-level module symbols or not.  This is fairly mechanical, but ends up touching many lines of code.  I also added a similar `isTopLevel` boolean to the ResolveScope::extend() routine in order to preserve the behavior of generating an error when two top-level modules with the same name were created.  This bool gets threaded into the `isUse` logic above, and without it, the compiler would arbitrarily pick the last such similarly-named module.

Another big change was to have the skipSymbolSearch() routine return a pointer to the module symbol in the event that (a) the only clause did not list the name being searched for and (b) the module name matches.  This permits a case like `use M only ;` to match against "M" where before the match would be done by looking upward in the lexical scope.

I also:
* added a utility function in UseStmt that checks to see whether the name of the module being used matches a particular string
* stored the rootModule's ResolveScope in a global variable because it seemed too useful to remain a local variable (and I did make use of it, although only once)
* made a minor adjustment to a method resolution error to name the base type in question
* removed a double-`use` of `LAPACK` in `LinearAlgebra.chpl` that didn't seem to be adding any value, and which causes a problem when using fully qualified references currently.  I filed a future asking how we wanted to handle such cases.  If we think of `use M` as adding a symbol `M` to the current scope, then doing it twice could be considered an error.  But I don't know if that's natural / surprising and it seems like it could be...
* made minor adjustments to the description of `use` in the spec and to one of the introductory examples
* tried to improve an error message which seemed confusing because it referred to the "use of a module" but was not referring to the `use` statement...
* retired a future fillRandom.chpl
* added a number of new tests distilling cases that tripped me up in larger tests down to their simplest form
* added variants of existing tests whose behaviors had changed in order to lock things in even better

I think one of the biggest questions for me with this PR is whether I added a special case for top-level modules in a place that later fixes or improvements rendered unnecessary.  To that end, in the coming week, I'd like to remove certain segments to see whether that is the case (and add better comments for all that remain).

Resolves #13925 
Resolves #13523
mppf added a commit that referenced this issue May 10, 2022
simplify use/import shadowing

This PR describes and implements some candidate language adjustments to
shadowing behavior for use and import statements. We need to do something
in this area because the definition of shadowing is currently
inconsistent between variables and functions (#19167). This PR attempts
to simplify the language design in this area.

The adjustments to the language design in this PR are as follows:
 * isMoreVisible in function disambiguation as well as scope resolution
   use the same rules for determining shadowing with use/import
   statements
 * isMoreVisible starts it search from the POI where the candidates were
   discovered (see issue #19198 -- not discussed further here)
 * private use statements still have two shadow scopes
 * public and private import statements now do not introduce a shadow
   scope
 * public use statements now do not introduce a shadow scope
 * `public use M` does not bring in `M` (but `public use M as M` does)
 * methods are no longer subject to shadowing
 
Note that the above design leads to less shadowing of symbols from the
automatic library (see the section "Less Shadowing of Symbols from the
Automatic Standard Library" in
#19306 (comment))


## Discussion

Elements of the language design direction are discussed in these issues:
 * #19167
 * #19160
 * #19219 and #13925
 * #19312
 * #19352
 
Please see also
#19306 (comment)
which discusses pros and cons of these language design choices.


### Testing and Review

Reviewed by @lydia-duncan - thanks!

This PR passes full local futures testing.

Resolves the future `test/modules/vass/use-depth.chpl`.
Also fixes #19198 and adds a test based on the case in that issue.

- [x] full local futures testing
- [x] gasnet testing

### Future Work

There are two areas of future work that we would like to address soon:
 * #19313 which relates to changes in this PR to the test
   `modules/bradc/userInsteadOfStandard/foo2`. The behavior for trying to
   replace a standard module has changed (presumably due to minor changes
   in how the usual standard modules are now available). I think that
   changing the compiler to separately list each standard module would
   bring the behavior back the way it was. (Or we could look for other
   implementation changes to support it).
 * #19780 to add warnings for cases where the difference between `public
   use M` and `private use M` might be surprising
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants