Should modules be able to define symbols that share their name? #13925

bradcray · 2019-08-29T23:08:49Z

This issue asks whether it should be illegal to declare a module-level symbol with the same identifier as the module itself, such as:

module M {
  var M...;  // or proc M() ... or any other declaration of M
  var N...;
}

Doing so introduces interesting implementation challenges as I work on the change proposed in issue #13523 to not permit references to top-level modules unless they have been used. These challenges may all be surmountable with effort... but that effort is definitely slowing me down and making me wonder whether this pattern is sufficiently valuable to be worth the effort (whereas creating an error for it would be much simpler and let me rewrite code that relies on this pattern rather than trying to keep it working).

The pattern is also a bit weird in that, for example, if the only way for me to start referring to top-level module M as proposed in #13523 is via use M; then we have to decide whether an unqualified reference to M refers to the module or the variable (or procedure or whatever). My guess there is that it should be the module. That said, it also seems a little asymmetrical that I can refer to variable N directly but not its sibling variable M without saying M.M.

All that said, I'm completely willing to keep this behavior working if people think it's valuable (or simply weird / arbitrary to disallow it).

The text was updated successfully, but these errors were encountered:

bradcray · 2019-08-29T23:10:30Z

@mppf, @lydia-duncan, @ben-albrecht, @BryantLam: Any thoughts here? (singling you out based on your participation on issue #13523).

bradcray · 2019-08-29T23:35:59Z

Here's another interesting case (abstracted from UnitTest.chpl):

module M {
  use Sub;

  ...Sub...  // Check this out

  module Sub {
    class Sub {
    }
  }
}

Should the line with the comment refer to module Sub (which was used in that scope, so available) or class Sub (which was made available via the use). Or should it just be illegal (too confusing to be worth putting effort into supporting).

(Note that this is also a potential motivating case for supporting a public use of a private module as discussed on issue #13528... I.e., the author probably didn't want the user to be able to refer to module Sub?).

ben-albrecht · 2019-08-30T00:01:16Z

The pattern is also a bit weird in that, for example, if the only way for me to start referring to top-level module M as proposed in #13523 is via use M; then we have to decide whether an unqualified reference to M refers to the module or the variable (or procedure or whatever). My guess there is that it should be the module. That said, it also seems a little asymmetrical that I can refer to variable N directly but not its sibling variable M without saying M.M.

IIRC, there were some other pitfalls I encountered using this pattern, which has led me to stop using it. I'll try to reproduce those and mention them here when time permits.

All that said, I'm completely willing to keep this behavior working if people think it's valuable (or simply weird / arbitrary to disallow it).

I have personally found this behavior valuable in other languages, but tend to avoid using it in Chapel due to the pitfalls it introduces. If we don't expect the pitfalls around using shared module/symbol names to improve, then it would make sense to forbid it altogether.

Additionally, it may make sense to be more restrictive now, given the aim for Chapel 2.0 release candidate, and maybe relax this rule later if we can make it safer to use.

Relatedly, I haven't found a great naming scheme to deal with this. It is usually suggested to make the module name plural, but that isn't always applicable. For example, if the name has an odd plural form (e.g. Octopus -> Octopi, or DNA -> DNAs), or if the module only contains a single symbol (like a class definition) such that the plural name doesn't make sense.

ben-albrecht · 2019-08-30T00:04:21Z

Should the line with the comment refer to module Sub (which was used in that scope, so available) or class Sub (which was made available via the use).

Not sure if this is the right interpretation, but if I think about use Sub as from Sub import *, then I'd expect Sub to refer to the class Sub. If use Sub only had been used, then I would expect Sub to refer to the module Sub and Sub.Sub to refer to the class.

bradcray · 2019-08-30T00:47:36Z

Issue #11262 was an issue I meant to link to above that relates to this question.

bradcray · 2019-08-30T00:57:21Z

Thanks for the notes @ben-albrecht: Knowing that you use and like this pattern in other languages is good incentive to push through the challenges. If others have contrary views, I'm still open to those, but this is good incentive to not throw in the towel on these cases tonight.

Your description of what you'd expect to happen in the various use Sub cases is also helpful (and I assume based on Python). Just to make sure I'm not missing something, this means that if one had the Sub-within-Sub pattern in Python and used from Sub import *, they would not have a way to express a fully qualified Sub.Sub, is that right?

ben-albrecht · 2019-08-30T01:06:50Z

Just to make sure I'm not missing something, this means that if one had the Sub-within-Sub pattern in Python and used from Sub import *, they would not have a way to express a fully qualified Sub.Sub, is that right?

That's correct.

BryantLam · 2019-08-30T02:52:50Z

The pattern is also a bit weird in that, for example, if the only way for me to start referring to top-level module M as proposed in #13523 is via use M; then we have to decide whether an unqualified reference to M refers to the module or the variable (or procedure or whatever). My guess there is that it should be the module. That said, it also seems a little asymmetrical that I can refer to variable N directly but not its sibling variable M without saying M.M.

The symmetry argument makes the most sense to me because it is difficult to use the M module symbol in any meaningful context without the dot operator as a disambiguation, ... though now that I say that, I can see an issue with nesting since the dot operator is overloaded for both path resolution and referring to fields. As a user, I would expect a naked Sub in all contexts except use or import statements to be class Sub.

Here's an example from Rust. It does the "right thing" because it's pretty difficult to use the module symbol in an ambiguous way; field access is via dot operator but module scope traversal is done with the :: operator.

pub mod M {
    pub mod Sub {
        pub static Sub: i32 = 42;
    }
}

fn main() {
    use M::Sub::*; // all items in `mod Sub`, but not `mod Sub` itself
    use M::Sub;    // only `mod Sub`, but none of its items

    let x = Sub;
    let y = Sub::Sub;
    println!("{} {}", x, y); // compiles and prints "42 42"
}

In most expressions, I think Rust and Chapel have enough similarities where the symbol is disambiguated as not-a-module.

IMO, I hope to never have to use in the future because unqualified accesses and qualified accesses from the same statement make it difficult to read and/or trace for reasons like this.

bradcray · 2019-08-30T06:50:46Z

I've now got this working on my branch (at least as well as it was before), so there's no need to not support this pattern assuming people agree with Ben's interpretations.

mppf · 2019-08-30T11:19:52Z

module M {
  var M...;  // or proc M() ... or any other declaration of M
  var N...;
}

We talked about the contents of module M having an implicit import M to make M available as a regular symbol inside the module. I think we can just imagine that this is in an outer scope compared to the body of M:

module M {
  import M; // notional, not a real thing a user would type
  {
    var M:int;
   }
}

This interpretation would allow the pattern above and also has clear meaning.

To generalize this rule, we might expect that local symbols can override module symbols always (i.e. use and import are always considered weaker bindings of the name). I suspect something like this is what is/was actually implemented.

lydia-duncan · 2019-08-30T16:00:39Z

I think everything I would have said has been said already :)

@mppf

Require 'use' of top level modules rather than lexical scoping [reviewed by @mppf and in part by @lydia-duncan] This PR implements an idea proposed on issue #13523 in which top-level modules are not visible to user code unless a `use` of that top-level module is within lexical scope. Traditionally, top-level modules have been considered to be placed into the global / program / top-level scope, and therefore their bodies have been able to refer directly to one another through normal lexical scoping. In essence, this new world makes that program scope "dimly lit" such that you can't see into it to refer to the module unless a `use` statement is present to light your way. As discussed on issue #13523, the net effect is a less surprising language because expressions like `M.foo()` don't happen to resolve or not based on whether a file defining module M was named on the compile line, happened to be in a file with another module, or was found in the search path when another module `use`d it. Now if you don't "see" a `use` of a top-level M when you look up through your lexical scope, you can't refer to it. At present, this new rule is only implemented for top-level modules (I believe), though issue #13536 asks whether it should be applied to submodules as well. The main outcome of this change is that many modules and tests needed to be updated because they were relying on the old world rules. The net result is cleaner code. In the modules directory, I strived to take the approach of using local `use` statements when possible and there were just a few references to the module and module-scope `private use` statements when it was not possible or there were many references to the module. For modules that were using fully qualified references, I tended to use `only ;` to make the full qualification still be meaningful; in others, I didn't. The heart of the change is a simple conditional in the lookupNameLocally() routine that essentially pretends it doesn't see a top-level module name unless we're resolving uses: ```chapel // don't consider top-level modules to be visible unless this is a use if (toModuleSymbol(sym) == NULL || this != rootScope || isUse) { retval = sym; } ``` Probably the most impactful change to the compiler is that I began threading the `isUse` boolean of the above code through a number of the routines used for lookup to determine whether resolving their symbols could "see" the top-level module symbols or not. This is fairly mechanical, but ends up touching many lines of code. I also added a similar `isTopLevel` boolean to the ResolveScope::extend() routine in order to preserve the behavior of generating an error when two top-level modules with the same name were created. This bool gets threaded into the `isUse` logic above, and without it, the compiler would arbitrarily pick the last such similarly-named module. Another big change was to have the skipSymbolSearch() routine return a pointer to the module symbol in the event that (a) the only clause did not list the name being searched for and (b) the module name matches. This permits a case like `use M only ;` to match against "M" where before the match would be done by looking upward in the lexical scope. I also: * added a utility function in UseStmt that checks to see whether the name of the module being used matches a particular string * stored the rootModule's ResolveScope in a global variable because it seemed too useful to remain a local variable (and I did make use of it, although only once) * made a minor adjustment to a method resolution error to name the base type in question * removed a double-`use` of `LAPACK` in `LinearAlgebra.chpl` that didn't seem to be adding any value, and which causes a problem when using fully qualified references currently. I filed a future asking how we wanted to handle such cases. If we think of `use M` as adding a symbol `M` to the current scope, then doing it twice could be considered an error. But I don't know if that's natural / surprising and it seems like it could be... * made minor adjustments to the description of `use` in the spec and to one of the introductory examples * tried to improve an error message which seemed confusing because it referred to the "use of a module" but was not referring to the `use` statement... * retired a future fillRandom.chpl * added a number of new tests distilling cases that tripped me up in larger tests down to their simplest form * added variants of existing tests whose behaviors had changed in order to lock things in even better I think one of the biggest questions for me with this PR is whether I added a special case for top-level modules in a place that later fixes or improvements rendered unnecessary. To that end, in the coming week, I'd like to remove certain segments to see whether that is the case (and add better comments for all that remain). Resolves #13925 Resolves #13523

@lydia-duncan

simplify use/import shadowing This PR describes and implements some candidate language adjustments to shadowing behavior for use and import statements. We need to do something in this area because the definition of shadowing is currently inconsistent between variables and functions (#19167). This PR attempts to simplify the language design in this area. The adjustments to the language design in this PR are as follows: * isMoreVisible in function disambiguation as well as scope resolution use the same rules for determining shadowing with use/import statements * isMoreVisible starts it search from the POI where the candidates were discovered (see issue #19198 -- not discussed further here) * private use statements still have two shadow scopes * public and private import statements now do not introduce a shadow scope * public use statements now do not introduce a shadow scope * `public use M` does not bring in `M` (but `public use M as M` does) * methods are no longer subject to shadowing Note that the above design leads to less shadowing of symbols from the automatic library (see the section "Less Shadowing of Symbols from the Automatic Standard Library" in #19306 (comment)) ## Discussion Elements of the language design direction are discussed in these issues: * #19167 * #19160 * #19219 and #13925 * #19312 * #19352 Please see also #19306 (comment) which discusses pros and cons of these language design choices. ### Testing and Review Reviewed by @lydia-duncan - thanks! This PR passes full local futures testing. Resolves the future `test/modules/vass/use-depth.chpl`. Also fixes #19198 and adds a test based on the case in that issue. - [x] full local futures testing - [x] gasnet testing ### Future Work There are two areas of future work that we would like to address soon: * #19313 which relates to changes in this PR to the test `modules/bradc/userInsteadOfStandard/foo2`. The behavior for trying to replace a standard module has changed (presumably due to minor changes in how the usual standard modules are now available). I think that changing the compiler to separately list each standard module would bring the behavior back the way it was. (Or we could look for other implementation changes to support it). * #19780 to add warnings for cases where the difference between `public use M` and `private use M` might be surprising

bradcray added type: Design area: Language labels Aug 29, 2019

bradcray self-assigned this Aug 29, 2019

bradcray mentioned this issue Aug 30, 2019

Require 'use' of top level modules rather than lexical scoping #13930

Merged

12 tasks

BryantLam mentioned this issue Aug 30, 2019

Unclear behavior when use statement brings in a module with the same name #11262

Closed

bradcray closed this as completed in #13930 Aug 30, 2019

BryantLam mentioned this issue Sep 4, 2019

Remove qualified access from use statements? #13978

Closed

bradcray mentioned this issue Sep 5, 2019

multiple-defined module symbol has shadowing instead of error #14014

Open

mppf mentioned this issue Feb 8, 2022

Should use statements have two shadow scopes? #19219

Closed

mppf mentioned this issue Feb 25, 2022

simplify use/import shadowing #19306

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should modules be able to define symbols that share their name? #13925

Should modules be able to define symbols that share their name? #13925

bradcray commented Aug 29, 2019

bradcray commented Aug 29, 2019

bradcray commented Aug 29, 2019

ben-albrecht commented Aug 30, 2019

ben-albrecht commented Aug 30, 2019

bradcray commented Aug 30, 2019

bradcray commented Aug 30, 2019

ben-albrecht commented Aug 30, 2019

BryantLam commented Aug 30, 2019 •

edited

Loading

bradcray commented Aug 30, 2019

mppf commented Aug 30, 2019

lydia-duncan commented Aug 30, 2019

Should modules be able to define symbols that share their name? #13925

Should modules be able to define symbols that share their name? #13925

Comments

bradcray commented Aug 29, 2019

bradcray commented Aug 29, 2019

bradcray commented Aug 29, 2019

ben-albrecht commented Aug 30, 2019

ben-albrecht commented Aug 30, 2019

bradcray commented Aug 30, 2019

bradcray commented Aug 30, 2019

ben-albrecht commented Aug 30, 2019

BryantLam commented Aug 30, 2019 • edited Loading

bradcray commented Aug 30, 2019

mppf commented Aug 30, 2019

lydia-duncan commented Aug 30, 2019

BryantLam commented Aug 30, 2019 •

edited

Loading