Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename a module on use statement #10799

Closed
BryantLam opened this issue Aug 16, 2018 · 17 comments · Fixed by #14714
Closed

Rename a module on use statement #10799

BryantLam opened this issue Aug 16, 2018 · 17 comments · Fixed by #14714

Comments

@BryantLam
Copy link

Related #10798

Feature request. I want the ability to rename a module, especially if my module is CrazyLongName.

Some syntax choices:

1.

use CrazyLongName only CrazyLongName as cln;

Works with today's parser, but I don't like the verbosity. If there's no reason to make this illegal, I'm not opposed to supporting it simply for symmetry when using submodules:

use CrazyLongName only SubModule as sm;

2.

use CrazyLongName as cln;

3.

use CrazyLongName only as cln;

Is only redundant? If use-and-rename, does the user intend to still pollute the global namespace?

@mppf
Copy link
Member

mppf commented Aug 17, 2018

Generally sounds like a good idea to me.

I like 1 for submodules and 2 for top-level modules. Having 1 work also for top-level modules would seem to break some assumed nesting to me, so I'd prefer not.

@lydia-duncan lydia-duncan self-assigned this Aug 17, 2018
@cassella
Copy link
Contributor

cassella commented Aug 18, 2018

#3900 proposed the syntax in choice 2, and added a future for it.

edit to note: there was a lot of discussion in the PR comments there.

@lydia-duncan
Copy link
Member

That would be why I didn't find it, it was a PR!

@mppf
Copy link
Member

mppf commented Oct 4, 2019

See also #13831 which includes some discussion of renaming a module on an import and also on a use.

@bradcray
Copy link
Member

I agree that approach 2 is very attractive.

1a doesn't seem like it ought to work for me (I'd expect only / except clauses to only apply to module-level symbols within the module being used, not to the module name itself). In a quick check, it seems to me that it doesn't work in 1.20, so maybe we consider this a bug that has now been fixed (maybe in the "top-level modules must be used effort?). If @BryantLam or others find a variation of this that works where mine didn't, that'd be worth posting a complete example for.

1b seems attractive and seems to work today (this was my hope in reading the issue, but I couldn't tell from the text whether that was the case at the time or not).

I'm not as fond of approach 3.

Somewhat related, in issue #13978, I propose a variation on approach 2 that would prevent the module name itself from being referenced: use M as _;

@lydia-duncan
Copy link
Member

I've started implementing approach 2 and in writing tests, I thought of a couple of cases where there might be disagreement on the intended behavior. What do folks think should be the intended behavior in these cases?

  1. The renamed module matches the name of another module that was used:
module LongName {
  var x: int;
}
module L {
  var y: bool;
}
module User {
  use L;
  use LongName as L;

  proc main() {
    writeln(L.x); // This seems like it should fail because L is now ambiguous
    // I could also see the argument for not allowing the second use to be written in the first place
    // though we otherwise allow multiple uses of the same module (due to except/only list
    // overlaps)
    writeln(L.y);
  }
}
  1. The new module name matches the name of a symbol in the scope with the use
module LongName {
  var x: int;
}
module User {
  use LongName as L;

  enum L { red, blue, yellow };

  proc main() {
    writeln(L.x); // Uses are at a further scope than symbols in the scope, does that
    // mean this should error due to not finding 'x' in the enum?  What if the enum defined
    // an 'x'?
  }
}
  1. The new module name matches the name of a symbol in the module being used
module Foo {
  var Bar: string;
}
module User {
  use Foo as Bar;

  proc main() {
    writeln(Bar.Bar); // Should this work or fail?
  }
}

Note: if the module name already matches the name of a symbol in the module,
we will fail to access symbols in the module with qualified naming and will instead
prefer fields and methods on the symbol defined in the module, e.g.:

module Foo {
  var Foo: string;
}
module User {
  use Foo;

  proc main() {
    writeln(Foo.Foo); // This line will fail
  }
}
  1. Both the module being used and a symbol in the module are renamed
module Foo {
  var x: int;
}

module User {
  use Foo as Bar only x as y;

  proc main() {
    writeln(y);       // Foo.x
    writeln(Bar.x); // Should Foo.x be accessed like this?
    writeln(Bar.y); // Or like this?  Or should both be valid?
    // Note that today if you *just* rename the symbol, it needs to use the old name for
    // qualified naming, e.g. Foo.x, so I lean towards Bar.x
  }
}

@bradcray
Copy link
Member

bradcray commented Dec 5, 2019

The renamed module matches the name of another module that was used

I think this should behave similarly to defining multiple variables of the same name at the same scope.

The new module name matches the name of a symbol in the scope with the use

Ditto

The new module name matches the name of a symbol in the module being used

My intuition is that this should behave the same as it would if Foo contained a symbol named Foo and the use case used use Foo; and writeln(Foo.Foo); (i.e., replace all instances of Bar with Foo and remove the renaming of the module).

Both the module being used and a symbol in the module are renamed

My intuition is that Bar.y has to be legal, otherwise the use statement would seem to have no point. By similar reasoning, I'd make Bar.x illegal.

At first, I didn't balk at the naked reference to y, but looking at it again after thinking about the previous two cases, I'm not sure it should be permitted by this use statement. I.e., I'm thinking that part of the intention of using this form of module renaming may be to require fully qualified names for the symbols in question? (It's not obvious to me from @BryantLam's original description of the issue whether that's what he intended, but I could imagine that being a valid interpretation of his request).

@lydia-duncan
Copy link
Member

I.e., I'm thinking that part of the intention of using this form of module renaming may be to require fully qualified names for the symbols in question?

Does that imply that you think unqualified access of the symbols in the module being used and renamed should not be allowed? E.g.

use Mod as M; // would be redundant with:
use Mod as M only; // and thus this form potentially shouldn't be even provided?

That seems confusing to me - I would rather write:

use A only;
use B as C only;

and have those mean the same thing than have

use A only;
use B as C;

mean the same thing. I think we would be sacrificing clarity for brevity.

@bradcray
Copy link
Member

bradcray commented Dec 5, 2019

Hmm, that's a good point. Let's ignore the last paragraph of my previous comment for now and I'll stew a bit more on what I'm feeling hung up on.

@BryantLam
Copy link
Author

BryantLam commented Dec 6, 2019

  1. The renamed module matches the name of another module that was used:

I would prefer the second use statement to be illegal.

I think the difference with use L; use L; multiple times is that the compiler can know that L is the same module in both use statements, whereas with use LongName as L, the compiler knows that the actual symbol is LongName and there already is some module L that isn't LongName in scope.

Similarly:

use LongName as L;
use LongName as L;

Is this fine? I would think so because both L point to LongName as the actual L and the compiler knows it in both cases.

(I don't actually know how the compiler does this resolution, so I could be way off base to how it actually works. This is simply how I think it should work.)

  1. The new module name matches the name of a symbol in the scope with the use

Uses are at a further scope than symbols in the scope

Can you remind me the reason why the behavior is like this? I remember it being explained before, but I don't remember why it behaves like this because it's not intuitive to me., especially if the expectation of use statements is to "bring symbols into scope" (but not really, because they're actually scope-adjacent).

I would prefer if the behavior was that both the use statement and the definition of enum L are treated as in the same scope. Then this problem is apparently resolved as an illegal redefinition of symbol L when defining enum L.

  1. The new module name matches the name of a symbol in the module being used

This is a consequence of use-statements allowing both qualified and unqualified access #13978. My opinion is that qualified access should always be preferred over unqualified access to get to a symbol because then this problem doesn't happen. (There might be some other problem I can't think of.)

I also wonder why this behavior works this way instead of first resolving qualified access.

  1. Both the module being used and a symbol in the module are renamed

use Foo as Bar only x as y;

writeln(y);       // OK:  Foo.x
writeln(Bar.x); // Illegal:  x was renamed to y.
writeln(Bar.y); // OK:  This is the only way to get to Foo.x.

// Note that today if you just rename the symbol, it needs to use the old name for
// qualified naming, e.g. Foo.x, so I lean towards Bar.x

module L {
  var x: int = 42;
}

use L only x as y;
writeln(y);
writeln(L.y); // <-- Error today.

That seems confusing. The user asked to rename x to y, so why is L.y not the only symbol that exists? L.x shouldn't be resolvable. If you take Brad's suggestion of use L as _ to disable unqualified access, the behavior would be entirely wrong:

import L only x as y;
writeln(y); // Illegal. Qualified access is required.
writeln(L.y); // Also illegal! L.x, not L.y!

That's confusing as heck.

Edit (January 27, 2020): I no longer believe the behavior I outlined in this post to be the correct path. Please ignore most of it and see #14738 (comment) for followup.

@lydia-duncan
Copy link
Member

  1. The renamed module matches the name of another module that was used:

I would prefer the second use statement to be illegal.

I think that's reasonable. The compiler is not set up for it today (because it didn't need to, any naming conflict would be a result of modules found when trying to resolve the use statement, not a name the use statement defines), but I think with the advent of "renaming while using" it probably should be set up to validate that.

  1. The new module name matches the name of a symbol in the scope with the use

Uses are at a further scope than symbols in the scope

Can you remind me the reason why the behavior is like this? I remember it being explained before, but I don't remember why it behaves like this because it's not intuitive to me., especially if the expectation of use statements is to "bring symbols into scope" (but not really, because they're actually scope-adjacent).

The intention is to prevent symbols in a used module from conflicting with or shadowing symbols in the current scope that share a name. So in the following code:

module A {
  var x: int;
}
module B {
   use A;
   var x: bool;

  proc main() {
      writeln(x);
  }
}

x correctly refers to the boolean, rather than causing a compiler conflict because it doesn't know whether to use B.x or A.x. The question about renaming, though, is which scope it should be considered to live in. There's a few places to put it:

module A { // Option 1
   var x: int; // Option 2
}
module B {
  use A as C; // Option 3
  ...
}

Note: I think we've been thinking about uses from the perspective of option 3, but that option 1 is what we actually have implemented for normal module uses.

Option 1 treats the rename as though it was the original name of the module. From B's perspective, A was always C (though A will still know it is A within its own scope and for other uses of A). This means that the answers to the cases in my previous post are:

  1. The renamed module matches the name of another module that was used: CONFLICTS
  • Though resolution would know that these are separate modules and so theoretically unqualified access could be allowed between them, qualified access will result in a conflict
  1. The new module name matches the name of a symbol in the scope with the use: FAVORS THE ENUM
  • Though resolution would not get confused and include the enum's symbols in scope instead of the module's, the new name will never be useful. We could allow this, or just have it be an error/warning since it probably was a mistake.
  1. The new module name matches the name of a symbol in the module being used: FAVORS THE SYMBOL IN THE MODULE
  • This matches the current behavior when the module is not renamed, and to me explains why pretty natural. The new name is treated as though it was always the name, so of course it behaves like an unrenamed module
  1. Both the module being used and a symbol in the module are renamed: SHOULD MATCH UNRENAMED CASE
  • I'm kinda wondering if we made the wrong choice about qualified naming when a symbol being brought in is renamed. I hadn't really though about it from this perspective when implementing it, so maybe it makes more sense to allow the renamed symbol to be accessed with the new name as part of qualified naming? Regardless, this option should match the behavior when the module being used is not renamed.

Option 2 is probably the strategy I least favor. I included it for completeness, but I'm not sure there's much justification for it.

  1. The renamed module matches the name of another module that was used: FAVOR RENAMED MODULE?
  • The new name is included as part of the module's symbol, so is technically quicker to reach than an unrenamed module. Perhaps that should imply a precedence for it over the unrenamed module?
  1. The new module name matches the name of a symbol in the scope with the use: FAVORS THE ENUM
  • Ditto option 1
  1. The new module name matches the name of a symbol in the module being used: CONFLICTS
  • Since the new name is included as part of the module's symbols, we can't determine which one was meant and so throw up our hands. The user will have to use a different name if it wants either qualified access or the symbol with the same name.
  1. Both the module being used and a symbol in the module are renamed: SHOULD MATCH UNRENAMED CASE
  • Ditto option 1

Option 3 is I think what Brad is arguing for, and also matches what we've said in offline meetings. A use is considered as adding a symbol definition to the scope with the name of the module. For comparison, option 1 feels more like "jump to this scope with this lens on it and go up to find the module name", while this option feels more like "look at the used module name first always, then look at the symbols it defines".

  1. The renamed module matches the name of another module that was used: DEPENDS
  • If we consider unrenamed modules to be treated in the same way, this would cause a conflict (I think everyone favors this strategy)
  • If we think unrenamed modules should be treated differently, then maybe the renamed module will take precedent? (I don't know that anyone thinks this way, but I'm including it for completeness)
  1. The new module name matches the name of a symbol in the scope with the use: CONFLICTS
  • They're defined at the same scope so we can't decide which one takes precedent
  1. The new module name matches the name of a symbol in the module being used: MODULE NAME TAKES PRECEDENT
  • If we go by a pure "use is at the same scope as symbols defined in a scope" and "symbols from a used module are at a slightly further scope than symbols in the module with the use", then that to me indicates that the rename should take precedence over the symbols within the module
  1. Both the module being used and a symbol in the module are renamed: SHOULD MATCH UNRENAMED CASE
  • Ditto option 1

This is a consequence of use-statements allowing both qualified and unqualified access #13978. My opinion is that qualified access should always be preferred over unqualified access to get to a symbol because then this problem doesn't happen. (There might be some other problem I can't think of.)

I also wonder why this behavior works this way instead of first resolving qualified access.

Hopefully my answer to the last section addressed this, but to draw connections together:
I think we work this way today because uses follow Option 1 for the most part - we jump from the caller of the use into the scope that is being used and go up to get the module name. The main exception is in renaming a symbol in a module - in those cases we treat the scope accessed with the module name differently than we treat the symbol we bring in for unqualified access, and give qualified access a lower precedence. I'm feeling more and more like that special case is wrong, but that doesn't necessarily mean Option 1 or Option 3 is better.

If you take Brad's suggestion of use L as _ to disable unqualified access, the behavior would be entirely wrong

I think you have this backwards - use L as _ would disable qualified access. _ implies that it removes the possibility of naming the module.

Summary

I think we've been thinking about uses in a different way than the compiler actually implements them. If I'm remembering right, Option 3 is more in keeping with the way other languages have defined their use/import statements. However, if we intend to change and follow Option 3, Case 3 from my previous post seems to be inconsistent with that definition. I personally feel more inclined towards Option 1 as it seems more true to what we have said about the use statements, since the focus before now has really been on the symbols they bring in. However, I am comfortable with switching to Option 3 - adding the capacity for the user to have control over the qualified name seems like an appropriate point to rethink this behavior to me, since the new name needs to be defined somewhere. (I also started this comment leaning more towards option 3 but ended up shifting part way through.)

@BryantLam
Copy link
Author

BryantLam commented Dec 7, 2019

I think you have this backwards - use L as _ would disable qualified access.

Whoops! You're right! My point still stands and I amended my previous post to make it more clear I was proxying import statements instead.

--

  1. The new module name matches the name of a symbol in the module being used: MODULE NAME TAKES PRECEDENT
  • If we go by a pure "use is at the same scope as symbols defined in a scope" and "symbols from a used module are at a slightly further scope than symbols in the module with the use", then that to me indicates that the rename should take precedence over the symbols within the module

I think this behavior is exactly how I see modules being brought in. The use statement itself declares the module name in the same scope, but its symbols are slightly farther away, which still allows shadowing for the symbols in the module, but not the module symbol itself.

So, I agree with Option 3 the most.

For case 3: conservatively, I would force the user to fully qualify the symbol if it's named the same as the module symbol (Bar.Bar instead of just [Bar.]Bar). As a feature addition, maybe it's reasonable for the compiler to infer from symbol usage whether a module symbol or some non-module symbol is less ambiguous to use, but there's certainly gotchas with this.

@lydia-duncan
Copy link
Member

One thing that occurs to me as I implement this guidance is that changing the precedent for "module name" versus "symbol in the module" will especially impact types that share the same name as their module. The change itself is relatively simple to make, but it will mean that things like our usage of LocaleModels will need to adjust, because the LocaleModel subclass is defined within a module with the same name.

There may be other cases remaining like this, as well.

Is this acceptable fallout? A sign that we should maintain current behavior instead? Or a sign that maybe uses that rename symbols should be treated differently than uses that don't, in this regard?

@bradcray
Copy link
Member

Is this acceptable fallout?

[I'm posting this with the admission that I'm behind on this thread since my last comment, but caught wind of this specific question today]

I feel as though having a type named M within a module named M be usable in an easy way is a frequently requested feature, and a big part of the reason that we treat the scoping of module symbols in a slightly specialized way today (e.g., symbols from a use are just outside of the current scope, and their modules' symbols are just outside of that). It makes me nervous to change the language in that regard without great need (which may exist and I just don't understand it yet), both because I think it's worked well for us and because it's seemed to support the M-within-M pattern well and naturally.

@lydia-duncan
Copy link
Member

I think the main issue is that it's inconsistent. "Symbols from a use are just outside of the current scope, and their modules' symbols are just outside of that", but we still want to the renamed module symbols to conflict with both symbols defined at the scope with the use (meaning that the rename is at a closer scope than the symbols within the used module) and with other used modules (but those are supposed to be beyond symbols within the used module).

@bradcray
Copy link
Member

I see. Reconsidering my previous answers in this light:

  • For case 1 (module rename conflicts with module), I'd expect both L symbols to belong to the same "just outside of that" scope, so expect that we could support a conflict for them since they're ending up in the same scope. (Alternatively, we could permit both L's to exist simultaneously, which would have the effect of seemingly extending an existing module by renaming another module to have the same name... if that was considered attractive).

  • Thinking about case 2 again (module rename conflicts with other symbol), it seems that this case should behave the same as if the original module was named L and not renamed. I.e., the enum could/should shadow that module name. If this is a problem for the author of this code, they probably should not have chosen to use the same name for both symbols (since both are under their control).

  • For case 3, I think my answer would be the same as before.

  • And for case 4, the same as we've been discussing on issue https://github.com/Cray/chapel-private/issues/631

Does that help address the inconsistency?

@lydia-duncan
Copy link
Member

Yes, that helps a ton :)

lydia-duncan added a commit that referenced this issue Jan 13, 2020
Allow renaming a module on its use statement
[reviewed by @mppf]

This enables support for renaming a module when using it.  With this change,
the following code will work, so long as Foo has a public symbol named `x` in
it:

```chapel
use Foo as Bar;

writeln(Bar.x); // Refers to Foo.x
```

Prior to this change, we could only rename a module if it was a submodule of a
module we were using, e.g.
```chapel
use FooParent only Foo as Bar;
use Bar;

writeln(Bar.x); // Refers to Foo.x
```
Otherwise, we were unable to rename the module.

Note that this does not change unqualified access of symbols within the
module (nor was it expected to)

Resolves #10799, Cray/chapel-private#529

Details:
-------
Added a new field to UseStmts which is set by a new argument to the
constructor, indicating the new name for the module being used.  Adds
two new methods to UseStmts as well, to determine if the module is
renamed in this use, and to obtain that rename.

Updated the OnlyRename parser structure to PotentialRename, to reflect
that it is not just used for only lists.  Uses this structure at parse time to
pass the new name/old name pair for the used module to buildUseStmt.
Allows both a use of multiple modules and a use of modules with `except`
or `only`.

Extends checkIfModuleNameMatches to account for renaming, and extends
it to allow for renaming an enum when using it.

Requires a future change to improve error messages in the case where a
module has been renamed and we failed to resolve a symbol in it for
qualified naming

Updates the AST loggers for this new syntax.

Covered test cases:
--------------------
- Renaming to a shorter name
- Renaming to a longer name
- Trying to use the old name for qualified access
- Renaming to the same name as another module that wasn't used
- Renaming to the same name as another module that was used:
  - with qualified access
  - with unqualified access only
- Renaming to the same name as an enum in scope
- Renaming an enum when using it
- when a symbol is excluded in a use that renames the module, we should not be
able to use the symbol in an unqualified manner
- we should be able to rename a module to the same original name.
- renaming a module that was renamed in another module you used
  - Note: not currently implemented, see #14711 
- renaming with an `except *;` clause, too
  - for qualified access
  - for unqualified access
- renaming with an `only;` clause, too
  - for qualified access
  - for unqualified access
- renaming both a module and a symbol inside it in the same use
  - with proper qualified access
  - with proper unqualified access
  - with improper qualified access (a few variations)
  - with improper unqualified access
- renaming a module with an only list in the same use
  - with qualified and unqualified access
- a client tries to use an unqualified symbol from a privately used module
- a client tries to use a qualified symbol from a privately used module with
the original module name
- a client tries to use a qualified symbol from a privately used module with
the new module name
- Ensures that publicly using a module while renaming it impacts clients of the
module with the use the same as the original module:
  - for unqualified access
  - for qualified access (both proper and improper)
- publicly using and renaming a private module
  - when using the renamed module as a prefix to access its symbols
  - when using the old module name as a prefix to access its symbols
- Ensures that you can rename multiple modules in a use, and that all modules
will:
  - have their new name respected
  - have their symbols available unqualified
  - and have their old name not usable.
- using functions from renamed modules
- renaming when new name is also the name of a symbol in the module
  - and using just the new name
  - and using qualified access of the symbol
- renaming to a symbol defined in another module
  - and using just the new name
  - and using qualified access to try and get a symbol in the first module
- when a module is used twice in different ways
  - and is renamed once
  - and is renamed in both uses

Testing:
--------
- [x] full paratest with futures
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants