-
Notifications
You must be signed in to change notification settings - Fork 424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should calling functions in a submodule requires a use? #13536
Comments
Regarding this reason for "yes"
Early in my history of learning Chapel, I was personally pretty confused by the fact that putting many modules in a single file actually makes them submodules within an implicit module, i.e. // in M.chpl
module L {
proc lFunction() { }
}
module N {
proc main() {
L.lFunction(); // why does this work?
}
} I remember being very surprised by Choosing "yes" here would reduce the potential for this particular case of confusion. (another strategy to handle that case of confusion would be to create more errors for implicit modules containing certain patterns - as with #6813 - e.g. implicit modules that contain only multiple module statements). |
I'm following this conversation, but I'm not sure I have a clear opinion just yet |
My immediate response was to draw parallels from enums and what enum TestEnum { Foo, Bar }
writeln(TestEnum.Foo); // you can do this without `use`
use TestEnum; //You need the `use` to use the enum without its name
writeln(Foo); So, maybe a module could use its submodules functions as long as the submodule name is there, but with |
@e-kayrakli - I don't think it makes any sense to
In that event, if you want to write a call like |
Some thoughts. My main argument for "no" is that it would be annoying for library authors. public module MyLibrary {
public module Component {}
private module Details {}
proc callApi() {
}
proc callApi2() {
}
} If I was disciplined, I would put my module X {
import Y; module Y {
// ...
}
} and the problem explodes from there for each descendant submodule. In favor of "yes" is that the usage of modules would be more consistent, but I think it'll just be annoying / a usability issue.
|
If Vs. if the import is required, |
Ah thanks. I didn't understand what argument you were making. That's a good point, though I agree with you that it's somewhat a weak argument as the authors of these conflicting modules are usually the same person/team. |
As a point of clarification, it wasn't entirely clear to me in the OP what the "yes" and "no" positions referred to since the original question was phrased as "Should we do a, or b?" (and in fact, I think they are backwards from my assumption which is that "yes" would mean "we should do a"). Starting with the "no" arguments:
I don't know if I buy this argument. If a sub-module can be defined in another file as we hope, it seems just as likely that I would pull in a non-inline (sub)module that someone else had written as write it myself (?).
Is that strange? Can't L see things from M simply through lexical scoping? To me, the main argument in favor of "no" (assuming it means "you can't simply refer to the submodule") is that by inspecting the source code, it seems apparent that the symbol in question is visible and in scope, so it seems a little artificial not to be able to refer to it. I think of
I hate to say it, but it sounds like you still don't have this rule right. :) The compiler only inserts an implicit module in the case that a file contains module-level statements (e.g., any statements other than module declarations and comments) at file scope. Such statements have to belong to some module, so the compiler treats the file as their module (which also supports conveniences like being able to write an entire program as file-scope code as in scripting languages). If the file only contains module declarations and comments at file scope, no implicit module is inserted. So code like your example above: M.chpl module L { }
module N { } results in a module namespace hierarchy like this:
whereas if you were to write: M.chpl
then you'd get the module hierarchy:
That said, I think your argument here probably still stands in that the reason your program wouldn't work if the modules were in separate files is that there's nothing in your definition of [For historical context: The most traditional point of confusion w.r.t. implicit modules was that users coming from C tended to equate them with use M;
module R {
writeln("In R");
} and then be confused that On the "yes" side, you say:
Maybe I'm not understanding what you mean by "interact directly with", but if I wrote: use M except *;
M.foo(); it seems to me as though |
Sorry that was confusing, "yes" and "no" were answering the question in the title. I've updated it to hopefully be clearer.
Now I got confused by the parenthetical... But, I updated the "no" section with your argument.
Thanks for correcting me :) I think this part of the issue has more to do with #13523 and so I've moved further discussion of that example to #13523 (comment) .
Right, but I'm saying the Note that if we did my proposal in #13119 (comment) this code would have to be import M;
M.foo(); since Put another way, the reason that Along these lines, one reason it is appealing to require submodules be |
I was curious about what Python does here. As I understand it, in Python, you can (only) declare a submodule as something within a package. And, once you do that, the submodule is only visible within the parent module/package if there is an import:
def Foo():
print("In Foo")
Foo.foo()
This can be fixed by changing import spam.foo
foo.Foo() (using the fact that package paths are not relative by default) or from . import foo
foo.Foo() using an explicit relative import. The Python example makes me wonder if there is a connection between the choice in this issue and whether or not imports are relative or absolute by default... but I don't see a tight connection between these two. |
I've been mentally playing with this idea (as suggested by my latest comment on #13523) and am trying to get my head around what its implications would be. As a result, this comment is going to span a few of these related issues a bit (but I'm putting it here since this is where I first grokked Michael's idea). Starting with the case of sibling modules from #13523: module M {
proc foo() { }
}
module N {
M.foo();
} Traditionally, we've said that module M {
proc foo() { }
}
module N {
use M /* as M */;
M.foo();
} at which point the By analogy, this suggests that submodules could not refer to their siblings without a module M {
module S1 {
proc foo();
}
module S2 {
S1.foo(); // requires a `use S1 /* as S1 */;` in order to be resolvable
}
} (where today they can by similar rules: "looking up in my scope, I see Presumably, we'd make a special case for a module's contents being able to refer to itself (?). That is, the following would be legal: module M {
proc foo() { ... }
M.foo();
} even though I haven't done a module M {
use M /* as M */;
...
} which also arguably rationalizes why a sub-module could refer to its parent module's symbols: module M {
module S {
M.foo(); // OK due to implicit `use M;` within `module M`
}
proc foo() { ... }
} I think this pattern suggests to me that using a sub-module's symbols would also require a module M {
// implicit `use M as M;` here
module S {
// implicit `use S as S;` here
proc foo() { ... }
}
S.foo(); // I can see `module S` but that isn't good enough to let me refer to `S`, so error
} and: module M {
// implicit `use M as M;` here
module S {
// implicit `use S as S;` here
proc foo() { ... }
}
use S;
S.foo(); // I can see the `S` introduced by `use S` which in turn refers to `module S` so this works
} So that suggests to me taking the "yes" approach for consistency. And I do think that it helps with the "submodules may be defined in other files and be invisible-ish" problem (more on that in a sec). Then again, as Bryant notes, it may get really annoying and really old fast. If we went with this philosophy (nothing can refer to a module's symbols other than A somewhat middle ground solution (that might be too weird to argue passionately for) might be to say that the implicit Thus, given: M.chpl: module M {
module S {
}
// auto-inject-files-from-the-magical-directory;
} and N.chpl (which happens to be in the magical directory): module N {
} would turn into: module M {
// implicit `use M as M;` -- put here since it's the topmost SW module in M.chpl where M was defined
// implicit use S as S except *; — put here since it's the topmost SW module in M.chpl where S was defined; the except `*` is necessary to avoid making all of S's contents directly available.
module S {
}
module N {
// implicit use N as N; — put here since it was the topmost SW module in N.chpl where N was defined
}
} Thus, within |
I'd be okay with this approach. I don't mind a more restrictive implementation that leaves room to relax later. Once you break submodules into separate files, the consistency makes sense, if only to make certain to the compiler which modules or submodules it should look to
This observation is incredibly relevant to #13524 (comment). If we "inject-known-dir", one way to write that statement (instead of
|
@BryantLam - interesting point. Another way to say it is -
We are basically saying that |
Require 'use' of top level modules rather than lexical scoping [reviewed by @mppf and in part by @lydia-duncan] This PR implements an idea proposed on issue #13523 in which top-level modules are not visible to user code unless a `use` of that top-level module is within lexical scope. Traditionally, top-level modules have been considered to be placed into the global / program / top-level scope, and therefore their bodies have been able to refer directly to one another through normal lexical scoping. In essence, this new world makes that program scope "dimly lit" such that you can't see into it to refer to the module unless a `use` statement is present to light your way. As discussed on issue #13523, the net effect is a less surprising language because expressions like `M.foo()` don't happen to resolve or not based on whether a file defining module M was named on the compile line, happened to be in a file with another module, or was found in the search path when another module `use`d it. Now if you don't "see" a `use` of a top-level M when you look up through your lexical scope, you can't refer to it. At present, this new rule is only implemented for top-level modules (I believe), though issue #13536 asks whether it should be applied to submodules as well. The main outcome of this change is that many modules and tests needed to be updated because they were relying on the old world rules. The net result is cleaner code. In the modules directory, I strived to take the approach of using local `use` statements when possible and there were just a few references to the module and module-scope `private use` statements when it was not possible or there were many references to the module. For modules that were using fully qualified references, I tended to use `only ;` to make the full qualification still be meaningful; in others, I didn't. The heart of the change is a simple conditional in the lookupNameLocally() routine that essentially pretends it doesn't see a top-level module name unless we're resolving uses: ```chapel // don't consider top-level modules to be visible unless this is a use if (toModuleSymbol(sym) == NULL || this != rootScope || isUse) { retval = sym; } ``` Probably the most impactful change to the compiler is that I began threading the `isUse` boolean of the above code through a number of the routines used for lookup to determine whether resolving their symbols could "see" the top-level module symbols or not. This is fairly mechanical, but ends up touching many lines of code. I also added a similar `isTopLevel` boolean to the ResolveScope::extend() routine in order to preserve the behavior of generating an error when two top-level modules with the same name were created. This bool gets threaded into the `isUse` logic above, and without it, the compiler would arbitrarily pick the last such similarly-named module. Another big change was to have the skipSymbolSearch() routine return a pointer to the module symbol in the event that (a) the only clause did not list the name being searched for and (b) the module name matches. This permits a case like `use M only ;` to match against "M" where before the match would be done by looking upward in the lexical scope. I also: * added a utility function in UseStmt that checks to see whether the name of the module being used matches a particular string * stored the rootModule's ResolveScope in a global variable because it seemed too useful to remain a local variable (and I did make use of it, although only once) * made a minor adjustment to a method resolution error to name the base type in question * removed a double-`use` of `LAPACK` in `LinearAlgebra.chpl` that didn't seem to be adding any value, and which causes a problem when using fully qualified references currently. I filed a future asking how we wanted to handle such cases. If we think of `use M` as adding a symbol `M` to the current scope, then doing it twice could be considered an error. But I don't know if that's natural / surprising and it seems like it could be... * made minor adjustments to the description of `use` in the spec and to one of the introductory examples * tried to improve an error message which seemed confusing because it referred to the "use of a module" but was not referring to the `use` statement... * retired a future fillRandom.chpl * added a number of new tests distilling cases that tripped me up in larger tests down to their simplest form * added variants of existing tests whose behaviors had changed in order to lock things in even better I think one of the biggest questions for me with this PR is whether I added a special case for top-level modules in a place that later fixes or improvements rendered unnecessary. To that end, in the coming week, I'd like to remove certain segments to see whether that is the case (and add better comments for all that remain). Resolves #13925 Resolves #13523
I view the examples brought up in #11262 in this comment as evidence that requiring a use for submodules is a good idea. module M {
module M {
proc whatev() {
writeln("whee");
}
}
}
use M only M;
M.whatev(); // currently, this M refers to the outermost sub-module M above I think it's confusing that the |
But this behavior is consistent with that of the following code: class Foo {
proc type whatev() {
writeln("In class Foo's whatev");
}
}
module Bar {
module Foo {
proc whatev() {
writeln("In module Foo's whatev");
}
}
}
use Bar only Foo;
Foo.whatev(); // prints "In class Foo's whatev" An |
I think what will really help is allowing us to rename modules when they are in a |
I don't think that
Suppose that we required submodules be used in the original example and we made module M {
module M {
proc whatev() {
writeln("whee");
}
}
}
// M not in scope after the above because submodules must be `use`d
use M only M; // outer M not in scope after this because use does not bring in module symbol
M.whatev(); // here, M can refer only to the inner module Even if we decided to go a different way on #13978, we could still get this example to work if the submodules had to be used. We would just need to control whether the module itself from a |
The point I'm trying to make is that there being consistency between modules and classes in this way is a way to explain away what you found confusing. Having them behave differently will cause confusion for people that think of classes as basically modules that you can make instances of (or modules as basically singleton classes, take your pick). I think that viewpoint is useful and something we should avoid breaking which is part of why I object to making module names be treated differently from other symbols defined in the same scope. To convince me otherwise, you're going to have to argue why that paradigm is not valuable. |
A long time ago I proposed that modules and classes be more similar and mixable, so that you could say So in the current language, I don't think we have any obligation at all to make modules and classes behave the same. They don't behave the same now. You can't (and as far as I know, never will be able to) I do think that your argument is reasonable. I would rephrase it as this:
However, I don't agree with this argument. I think that the current situation is that the scoping rules for modules specifically is already too confusing, and that making the scoping for modules have fewer cases (e.g. module names are only made available by However this is a judgement call (we are deciding - which is more confusing?) and I suspect we aren't going to convince each other at all. For my part, I would be more convinced by your argument if you could As the language stands now, if we continue to prefer to "keep module scoping like class scoping" rather than "simplify module scoping with use/import statements" then I think we are building the language around what is more of an unusual/corner case for modules - when there are submodules that aren't |
I've lost track... If we were to take a quick straw poll on this issue, where are people currently falling? Please give one of the following a thumbs-up. |
Upvote this comment if you think that one shouldn't be able to refer to a submodule's name without a |
Upvote this comment if you think that one should be able to refer to a submodule's name if simply visible via lexical scoping / without first having a |
Upvote this comment if you're still undecided. |
I've started on a branch to implement this, just to see what kinds of impacts it has. |
I voted for allowing to refer to a submodule's name when visible because it is consistent with what we do with other things, like variables and functions names. Analogously, it should be legal for a submodule to refer to symbols in the enclosing scopes, to match the behavior of other kinds of nested scopes. In the following code, I'd draw a parallel between var x = 1;
writeln(x); // no need to 'use x'
if ... {
writeln(x); // x comes from the enclosing scope
} |
@vasslitvinov - I am curious if you'd apply the same argument to #13523, e.g. if the below is in one file: module L {
proc lFunction() { }
}
module N {
proc main() {
L.lFunction(); // should this work?
}
} then I would imagine the same argument would argue that "L is in scope". Put another way, the parallels-with-other-declarations argument does not apply for the above case, so why should it apply for submodules? My view on this matter is that modules are fundamentally different from other symbols in terms of scoping (see #13536 (comment) for some elaboration on that) and that it's more of a benefit to usability to limit the ways a module's symbol name is in scope than it is to make it similar to the other cases. |
@mppf either I am not follow all the intricacies, or I am just outright not convinced. In the example of two modules side by side: module L { ... }
module N {
// is L visible here?
} I propose to apply my parallel-to-var-decl principle -- so yes, To me this offers clean and simple semantics. Remark: in this setup Remark: inability to |
I believe this is only correct if there are other symbols at the top level besides module definitions (I played with this recently for other reasons). When only modules are defined at the top level, there is no overarching file module scope inserted. |
Vass, I think Michael's point is that his example is precisely what we decided to stop supporting in issue #13523 and PR #13930 because it was too fragile and fraught with confusion (note that Michael's example only uses top-level modules, not sub-modules). So then the natural question is "given the decision #13523, why should sub-modules be different?" (I think there could be an argument there for why they should be different, but it may need to be something different than "you can see the module name looking upwards in its lexical scope" since that rule arguably applies to top-level modules as well).
Lydia's correct in this. I missed that you assumed that there was a parent module to L and N. |
I did not realize there was no implicit module above L+N, as Lydia describes. At least I am upholding my view for submodules. For top-level L and N, the case when they are in separate files is different IMO than when they are in the same file. I suspect much of motivation in #13523 does not apply when the modules are in the same file. In any case, we are not revisiting #13523 here. So, I will rephrase the question "why should sub-modules be different?" to "why should top-level modules be different?" My answer being "because top-level modules can be in different files". Maybe when sub-modules start being in different files, maybe they, too, will need to be |
In fact that is part of the reason that this issue exists. It was in some ways spun off of #13524 which proposes submodules in different files. I think we should make the decision about this issue knowing that we expect to support submodules in different files.
I'm not sure what more information we'd need around the submodules-in-different-files proposal to answer this question (the main part, IMO, is that such functionality is likely to be added). What information would you want to know about submodules-in-different-files to inform this issue specifically? I think you are proposing that top-level modules / sub-modules in different files have different behavior than if they are in a single file. I do not think that is a good idea, since moving a module/submodule out to a different file shouldn't change the way the program behaves (it should be possible to do simply as a code reorganization). |
I agree with this and almost said so yesterday before waiting to see how the |
To motivate my approach, think of Likewise, if a submodule is already visible,
Sure. However adding a little When I see something like: var x ...;
module Y {...} it does not make sense that I have to |
You lose me at "in another file" because I don't think language semantics should be so deeply related to the code's organization into files.
I think it does make sense and wouldn't seem as surprising if we'd been living with these rules for years. I think it takes some getting used to given our history with the language because it fundamentally says "module symbols are treated differently than other symbols." But I think that that's OK. |
The bug report in #11868 is related. Of course we could seek to solve it another way, but requiring a use/import of the submodule would fix the bug. It's not the strongest reason but it does suggest that requiring a |
Note, we don't expect to give the same treatment to parent modules, so parent modules won't need to be used/imported, so the "import is the only thing interacting with module names" Pro does not apply. |
PR #15278 takes a quick stab at this if anyone wants to see the impact, think about it, weigh the tradeoffs, etc. I haven't put in the effort to get all tests working, but have gotten most that aren't specifically wrestling with corner cases of visibility and scoping working. |
I'm currently leaning against requiring submodules to be used/imported in order to access their names. It feels unnecessarily laborious, and there will always be something within the parent module to indicate the submodule's name and that it is there (either its declaration or a statement that brings it in from a file). So adding the requirement that it also be used/imported feels like busywork to me at present. |
Today we effectively decided not to pursue this. |
This is related to #13523 and #13524.
If we have a module and a submodule, like this:
Should a use/import be required for
L.lFunction()
to work?L.lFunction()
is not allowed without ause L
/import L
L.lFunction()
works without ause L
/import L
Arguments in favor of "yes":
use
andimport
actually the only things that interact directly with module names - although some of these would then re-export the module name. This might make it easier/more robust to do things likeimport M as OtherModule
.import
/use
, submodule or no.import
/use
when the submodule is defined in another file (a-la Submodules in different files #13524) and this would give a clear place to indicatepublic import
vsprivate import
.Arguments in favor of "no":
module L
declaration creates a symbol that is visible and in scope within M. Changing this would make themodule
declarations less consistent with variable, type, and proc declarations.M
using things inL
to require a use but not forL
using things inM
to do so, e.g.:The text was updated successfully, but these errors were encountered: