module search path ordering and name conflicts with bundled modules #25306
Comments
I think as much as we can, we should stick with the behavior stabilized in 2.0.
My personal opinion (if we were starting fresh on this decision) is:
I have a vague memory of us deciding that "yes, adding new things (functions, modules) will break user codes but no, that doesn't mean we aren't allowed to do it in the future", but I can't seem to track down that conversation. I remember it being distinct from things like new keywords (where it's a lot more fuzzy if we're allowed to do that), and possibly more okay when in a module that isn't included by default (and thus more limited uses/imports of the module's contents are possible to maintain old behavior, but of course that doesn't help with top-level module adds).
Any opinion I want to form on this is more strongly influenced by what we've already stabilized. I don't think a particular strategy is more important than having made a decision.
Thanks for your thoughts, @lydia-duncan. Regarding whether or not the current behavior needs to be preserved for stabilization reasons, I'm not so sure. Here are the two parts of my rationale for "maybe not":
Another point about whether or not the current behavior is stable: the details of the module search path are currently only documented in a technote: https://chapel-lang.org/docs/technotes/module_search.html. Whether we added unstable warnings for it or not, it being in a technote rather than the spec is meant to communicate it is not stable.
PR #25125 changes We should continue to discuss here what we want to happen in these cases.
This PR adjusts the integration between the frontend library and the rest of the compiler. It changes it so that only live modules are converted from uAST to AST and so that this process occurs in the module initialization order. This is expected to reduce the number of fixups required.

There are a few user-facing changes in various edge cases:

* This PR also changes how `require "something.chpl";` interacts with the process of determining the main module & live modules. `require` statements no longer impact the determination of the main module & a module brought in by `require` will actually be dead-code-eliminated if it is not `use` or `import`ed. See also issue #25112 for discussion of these changes.
* This PR changes the behavior of `test/modules/bradc/userInsteadOfStandard` which is meant to show what happens if a user inadvertently creates a module with the same name as a standard module. This PR changes the ambiguous module chosen from `./ChapelIO.chpl` to the standard library one and adds an unstable warning for such ambiguous module situations. See issue #25306 for discussion about what should happen in these cases.
* The compiler now gives "surprising shadowing" warnings for cases where an identifier could refer to an enclosing module name or something brought in by a `use` statement.
* In some cases, errors in modules that are not `use` or `import`ed are no longer emitted. The reason for this is that these modules are eliminated earlier in compilation.
* In some cases, the compiler will emit an error about not knowing what is the main module earlier in compilation before / in addition to other errors.

More detail about implementation changes in this PR:

* This PR adjusts the dyno scope resolver to allow resolution of things like `module M { M; }` -- here `M` can refer to the enclosing module name even without `use` or `import`. This case was previously being handled by the production scope resolver.
* This PR changes the way that certain built-in types are resolved. That was necessary to make the process more able to handle convert-uast being run on various modules in a different order from what the production compiler uses. Since we already had a list of 7 builtin types that needed to be wired up when their record/class declarations were found, I went ahead and moved this mechanism to something mediated by "well-known types" processing in wellknown.h / wellknown.cpp. Then, I added `_tuple` to the list of things that are processed in this way. Note that this involved moving several global variables for well-known types to wellknown.h / wellknown.cpp from various other locations.
* Since I removed the global variable `currentModuleType`, pass the current module tag to `buildManageStmt`.
* I moved the `nestedName` warning from `checkUast.cpp` to `post-parse-checks.cpp` (note that `checkUast.cpp` checks the production AST just after it was created by converting the uAST to it & should be renamed to `check-generated-ast.cpp`). Since this warning works with the `ImplicitFileModule` warning, I adjusted `parseAndConvert.cpp` to delay them both, so they can be issued together.
* I added a global variable `gConvertFilterModuleIds` for the live module analysis to communicate to `convert-uast.cpp` which modules should be converted. Using a global variable in this way is not ideal, and ideally this would be managed by having convert-uast.cpp only convert one module at a time.
* Most of the changes are in `parseAndConvert.cpp`. Now the dyno code / frontend is in charge of deciding what to parse and convert and that process works with `chpl::parsing::findMainAndCommandLineModules` and `chpl::resolution::moduleInitializationOrder` to decide the order in which modules should be converted. I ran into some challenges where other parts of the compiler assume that certain modules appear early in `allModules` or that their DefExprs appear early in a traversal over `theProgram`. `processInternalModules` works around these issues by rearranging the key modules to appear in an order that matches what the rest of the compiler is used to.
* I added some string utility functions (`startsWith` and `replacePrefix`) in a new `string-utils.h` / `string-utils.cpp` in the frontend library.
* I added a `Context::error` & related methods to various Error handling classes to accept an `IdOrLocation` in order to wire up errors that indicate `<command line>` as the location for the error.
* I added a `libraryMode` to `findMainAndCommandLineModules` to hide errors about the main module when creating a library. I also added some helper functions `checkFileExists` `getExistingFileInDirectory` `getExistingFileInModuleSearchPath` to support some operations using the module search path.
* I adjusted `Scope` to track if it contains a `require` statement. This allows `resolveVisibilityStmts` to more accurately filter on which `Scope`s have meaningful work to do in order to resolve use/import/require.
* I added `deduplicateSamePaths` to `filesystem.h` / `filesystem.cpp` that uses LLVM Support library functions to detect and remove paths that are redundant because they refer to the same filesystem element.
* I adjusted `findMainModuleImpl` to consult the modules loaded as library files as a potential source for the main module. I also improved the error messages from this function.
* I tidied up `addCommandLineFileDirectories` & adjusted it to put the command line file directories and `-M` paths before the paths from `CHPL_MODULE_PATH`.
* I added `doLookupEnclosingModuleName` to implement lookup for cases like `module M { M; }`. I ran into problems with certain files named a keyword, so this uses `isReservedIdentifier` to ignore module name lookups for implicit modules with those names. It would be nice to adjust the lookup process to be more robust in this way, but this is the simple solution. See also #19197 which tracks problems in this area.
* I adjusted `resolveVisibilityStmts` to process `require` with the module search directory if there are no `/`s in the path. I also adjusted it to process modules loaded by `require`. This addresses a pattern currently used in `test/parsing/errors/nameLength` where the main file `require`s another, which `require`s a third file; and then the main file `use`s a module expected to be made available by this nested `require`.

Reviewed by @DanilaFe - thanks!

- [x] full comm=none testing
- [x] full `CHPL_COMM=gasnet` testing
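The PR description above mentions two search-path details: command line file directories and `-M` paths are placed before the paths from `CHPL_MODULE_PATH`, and paths that refer to the same filesystem element are deduplicated. The following is a minimal Python model of that idea, not the compiler's C++ implementation. The function names, the relative order of command-line directories versus `-M` directories, and the assumption that `CHPL_MODULE_PATH` is split on the platform path separator are all illustrative assumptions.

```python
import os

def dedupe_same_paths(paths):
    """Keep the first occurrence of each directory; drop later aliases of it."""
    seen = set()
    result = []
    for p in paths:
        # realpath resolves symlinks and "." / ".." segments, so two spellings
        # of the same directory map to the same key.
        key = os.path.realpath(p)
        if key not in seen:
            seen.add(key)
            result.append(p)
    return result

def build_search_path(cmdline_file_dirs, dash_m_dirs, env_module_path):
    """Hypothetical ordering: command-line file dirs, then -M dirs, then env dirs."""
    # Assumption: the environment variable holds a path-separator-delimited list.
    env_dirs = [d for d in env_module_path.split(os.pathsep) if d]
    return dedupe_same_paths(cmdline_file_dirs + dash_m_dirs + env_dirs)
```

Because earlier entries win, a directory named both on the command line and in the environment keeps its command-line position.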
In my recent work to bring more of the logic about what to parse to the frontend library in PR #25125, I ran into the test `userInsteadOfStandard`. In some discussion with @bradcray I learned about the purpose of this test, and that brought up some design questions. I'm creating this issue to discuss these questions. They have to do with the order in which something like `use Foo` searches for `Foo` in the standard/internal/package modules or in user paths.

Search Path Ordering
The dyno frontend is searching for modules in the following order:

`modules/packages`; distributions

This is different from the production compiler's search order:
This leads to the change in behavior for `userInsteadOfStandard` with PR #25125.

Key Questions
* `modules/` directory (including standard, internal, package modules)?
* `modules/` be searched for before or after user modules from `-M` / `.` / same directory as other code?

A Brief History of ~~Time~~ `userInsteadOfStandard`

`test/modules/bradc/userInsteadOfStandard/` was added in 2009 in 0cbefbd. The purpose of this test is to check that we have behavior we like for the scenario in which a user unwittingly names a module the same as a standard/internal/package module. Originally, the test had a `Math.chpl` that the test is imagining was created by a user who didn't know that there is a `Math.chpl` in the standard library. The test has a `foo2.chpl` that `use`s the module with the conflicting name.

Note that the purpose of this test is not to test that we can replace a standard library module in the field. We have `--prepend-standard-module-dir` / `--prepend-internal-module-dir` for that & as far as I know, there is agreement that these flags (or similar flags) should be required to get the replace-standard-library behavior. (I and others have been confused about this in the past.)

It is not immediately obvious to me if this test is intended to focus only on automatically included modules. However, the questions it raises apply either way.
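To make the scenario concrete, here is a small Python model of first-match module lookup. It is only a sketch: the directory names, the file sets, and the two orderings are hypothetical and are not the actual dyno or production search orders. It shows how the same `use Math;` picks a different file depending purely on which directory comes first.

```python
def resolve_module(name, search_path, files_by_dir):
    """Return the first directory in search_path that provides <name>.chpl."""
    for directory in search_path:
        if name + ".chpl" in files_by_dir.get(directory, ()):
            return directory
    return None  # no directory provides the module

# Hypothetical layout mirroring the test: the user unknowingly wrote Math.chpl.
FILES = {
    "modules/standard": {"Math.chpl", "IO.chpl"},
    ".": {"Math.chpl", "foo2.chpl"},
}

# Two illustrative orderings (assumptions, not the real compiler orders).
BUNDLED_FIRST = ["modules/standard", "."]
USER_FIRST = [".", "modules/standard"]
```

With `BUNDLED_FIRST`, `resolve_module("Math", ...)` selects the bundled file; with `USER_FIRST`, it selects the user's file. Modules that exist in only one place, such as `IO` here, resolve the same way under both orderings.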
Commit 557eba1 (also in 2009) added a mechanism to the compiler to process standard/internal modules separately from user modules, so that the `use` statements in the standard/internal modules only search in the standard/internal module paths (and not the user directories). The idea was that in a case like this, the compiler could actually have two `Math` modules: one from the user's directory and one from the standard library. It would rename one of these internally.

I think this logic worked for this test until PR #19306, which accepted a temporary change in behavior for this test in order to make progress on other issues. After that:
* `Math.chpl` to `AutoMath.chpl` on the justification that the test was added to exercise a name conflict with an included-by-default module & so this preserved the behavior while we were pulling out the automatically included part of `Math` to `AutoMath`.
* `AutoMath`, `Errors`, `ChapelIO` and `Types`; and also adding various `proc`s to the user's `AutoMath.chpl` to get things to compile (because there was only one `AutoMath` module & the user's one was replacing the standard one; likely due to my own confusion about the purpose of this test)
* `c_*alloc` functions with unstable `allocate` #22358 changed it to `ChapelIO` for the conflicting module name due to changes in behavior resulting from changes in module initialization order. It continued to add some `proc`s to keep standard library code compiling since the single `ChapelIO` module the compiler gets is the one from this test rather than the standard library.

Issue #23100 describes a related issue with a user module named `search.chpl` conflicting with `modules/packages/Search.chpl` on case-insensitive filesystems.

Summary of Discussion
Recall the key questions:
I'm aware of 3 strategies here. I'll assume that the conflicting module will be named `XYZ.chpl` for the purpose of discussion below (although it has been `Math`, `AutoMath`, and `ChapelIO` at various times for `userInsteadOfStandard`).

A: Work with the bundled module
The idea here is that there can only be one module with a given name. Even if there is a user module with that same name, `use XYZ;` needs to refer to the bundled one because otherwise other bundled code or other library code (say in another mason package) depending on that bundled module will break.

Pros:

Cons:

Compiling `userInsteadOfStandard/foo2.chpl` would result in this compiler output:

B: Work with the user's module
The idea here is that there can only be one module with a given name, but that the user's code needs to keep working if possible, and therefore the user's module should be used. Note that this means that other bundled code or other library code (say in another mason package) depending on that bundled module will break because they will try to use things from the bundled module that don't exist in the user's module. This is what we have now for 2.1 & in my opinion this option is untenable.
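The breakage described above can be modelled in a few lines of Python. This is a toy model under stated assumptions: the symbol names (`helper`, `init`, `myProc`) and the module `BundledLib` are hypothetical, and real name resolution is far more involved. It only illustrates that when a single `XYZ` is chosen, any code written against the other `XYZ` ends up with unresolved symbols.

```python
def unresolved_uses(modules, uses):
    """List (user, module, symbol) triples whose symbol is absent from the chosen module."""
    return [(who, mod, sym) for (who, mod, sym) in uses
            if sym not in modules.get(mod, set())]

# Hypothetical contents of the two same-named modules.
BUNDLED_XYZ = {"helper", "init"}
USER_XYZ = {"myProc"}

# Hypothetical uses: bundled code relies on the bundled XYZ,
# while the user's foo2 relies on the user's XYZ.
USES = [("BundledLib", "XYZ", "helper"), ("foo2", "XYZ", "myProc")]
```

Choosing the user's module (`{"XYZ": USER_XYZ}`) leaves `BundledLib` unable to find `helper`, which is the failure mode of strategy B. Choosing the bundled module (`{"XYZ": BUNDLED_XYZ}`) instead leaves `foo2` unable to find `myProc`, so with only one `XYZ` in the compilation one side always loses.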
Pros:

* `module/packages/Buffers.chpl`)

Cons:

Compiling `userInsteadOfStandard/foo2.chpl` would result in this compiler output:

(The test as-written adds `proc`s to `./ChapelIO.chpl` to avoid the errors above, but I don't think that's reasonable to expect in the inadvertent-same-name case; more generally with a module `XYZ` this will work as long as the bundled `XYZ` is not used in the compilation).

C: Work with both
The compiler should work with two modules with the same name, where one is a bundled module, and one is not. This strategy was used in 557eba1 but has since stopped working on `main`. The idea here would be to improve the compiler in some way to support this.

Pros:

Cons:

* `Math`; that might work at first but will cause problems in an application that uses that library and also wants to use `proc sin` from the bundled `Math` library
* `XYZ` and depends on another Chapel library that later adds a top-level module named `XYZ`

Compiling `userInsteadOfStandard/foo2.chpl` would result in this compiler output:

About "there can only be one top-level module with a given name in a given compilation"
We have made a number of decisions recently that seem to double down on the idea that there can only be one top-level module with a given name. Here are a few examples:
From #7847 (comment) :
From #8470 (comment)
Issue #12923 proposed having different module search paths per-module, but this proposal was dismissed.
Issue #19312 concluded with the decision that it's not possible to define a user module that shadows an automatically-included symbol. That issue is saying that it's not generally possible to make a symbol with the same name as something from the automatically included modules; it is similar to the question of whether it's possible to make a module with the same name as a bundled module.
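If there can only be one top-level module per name, one way to surface conflicts is to scan every search directory up front and report names with more than one provider. The Python sketch below models that check; it is an illustration, not how the compiler detects ambiguity, and the `case_insensitive` option is a hypothetical knob added to mirror the `search.chpl` / `Search.chpl` situation from issue #23100.

```python
from collections import defaultdict

def find_name_conflicts(files_by_dir, case_insensitive=False):
    """Map each module name to its providing files, keeping only names with 2+ providers."""
    providers = defaultdict(list)
    for directory, files in files_by_dir.items():
        for filename in sorted(files):
            if filename.endswith(".chpl"):
                name = filename[:-len(".chpl")]
                # On a case-insensitive filesystem, Search and search collide.
                key = name.lower() if case_insensitive else name
                providers[key].append(directory + "/" + filename)
    return {k: v for k, v in providers.items() if len(v) > 1}
```

A tool built on this could emit a warning listing every provider of a conflicting name, so the user can rename their module before the choice of search order matters.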