Should we be able to extend a module from outside the module? #10796
Comments
At the risk of revealing myself to be the caveman that I am: What's a motivating use case for this feature? |
Very large programs, if I understand it correctly. Basically, a program has become large enough that you want to spread it across multiple files, but you still want all of its contents to get included via a single use. |
Note that you can get that behavior somewhat today due to our transitive use statements. However, this is understandably a little gross to some people. |
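A minimal sketch of the re-export approach described in the comment above. It assumes a recent Chapel release, where `use` is private by default and `public use` is what makes it transitive (when this thread was written, a plain `use` was transitive). The module names `A`, `B`, `Top`, and `Main` are hypothetical; in a real project `A` and `B` would each live in their own file.

```chapel
// Hypothetical modules; A and B stand in for the separate files of a large library.
module A {
  proc fromA() { writeln("A.fromA"); }
}

module B {
  proc fromB() { writeln("B.fromB"); }
}

// Top re-exports the symbols of A and B, so a single `use Top` reaches both.
module Top {
  public use A;
  public use B;
}

module Main {
  use Top;

  proc main() {
    fromA();  // visible through Top's public use of A
    fromB();  // visible through Top's public use of B
  }
}
```

Note that this leaves `A` and `B` as separate top-level modules; `Top` only forwards their symbols, which is the part some people find a little gross.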
If you take a look at a package like SciPy, you see that it's spread out over a lot of different sub-packages. In Python, you can |
(In case it's not clear by me putting this in the Icebox, I don't consider this an urgent issue by any means. I was mostly logging it because I remembered it during a conversation) |
If the only goal is to break a module across multiple files, I'd prefer to do so via an equivalent to C's
module M {
  proc foo() { ... }
  include "bar.chpl"; // defines a procedure bar()
  include "baz.chpl"; // defines a procedure baz()
}
|
Once you say "subpackages" that implies to me something like hierarchical structure (say, modules defined within other modules), but I don't read this issue as relating to that question at all. It seems more related to defining something outside of a module's scope than it does to how multiple modules relate to one another (?). |
(Since @BryantLam and @LouisJenkinsCS gave this a thumbs-up, I'm curious about their answers to "What's a motivating use case for this feature?" too...) |
I'd worry a little that we would have too many strategies with that plus |
These all seem to have very different roles to me:
Specifically, I don't think any of these can be used to do what the others are intended for, so don't view them as distinct strategies for achieving a given thing. |
I filed issue #10909 to capture the desire for an |
I'm okay with either this or When given a choice, I'm in favor of whatever is more explicit, provided it doesn't lead to excessive boilerplate. #10909 seems okay to me but I'm already accustomed to writing |
One advantage to extending the module is that it makes the module hierarchy in each file apparent and clear. If I have a package:
What gets Whereas the alternative:
has no ambiguity. ... Though this does muck up the idea of file-level modules and why implicit behavior can constrain language design (or put more cognitive burden on a user). |
Me too, so I may share any biases here... I realize that your questions may be partially rhetorical, but to try and answer them:
The idea in issue #10909 would be that the In answering the other questions, I'll be addressing a few potential misunderstandings that I've been mildly worried about (from side conversations) lately:
Don't think of Chapel as "introducing an automatic module at every file scope." Rather, think of it as "If a file that's been supplied to the compiler contains top-level code other than comments and w.r.t. the interaction with the
module A {
}
include "B1.chpl";
you'd get a file-scope module named
module A {
}
proc getTime() { ... }
You've used this phrase a few times recently which concerns me that there may be a misunderstanding. I believe that the only global namespace Chapel has is the one that defines the names of all the top-level modules, and for this reason, I tend not to use the phrase "global namespace" w.r.t. Chapel programs (in fact, I try to avoid using "global" at all, though not always successfully...). What do you mean when you use the phrase? Does it show up in documentation somewhere? To attempt to answer the question, I'm assuming that if |
To state my reservations for the feature request proposed in this issue: Since all Chapel code is defined within the context of some module, code like the following makes me nervous:
module Outer {
  module Inner { ... }
  proc Inner.additionalFunction() { ... }
}
Specifically, my intuition upon reaching the It also seems like it introduces a bit of an ambiguity to the reader since seeing a declaration in isolation:
proc Foo.bar() { ... }
It's hard to tell whether this is adding a secondary method to a class/record |
But the same can be said about making a call to bar in a complex program (where the definition of Foo is far enough away that the answer to the question "what is Foo?" is not obvious). We already use the same syntax to access "a function defined in another module" and "a method on an instance". It seems like allowing their definitions to follow the same pattern as well is removing a special case. |
I have the opposite reaction: that it would be adding a special case. While it's not always obvious what specific routine a call However, it also seems different to say "I understand that
proc Foo.bar() { ... }
if
use Foo;
bar();
Whereas if
var myFoo = new Foo();
myFoo.bar();
This seems like it's adding a new flavor of syntactic ambiguity (and, in my mind, unnecessarily, due to the lack of motivating use cases and its other drawbacks like my mental churn around "I'm defining this procedure in module Bar but it's not actually a part of module Bar"). Maybe put a different way: Currently we don't have a way of "injecting" new code into a module from outside of that module and I'm not convinced that we should add such a capability because I think it adds complexity / mental churn, both to the interpretation of Chapel programs, and to its implementation (e.g., I'm anticipating what it will do to the resolution rules if I use module Taken to the extreme, would we want to support things like:
module M {
}
module M2 {
  config const M.verbose = true; // add a config const, not to this module, but to M
  class M.C { ... } // add a class definition not to this module, but to M
  enum M.color { red, green, blue }; // add an enum not to this module, but to M
}
My reaction is "ugh, no way" because it feels like this is going the opposite direction of well-structured programming (granted, the |
But we get that behavior with secondary methods defined in modules outside of where the type is originally defined, so the machinery is likely already there. But maybe that argues that we made a mistake in allowing secondary methods to be defined in that way? I'm mostly just exploring the thinking, I don't know if I actually think we should do this. |
I'd argue that such cases inject the secondary method into the type itself (as potentially governed by the module in which the secondary method is defined), not into the module defining the type. Looking at a concrete example:
module M1 {
  var g1: int;
  class C { ... }
}
module M2 {
  var g2: int;
  proc C.foo() {
    ...g1... // illegal without a `use` of `M1` or `M1` prefix because C.foo is defined in module M2
    ...g2... // legal, through normal lexical scoping
  }
}
I'd consider Conversely, if we permit code to be injected into another module, my assumption is that accesses to globals would behave as follows:
module M1 {
  var g1: int;
}
module M2 {
  var g2: int;
  proc M1.foo() {
    ...g1... // legal because foo() is actually being defined in M1
    ...g2... // illegal because foo() is actually being defined in M1 so isn't within the lexical scope of M2
  }
}
This just seems weird to me, not to mention challenging to implement (we have enough scope resolution problems as it is). |
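For reference, a compilable variant of the first (currently legal) example in the comment above, as a minimal sketch. The field `x`, the initial values, and `main` are additions for illustration only, and `M2` includes a `use M1` so that both `C` and `g1` are visible there.

```chapel
module M1 {
  var g1: int = 1;
  class C {
    var x: int = 10;  // hypothetical field, added so foo() has something to read
  }
}

module M2 {
  use M1;             // brings C and g1 into scope within M2
  var g2: int = 2;

  // Secondary method on C, defined in M2 rather than in M1.
  proc C.foo(): int {
    // x is C's field; g1 is reachable only because of the `use M1`;
    // g2 is reachable through normal lexical scoping in M2.
    return x + g1 + g2;
  }

  proc main() {
    var c = new C();
    writeln(c.foo());
  }
}
```

The point of the sketch is that `foo()` resolves `g2` lexically from `M2`, which is the behavior that would flip if the procedure were instead considered part of `M1`.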
Your syntax choice in the recent examples is a little weird to me. I would expect something more like:
module M {
  var g1: int;
}
module M {
  var g2: int;
  proc foo() {
    ...g1... // legal
    ...g2... // legal
  };
}
and would disallow extensions via definition-through-module-prefix. I don't consider modules/namespaces to be similar at all to types, so the syntax choices between both don't have to be similar either. Edit: I just realized the original post is what you were arguing against. I think I agree; it's a little extreme whereas this post is more of an easy lift (without consideration for file-resolution concerns). |
I'm counting the change of your thumbs-up emoji into a confused emoji as my victory for the day. :) If I'm understanding your example correctly:
module M {
  var g1: int;
}
module M {
  var g2: int;
  proc foo() {
    ...g1... // legal
    ...g2... // legal
  };
}
I think you're saying that a Chapel program should be able to define multiple top-level modules with the same name and that their contents should all be unioned into a single module with that name? This seems pretty weird to me, though it does address my two main concerns about this issue's proposal. What are the motivations for it? Is there precedent for it in other languages? Trying to explain why I think it's weird: Let's say you and I are developing two modules independently and they just happen to have the same name. It seems odd that they would get merged into one thing since they're logically independent pieces of code. And it seems potentially dangerous in that they may interfere with one another once merged (What if they both define a top-level config or type with the same name? What if they each provide overloads that are problematic in the presence of the others'?). Instead, I'd expect this case to generate a "Hey, you have two top-level modules of the same name -- do something about that!" type of error, similar to what should happen when declaring two variables or two classes of the same name at the same scope. One other more minor technical concern is what the behavior of the following should be:
module M {
  writeln("In first module M!");
  proc deinit() {
    writeln("Tearing down first module M!");
  }
}
module M {
  writeln("In second module M!");
  proc deinit() {
    writeln("Tearing down second module M!");
  }
}
where I suppose one answer is "both modules may not define top-level executable code / functions with the same name, including deinit()" (so this would be an error not because there are two module M's but because both try to define the same function(s)). |
I don't think this code is an accurate comparison:
module M1 {
  var g1: int;
  class C { ... }
}
module M2 {
  var g2: int;
  proc C.foo() {
    ...g1... // illegal without a `use` of `M1` or `M1` prefix because C.foo is defined in module M2
    ...g2... // legal, through normal lexical scoping
  }
}
In your module example, you're accessing symbols at the scope that is being extended. In this example, you're trying to access symbols at the outer scope of the scope that is being extended. I don't think you'd argue that C.foo shouldn't be allowed to see C's fields, or methods defined on C in the original location. Similarly, extensions to the module should be allowed to see its globals and other functions. In this argument, I'm treating the module like a singleton class (fields easily correspond to globals at the module scope, and module level functions to methods, especially when referred to from outside the module scope). |
In proposing the strategy I did at the beginning of this thread, I was not intending to eliminate strategies like C++'s namespaces. The overall strategy is definitely separable from whether we should allow any way of extending a module that has already been defined (perhaps it should have been split off into two separate threads, though I suspect the discussion would have taken place on one anyways). The similarities between modules and singleton classes made me wonder if strategies that were already in place should be extended. |
Oh, I think I misunderstood your previous analogy to secondary methods then. I think I understand it better now that you've made the connection to singleton classes. Given that perspective, how about this class-based analogy in which we try to define a secondary method on one class from within a distinct class:
class C {
  var x: int;
  proc foo() { ... }
}
class D {
  var y: int;
  proc C.bar() { // inject a method `bar()` into C
    ...
  }
}
This is something we don't currently support, and I think this was a good choice (e.g., it leads to questions about whether C.bar() can refer to [addendum: And while it's possible to write code outside of any classes to add a secondary method, it's not possible to declare Chapel code that exists outside of any modules (currently at least... and I'd be reluctant to change that)]. |
I also want to get back to motivating examples, though. Without them, I feel like we've spent a lot of time on a reasonably esoteric topic without a concrete "I'd like to use it for this" use case in hand to motivate the discussion. If the only motivation is to be able to break a module into multiple files, then I think the |
I think that's reasonable :) I don't have a more specific example myself, so I think we can fall back on Bryant's earlier comment that either strategy would work for him (so we can go with the |
I think the original motivating use-case does want something hierarchical like submodules. I believe submodules would work for @buddha314's example of SciPy with the exception of submodules being constrained to a single file (SciPy is ~270k lines of Python code, for reference). From what I can tell, an
// Top.chpl
module Top {
  include "A.chpl"
  include "B.chpl"
}
// A.chpl
module A {
  var x = 1;
}
// B.chpl
module B {
  var x = 2;
}
// User code
use Top;
writeln(Top.A.x); // 1
writeln(Top.B.x); // 2
A mechanism developed specifically for defining submodules in separate files might look cleaner in the end (using module names like |
I'm not up to date on every detail in this issue, but I think that #13524 combined with #13979 addresses the need adequately. The difference between that combination and what this issue originally proposes is that the submodule/re-export approach assumes that the contents of a namespace (i.e. module) are determined by the author(s) of that module, while in this issue those contents could be extended arbitrarily. However, both would allow splitting up large projects into many files. |
A quick search wasn't showing me an issue for this, though I know it's been mentioned in passing before by at least @buddha314.
Today, a module can only contain symbols defined within its braces. Other languages provide the ability to extend namespaces, but Chapel hasn't typically done so. Is this something we should provide?
Potential syntax:
Similar to how secondary methods can be defined using the type name as a prefix, we could use the module name as a prefix when defining a function. For instance:
We would want to carefully consider the impact of this on private symbols - my thinking is that symbols defined outside the original scope of the module cannot access symbols that are private to the module, unless such private symbols are defined at a similar scope, e.g.
However, that may remove some of the desire for this feature.