Method syntax #494
Comments
Re 3, 4: It might also be worth considering whether the receiver can be deduced, like in C++'s "Deducing |
Do you mean this proposal? I will look at it. |
Precisely. It's been pretty universally popular. It allows elegant mixins and CRTP, and recursive lambdas. |
I think this shows some utility to planning for an entire type, and not just part of a type. But I hesitate to generalize too far here -- IMO a number of this proposal's advantages are somewhat specific to solving problems created by C++, and I'm unsure Carbon will end up benefiting to the same degree. Still, definitely a good thing to consider ahead of time rather than after the fact. |
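I would actually like to suggest a higher level concern I have with the method syntax. I would find all of these much more appealing if we could move the method name prior to the receiver. Specifically, I think we should strive to have an especially easily skimmable structure for APIs:
struct IntContainer {
  // Non-methods for building instances
  fn MakeFromInts ... -> IntContainer;
  fn MakeRepeating ... -> IntContainer;
  // Methods
  fn Size ... -> Int;
  fn First ... -> Int;
  fn Clear ...;
  fn Append ...;
}
Regardless of which keyword (although I suggest using `fn` for both), I really like the next thing being the name of the thing to ease scanning. Having to skip over a receiver (using any of the above syntaxes) for me really lessens the readability of the API.
What do others think? If we want to consider that, what syntactic approaches should we consider? The syntax from "Deducing this" works this way, but I'm interested in whether there are other options. In particular, I continue to find that the receiver being in the "pattern match" part of the signature doesn't fit well, and I somewhat prefer it being written separately if possible. However, I wonder if my opinion on that would change if we wrote it as part of the implicit list? From the original example in the top comment:
fn Set[this Self* self](Int n);
Curious about thoughts on this approach as well. |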
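For comparison, here is a minimal sketch of the same skimmable shape in Rust, where every declaration starts with `fn Name` and the receiver (if any) is the first parameter after the name. This is an analogy only, not Carbon syntax: the field `items`, the element type `i64`, and the snake_case names are assumptions made for the example.

```rust
// Hypothetical Rust analogue of the IntContainer sketch; not Carbon syntax.
#[derive(Debug)]
pub struct IntContainer {
    items: Vec<i64>,
}

impl IntContainer {
    // Non-methods for building instances: no receiver parameter.
    pub fn make_from_ints(ints: &[i64]) -> IntContainer {
        IntContainer { items: ints.to_vec() }
    }

    pub fn make_repeating(value: i64, count: usize) -> IntContainer {
        IntContainer { items: vec![value; count] }
    }

    // Methods: the name still follows `fn`; the receiver comes after it.
    pub fn size(&self) -> usize {
        self.items.len()
    }

    pub fn first(&self) -> Option<i64> {
        self.items.first().copied()
    }

    pub fn clear(&mut self) {
        self.items.clear();
    }

    pub fn append(&mut self, value: i64) {
        self.items.push(value);
    }
}

fn main() {
    let mut c = IntContainer::make_repeating(7, 3);
    c.append(9);
    println!("size = {}, first = {:?}", c.size(), c.first());
    c.clear();
    println!("size after clear = {}", c.size());
    println!("{:?}", IntContainer::make_from_ints(&[1, 2]));
}
```

Whether a declaration is a method is visible only from its first parameter, which is the trade-off being weighed here: names line up, but the method/non-method distinction moves to the right of the name.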
Chandler~
I really like `fn Name` as the invariant. Makes scanning much nicer.
Matt
…On Fri, Apr 23, 2021 at 5:51 PM Chandler Carruth ***@***.***> wrote:
I would actually like to suggest a higher level concern I have with the
method syntax.
I would find all of these much more appealing if we could move the method
*name* prior to the receiver. Specifically, I think we should strive to
have an especially easily skimmable structure for APIs:
struct IntContainer {
  // Non-methods for building instances
  fn MakeFromInts ... -> IntContainer;
  fn MakeRepeating ... -> IntContainer;
  // Methods
  fn Size ... -> Int;
  fn First ... -> Int;
  fn Clear ...;
  fn Append ...;
}
Regardless of which keyword (although I suggest using fn for both), I
really like the *next* thing being the name of the thing to ease
scanning. Having to skip over a receiver (using any of the above syntaxes)
for me really lessens the readability of the API.
What do others think?
If we want to consider that, what syntactic approaches should we consider?
The syntax from "Deducing this" works this way, but I'm interested in
whether there are other options. In particular, I continue to find that the
receiver being in the "pattern match" part of the signature doesn't fit
well, and I somewhat prefer it being written separately if possible.
However, I wonder if my opinion on that would change if we wrote it as part
of the *implicit* list? From the original example in the top comment:
fn Set[this Self* self](Int n);
Curious about thoughts on this approach as well.
|
How is the distinction between method and non-method functions intended to be indicated? IIUC the C++ equivalent is |
The thing that makes the most sense to me is to use the syntax that specifies the receiver's type. Put differently, I'm suggesting that (3) in the original summary should not be omitted (at a minimum), and that whatever syntax we use for this is sufficient to indicate a method vs. a non-method. We could alternatively rely on a separate bit of syntax like that in (5), but I prefer making the receiver type explicit and simply attempting to reduce the verbosity of that type.
I'm not sure. Maybe at least a little. My thoughts above about having names immediately follow the introducer come from the fact that types with reasonably large public APIs would have a pretty large list of these, with the reader frequently skimming to find a specific name in the API.
I find that these kinds of public APIs somewhat rarely have instance members as direct parts of the API. Certainly, many style guides advocate against it. Places where instance members are part of the API tend to be fairly small and/or local types that are focused on storage more than a rich API. So I don't see it as critical to align instance variable syntax structures with functions, or to try to get them to have a single name immediately after an introducer.
A slightly more interesting case would be constants, which do form parts of public APIs more often. But at least in my experience, I've not found them to be so numerous that this kind of skimming structure seems essential.
However, this does raise one thing we may want to think about either now or in the future: do we want to have something more like Swift's computed properties, which essentially allow the data member syntax to be used while having an actual API with getter and setter logic? If we want to add such constructs, then those would seem more likely to be valuable to harmonize with the function declaration syntax to ensure we can form easily skimmed and read APIs. |
Interfaces will most commonly have methods, but they also support static functions and associated types. Right now, associated types are spelled: |
I think that could wind up being confusing, because the square brackets seem very much part of pattern matching, even though they're not part of the pattern syntax per se. In see a "conventional" usage like |
Here is a list of options that conform to @chandlerc's constraint that the method names line up, or at least mostly line up:
Proposal "AEHJL" has the property that the names all start in columns 5, 6, or 7, depending on whether the function is static, an accessor, or a mutator, which might be close enough since it is still pretty easy to visually read the names off:
Option "B" could be adopted as long as the new introducer was ~2 characters long, with the actual receiver type specified as the first parameter type.
We have a couple of variations on marking this first parameter with a keyword, as in C++'s "Deducing this" proposal.
Then there is @chandlerc's suggestion of moving that into the implicit parameters, which @geoffromer has raised concerns about, but which does make the declaration look more similar to the call.
We could also use a delimiter in the parameter (or implicit parameter) list, but this might be a bit subtle:
Other ideas to consider? Are any of the options more appealing than the others? |
Let me take a step back for a minute:
Do we expect there to be other differences between methods and other functions, besides the difference in declaration and call syntaxes? In particular:
|
I would like the answer to be yes.
My guess would be yes.
I currently have interfaces use different declarations for methods vs. other functions, but that is primarily because they are expected to be treated differently as the result of the answers to these other questions.
My guess would be no, except through generics.
My understanding is that we were planning to make the answer to this question be "yes." There is another difference: the receiver parameter to a method varies covariantly in inheritance unlike other parameter types. This is arguably part of the fact that methods are doing dynamic dispatch on the receiver's type. |
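As a rough illustration of the point about dispatching on the receiver's type, here is a sketch in Rust rather than Carbon. The trait `Shape` and the types `Square` and `Rect` are hypothetical names chosen for the example. A function with a receiver can be called through a dynamically typed reference, and in each implementation the receiver's type is the implementing type; a function without a receiver has nothing to dispatch on and must be named through a concrete type.

```rust
// Hypothetical trait and types for illustration; Rust stands in for Carbon.
trait Shape {
    // Method: has a receiver, so a call can dispatch on the receiver's
    // dynamic type.
    fn area(&self) -> f64;

    // Non-method: no receiver, so it is excluded from dynamic dispatch
    // (the `Self: Sized` bound keeps the trait usable as `dyn Shape`).
    fn name() -> &'static str
    where
        Self: Sized;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    // Here the receiver has type `&Square`.
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    fn name() -> &'static str {
        "Square"
    }
}

impl Shape for Rect {
    // Here the receiver has type `&Rect`.
    fn area(&self) -> f64 {
        self.0 * self.1
    }
    fn name() -> &'static str {
        "Rect"
    }
}

// Each call to `area` is resolved at run time from the receiver's type.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area = {}", total_area(&shapes));
    println!("{} and {}", Square::name(), Rect::name());
}
```

Rust has no class inheritance, so this only approximates the covariance described above: the receiver type changes per implementation while the other parameter types stay fixed.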
Option Q: other symbols to indicate passing this by pointer vs. by value
Option R:
|
In the open discussion slot, we did talk about how different it was that we might automatically take the address of the receiver when passing it into the method. One alternative is that you actually pass pointers to mutating methods:
Question: This business of the type of the receiver affecting whether we take its address sounds a lot like one of the things references do. Do we want to possibly add references? It would also help in other places such as custom If we don't want to go that far, perhaps there is a way, when binding a name in a pattern, to indicate that we implicitly take the address. Possible syntax:
We would then use that syntax on the
|
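The two call-site styles discussed here can be sketched in Rust, which happens to support both for methods. This is an analogy under stated assumptions, not Carbon semantics: `Counter`, `get`, `set`, and `set_via_pointer` are hypothetical names, and Rust's `&mut` borrow stands in for a Carbon pointer.

```rust
// Hypothetical type for illustration; Rust stands in for Carbon here.
struct Counter {
    n: i64,
}

impl Counter {
    // Read-only receiver: the method cannot modify the caller's object.
    fn get(&self) -> i64 {
        self.n
    }

    // Pointer-like receiver: the method mutates the caller's object.
    fn set(&mut self, n: i64) {
        self.n = n;
    }
}

// A non-method with an ordinary pointer-like parameter: the caller always
// has to write the address-of explicitly.
fn set_via_pointer(c: &mut Counter, n: i64) {
    c.n = n;
}

fn main() {
    let mut c = Counter { n: 0 };

    c.set(3); // method call: the address of `c` is taken implicitly
    println!("{}", c.get());

    (&mut c).set(4); // the alternative: pass a pointer to the mutating method
    println!("{}", c.get());

    Counter::set(&mut c, 5); // same method, called like a plain function
    set_via_pointer(&mut c, 6); // ordinary parameter: address-of is required
    println!("{}", c.get());
}
```

The implicit form applies only to the receiver; `set_via_pointer(c, 6)` without `&mut` would not compile. That asymmetry between the receiver and other parameters is the behavior under discussion.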
Option S: Use different introducers for the different cases. For example,
|
I think ACFIL with AEHJL as optional syntactic sugar would be an interesting choice. That is, the full method syntax is:
... but in contexts where
Within a class definition, you'd usually be able to use the shorthand, and you get an easily scannable list of methods. Outside a class definition, we can work out which class we're defining a member of based on the receiver type, so the syntax is not much more verbose than a declaration without a receiver type (
... in case people want to be fully explicit or want to use a receiver type other than My main hesitation here and with AEHJL in general is using the same identifier |
I find option L to be quite problematic, for two main reasons: First, the distinction between Second, the primary purpose of option L seems to be to enable callsites to be agnostic about whether a parameter is passed by value or by pointer. I think this is an important problem that is well worth solving, but option L solves it only for the special case of self parameters, and I don't see how we could extend or generalize it to cover other function parameters, either now or in the future. |
There has been a lot of discussion in open sessions here, and I think there is a certain amount of consensus emerging:
I think the above have growing alignment, but if anyone disagrees, chime in. Beyond this, I think there are two big questions we keep circling around. First: where does the parameter go after the name? I think there are really three clear options here:
I suspect choosing between these three is something the leads will need to do, likely in conjunction with #565. The second remaining question (with my examples using (2) above just because that's what I've been using most recently): how do we handle different ways in which the object parameter might be passed? We've been currently orienting around immutable values (which can't even have their address taken) and pointers, with potential to expand later. There seem to be two options here:
I lean towards (a), and potentially adding some support for a pattern to bind after a dereference but not removing the fundamental fact that the parameter is a pointer and the However, @zygoloid has argued for (b) because it seems awkward for the only thing to select between implicitly taking the address or not be whether the object parameter of the method looked up is a pointer (let me know if I've gotten this wrong). For example, if we want to allow deduction of the object parameter type, this deduction can't be used to select between pointer and not pointer. While I'm somewhat nervous about (b), and I actually rather like (a) in several respects, I do understand the concern here. Ultimately, I think either (a) or (b) would be fine, and we should really pick one sooner for now. Maybe (b) is the least bad option. The other options here I think largely go down the path of putting (nearly) the full object category into the type system with something like references, which is complexity that I'd very much like to avoid at this stage and so that's why I'm leaning towards either (a) or (b). |
The problem that we are trying to solve here is that we want to be able to pass We could quibble about whether That being the case, I think the key distinction between (a) and (b) is that (b) can be plausibly generalized to arbitrary function parameters, whereas generalizing (a) in the same way would be highly problematic (to put it mildly). And it would be very poor ergonomics to have some way of opting into pass-by-reference for other parameters that was different from the way of opting into pass-by-reference for
It seems to me that (a) is very much going down the path of putting the full object category into the type system: it is using a parameter's type to signify whether the function has access to the corresponding argument object, or only the argument's value (and that's precisely why generalizing it would be so problematic). It may only be taking one step down that path, because it's limited to As a syntactic side note, we seem to be moving toward a convention where the nature of a variable binding is determined by an initial keyword, as with |
It seems desirable to me that the object parameter uses the same passing and matching rules as any other parameter, in both the call site and the function declaration, to the extent possible: a call Option (a) addresses this in the function definition: if the way we write functions that mutate caller-owned state is by accessing that state with a pointer in the function definition, then that's how we should mutate a caller-owned object parameter. Option (b) addresses this in the interaction between the caller and callee: if we want to be able to write a method such that A hybrid option (c) could give us both of these at once: we could reflect exactly how the parameter is passed (the
Having said that, I think it would make sense to add dedicated keywords for the various forms of argument passing semantics that we want (eg,
|
I actually really like this. I had similar thoughts, but kept ending up in a less good place. Your example here:
I'm actually pretty happy with this. Especially the simplification that the And @geoffromer is right -- this really is encoding the value category into the type system really explicitly -- L-values in the type system are pointers and somewhat have to be here. I still think it's worth seeing how far we can get with the syntactic difference here and just taking the address of the implicit object parameter when the method clearly asks us to. |
I think the only major reservation I have about this approach is that in order to figure out the argument type that a function expects, it's not sufficient to look at the parameter type -- you also have to look at the parameter's introducer. However, I think that problem won't really come into focus until we allow
That somehow makes me feel like we're on the right track. So this approach seems fine for present purposes (i.e. the self-only case), and plausible if not outright intriguing as a basis for future generalization.
I don't think we're really "encoding the value category into the type system" here, we're just saying that the address-of operation only works on lvalues, and the dereference operation always produces lvalues, both of which we basically already knew. In any event, the key point is that we're not using the type system to express the requirement that the argument expression is an lvalue (or the fact that its address is taken implicitly). |
I think with the resolution of #565, this question has also converged on @zygoloid's suggestion:
We can (and should) still stay open to exploring other argument passing paradigms including things like I'm going to close this out for now as I think this has pretty clearly settled amongst the leads at this point, but happy to re-open if needed or problems arise. =] I'm also incorporating this into a proposal already, if the structs proposal doesn't get there first. |
Carbon has been using a `me: Self` or `addr me: Self*` deduced parameter to mark a function as a method, as decided in #494 and implemented in #722. This does not match existing languages, so this proposal switches `me` to `self` to match Python, Rust, and Swift. Co-authored-by: Richard Smith <richard@metafoo.co.uk>
Assuming a function syntax like:
(The introducer `fn` is the subject of #463.) What should the syntax for methods be? Methods use a different calling syntax (`x.F(n)`) and need to distinguish between taking the receiver (`x`) by value or by pointer.
There are 5 things to decide:
And here are our options:
-1- the introducer:
A. The same introducer as functions (`fn`, `func`, or `function` as determined by #463)
B. A new introducer, like `method`.
-2- brackets around the receiver declaration:
C. `(`...`)` Parentheses, suggesting a parameter list
D. `[`...`]` Square brackets, suggesting an implied or different kind of parameter
E. Omitted
-3- receiver type:
F. `Self` or pointer to `Self` (however we decide to write that, `Self*` or `Ptr(Self)`)
G. Something shorter like `Me` or pointer to `Me` (however we decide to write that, `Me*` or `Ptr(Me)`)
H. Omitted
-4- receiver name:
I. An identifier specifying the name to bind the receiver to
J. Omitted, the receiver will use some reserved word
-5- :
K. A dot `.`
L. A dot `.` if the receiver is passed by value, an arrow `->` if the receiver is passed by pointer
M. Omitted
Examples:
BCFIM: `method (Self* this) Set(Int n);`
ADGJK: `fn [Me*].Set(Int n);`
AEHJL: `fn ->Set(Int n);`
Note: One of the options F, G, or L is needed to disambiguate calling by value vs. pointer.
Alternative: Following C++'s "Deducing `this`" proposal, we could also consider marking the first parameter with a keyword to indicate it is the receiver. That proposal uses `this`, as in:
(This alternative was suggested by @tkoeppe.)