Implement function delegation in rustc #3530

Open

petrochenkov
Contributor

@petrochenkov petrochenkov commented Nov 15, 2023

Summary

This RFC proposes syntactic sugar for delegating the implementations of functions to other, already implemented functions.

There were two major delegation RFCs in the past, the first RFC in 2015 (#1406) and the second one in 2018 (#2393).

The second RFC was postponed by the language team in 2021 (#2393 (comment)).
We hope to revive that work.

How this proposal is different from the previous ones:

  • This proposal follows the "prototype first, finalized design later" approach, so it's oriented towards the compiler team as well, not just the language team.
    The prototyping is already in progress, and we are ready to provide resources for getting the feature to production quality if accepted.
  • This proposal takes a more data-driven approach and builds the initial design on relatively detailed statistics about the use of delegation-like patterns collected from code in the wild. The resulting design turns out closer in spirit to the
    original proposal by @contactomorph than to later iterations.

This proposal is also the subject of an experimental feature gate: rust-lang/rust#117978.

Rendered

petrochenkov added a commit to petrochenkov/rust that referenced this pull request Nov 16, 2023
In accordance with the [process](https://github.com/rust-lang/lang-team/blob/master/src/how_to/experiment.md).

Detailed description of the feature can be found in the RFC repo - rust-lang/rfcs#3530.

TODO: find a lang team liaison.
Add a possible use of `#[refine]`
Fix a dead link
@joshtriplett joshtriplett changed the title from "eRFC: Implement function delegation in rustc" to "Implement function delegation in rustc" Nov 22, 2023
@traviscross traviscross added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 22, 2023
@traviscross
Contributor

@petrochenkov: In the T-lang meeting, some people who had read the RFC were feeling good about it, and there was some interest in proposing FCP merge on this as a normal RFC in addition to going forward with the experiment while the FCP is pending.

However, some text in the body of the RFC and in the PR description here describes this RFC as an experimental one, which, as you note, is no longer a thing.

If you're ready for this RFC to move forward as a normal RFC, please remove the language about the RFC being experimental.

@Kobzol

Kobzol commented Nov 23, 2023

My 2 cents (as a maintainer of https://crates.io/crates/delegate, which is probably the most used crate for delegation, in combination with https://crates.io/crates/ambassador):

There are many configuration options/knobs and variants of delegation that are important to different users, which can be seen from the myriad of options implemented in the delegate crate over the years (and there are still some use-cases in our issue tracker that haven't even been implemented yet).

Here is an example of a few such things:

  • Should the forwarding functions be inline or not? Should they be marked with some other custom attributes?
  • Should their return values or input parameters be coerced somehow? Into, TryInto, AsRef etc.? Should the result be unwrapped?
  • Do I want to change the delegated function name?
  • Should additional expressions be passed to the forwarded methods?
  • Should await be called on the forwarded methods?

It's probably not practical to add support for all these use-cases at the language level. A language-level solution would hopefully solve the most common use-cases, and ideally also provide some extension points to go beyond these basic use-cases, but that might be quite challenging. If the language doesn't solve the problems of Rust users, they will just go back to using a third-party crate. I think that for gauging which problems are actually the most important, it might be better to take a look at the crates that use delegate or ambassador, and examine their usage patterns, rather than scanning the source code and trying to derive possible delegation patterns from that (although having this code-mining platform available will definitely be incredibly useful too!).

From my view, the most important aspect of delegation (and one that is not easily possible with third-party crates at the moment) is the automatic enumeration of things that should be delegated (e.g. the signatures of methods of a trait), rather than the "forwarding" itself. It's quite common to delegate a trait impl to a field of a structure (or to a field of each enum variant), but since we don't have the ability to query the existing signatures, we have to both enumerate the method names and repeat their signatures.

Thus for me, a solution to delegation in Rust could be to implement a different language feature - some form of reflection that would give us the ability to code-gen the signatures of methods based on their names, and also the signatures of all methods of a trait. If this was available, any third-party crate (such as delegate) would be able to implement pretty much any delegation pattern without any further support from the language. Right now, delegate doesn't know the signatures of trait methods, so the user has to repeat the signatures, which is quite annoying. Repeating only the names of the functions would be already much better. With some form of reflection being available, we could provide the low-level building block in the language (and reflection is also useful for many other things), but leave the many details to third-party crates that are much easier to iterate upon than the language itself.

That being said, if we decide to go with the "delegation in language" way, I think that it should support the automatic enumeration of trait methods, and perhaps as an MVP also just enable delegating traits with some simple syntax. I like how it works in Kotlin, where you specify the interface (so a trait) that you want to delegate, and the expression/field that you want to forward that interface to. From my experience at least, this is the most common use-case for delegation.

So e.g. it would be great if something like this was possible to do with "language-level delegation":

struct Wrapper(Foo);

impl Bar for Wrapper by self.0 {
  // optionally allow to override the generated forwarding implementations
}
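For comparison, here is a minimal sketch of the forwarding code that such a trait-level delegation would stand for, written by hand in current stable Rust. The trait `Bar` with methods `name` and `area`, and the field `side` on `Foo`, are hypothetical names chosen only for illustration; the `by` syntax is a suggestion from this comment and does not exist in the language.

```rust
// Hypothetical trait and types, used only for illustration.
trait Bar {
    fn name(&self) -> String;
    fn area(&self, scale: u32) -> u32;
}

struct Foo {
    side: u32,
}

impl Bar for Foo {
    fn name(&self) -> String {
        String::from("foo")
    }
    fn area(&self, scale: u32) -> u32 {
        self.side * self.side * scale
    }
}

struct Wrapper(Foo);

// Roughly what `impl Bar for Wrapper by self.0 {}` would stand for:
// every signature is repeated and every body just forwards to `self.0`.
impl Bar for Wrapper {
    fn name(&self) -> String {
        self.0.name()
    }
    fn area(&self, scale: u32) -> u32 {
        self.0.area(scale)
    }
}
```

The repetition grows with the number of trait methods, which is the part that automatic enumeration would remove.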

petrochenkov added a commit to petrochenkov/rust that referenced this pull request Nov 23, 2023
@ericsampson

I think that for gauging which problems are actually the most important, it might be better to take a look at the crates that use delegate or ambassador, and examine their usage patterns, rather than scanning the source code and trying to derive possible delegation patterns from that.

I agree, that sounds like an additional useful source of data!

@petrochenkov, do you have any interest in the following approach? I suppose they’re not mutually exclusive—there could be language support for the common delegation scenarios, and eventually also add reflection support to enable crates to “fill in the rest” of the more complex/niche functionality. Plus, better reflection support would unlock a lot of other functionality for Rust.

Thus for me, a solution to delegation in Rust could be to implement a different language feature - some form of reflection that would give us the ability to code-gen the signatures of methods based on their names, and also the signatures of all methods of a trait

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 23, 2023
…shtriplett

Add an experimental feature gate for function delegation

In accordance with the [process](https://github.com/rust-lang/lang-team/blob/master/src/how_to/experiment.md).

Detailed description of the feature can be found in the RFC repo - rust-lang/rfcs#3530.

TODO: find a lang team liaison - https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/fn.20delegation.20liaison/near/402506959.
@petrochenkov
Contributor Author

petrochenkov commented Nov 23, 2023

@traviscross
I've removed the process disclaimer.
It's probably fine to treat this as a normal compiler & language RFC for an experimental feature; the text is relatively detailed.

But it will certainly need a second iteration with finalized syntax and other design choices before any attempts at stabilization.
I also didn't add some clauses typical for proper language RFCs, like guide-level explanation, high-level alternatives (e.g. reflection as mentioned by other commenters) or drawbacks.

@petrochenkov
Contributor Author

petrochenkov commented Nov 23, 2023

@Kobzol @ericsampson
I did look at delegate, ambassador and everything else in literature before collecting my statistics, that knowledge sort of guided which statistics to collect.

Regarding the specific suggestions:

Do I want to change the delegated function name?

This is supported - renaming.

Should the forwarding functions be inline or not? Should they be marked with some other custom attributes?

Extra attributes can be added because delegation item is still an item, and items may have attributes.
inline is an open question (front matter).
It should probably be added by default, but overridable.

just enable delegating traits with some simple syntax.
From my experience at least, this is the most common use-case for delegation.

Yep, the statistics also show that it's common, just not overwhelmingly common, see the numbers in postponed features.
So we'll certainly try to do it, but it is postponed as a second layer of sugar, until we implement the basic layer.

  • Should their return values or input parameters be coerced somehow? Into, TryInto, AsRef etc.? Should the result be unwrapped?
  • Should additional expressions be passed to the forwarded methods?
  • Should await be called on the forwarded methods?

That all falls under arbitrary argument pre-processing and result post-processing, and is therefore not supported because it does not fit into the syntactic budget for a built-in feature (see the "rejected features" and "syntactic budget" sections of the RFC).

If some cases are supported by the same method as SameUpToSelfType then it's good (that may be possible for the From/Into/AsRef/AsMut-like stuff), otherwise not supported.
The data collection pass can actually be extended to check which specific kinds of pre- and post-processing are common.

I guess there is a general "workaround" for a case in which most of the method body can be delegated - make a private delegated function, and a real function that calls the delegated one and does the necessary pre-post-processing.
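A minimal sketch of that workaround in stable Rust, with the forwarding function written out by hand where a delegation item would go. The types `Inner` and `Wrapper` and the method names `max` and `raw_max` are hypothetical, and unwrapping the result is just one example of post-processing.

```rust
struct Inner {
    values: Vec<i64>,
}

impl Inner {
    fn max(&self) -> Option<i64> {
        self.values.iter().copied().max()
    }
}

struct Wrapper {
    inner: Inner,
}

impl Wrapper {
    // Private forwarding function. With the proposed feature this part could
    // be a delegation item with renaming; here it is written by hand.
    fn raw_max(&self) -> Option<i64> {
        self.inner.max()
    }

    // Public function that calls the delegated one and adds the
    // post-processing that delegation itself would not express.
    pub fn max(&self) -> i64 {
        self.raw_max().unwrap_or(0)
    }
}
```

Only the thin public wrapper has to be written manually; the forwarding part stays mechanical.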

@petrochenkov
Contributor Author

Regarding reflection, I'm interested in seeing proposals, but I'm not sure it's even implementable in a general enough form with Rust type system, and the readiness timeline will be 5-10 years at least.
Even this proposal with its signature copying may open a can of worms that I'm not sure rustc is ready for. We'll see, I guess.

@Kobzol

Kobzol commented Nov 23, 2023

Regarding reflection, I'm interested in seeing proposals, but I'm not sure it's even implementable in a general enough form with Rust type system, and the readiness timeline will be 5-10 years at least.

Yes, sadly I also think that reflection is way off. So while I think that it would be a more general solution, if we want to get delegation faster, it makes sense to handle it specifically, and not wait for reflection.

@ericsampson

ericsampson commented Nov 23, 2023

Yeah for sure.

Just for clarity, I am very excited that this is being worked on, so thanks a ton!!!

The RFC is well-written, and I really appreciate the data-driven approach.

Cheers 😊

| Arg0Preproc.RefField | 136 | `&(mut) arg0.field` |

`self.field` is not even the most common transformation, usually the first argument is not
transformed at all, although delegation for static methods and free functions skews the statistics
Member

@RalfJung RalfJung Nov 30, 2023

What's a "static method"? Do you mean associated functions without self? I don't think I have seen them called "static" in Rust before. They surely don't have anything to do with static, which is a keyword in Rust after all.

I always thought that's what the method vs function distinction is about: methods have self, functions do not.

Member

To my knowledge, "functions" is a group that is made up of the disjoint groups "functions that take self" and "functions that do not take self". The former are called "methods" and the latter has no short name, although some other languages call a similar grouping "static methods".

Notably, both fn foo() and fn bar(self) use the fn keyword, which AFAIK means "function".

Member

@RalfJung RalfJung Nov 30, 2023

Okay, so "functions" is the union of "methods" and "not-methods" then, I see.

With a definition like

Associated functions whose first parameter is named self are called methods

I would argue a "static method" ought to be a kind of method, but a function that does not take self is not a method, so calling it a "static method" is very confusing. I was surely confused. ;)

Java calls them "static methods" but they also use the static keyword to declare them so that makes some amount of sense. Our static keyword has a different meaning though so IMO it makes little sense for us. In Java static methods are statically dispatched while regular methods are dynamically dispatched; that does not apply for us either.

Contributor Author

@petrochenkov petrochenkov Nov 30, 2023

Yeah, that's just my C++/Java past leaking, the terminology seems so natural I didn't realize it's not often used in Rust since it doesn't use static for this purpose.
Grepping the rust-lang/rust repo for "static method" shows that quite a few people stepped into the same issue though.

It's curious that you had nothing against newtypes, which are also never mentioned in the official docs :D

@SOF3 SOF3 Dec 1, 2023

In Java static methods are statically dispatched while regular methods are dynamically dispatched; that does not apply for us either.

Is that what static is supposed to mean? In Java, what affects whether the dispatch is dynamic would be final rather than static, and static not being dynamically dispatched is just a consequence of not having a receiver to dispatch on.

Member

Is that what static is supposed to mean?

No idea, it's just my mental model. I can't otherwise explain why these would be called static methods.

But in Java they have to be static methods since Java only has methods, not functions. In Rust, calling them any kind of method makes little sense, you can't even use method call syntax to call them.
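The distinction discussed in this thread can be sketched in a few lines. The type `Counter` and its functions are hypothetical; the point is only which call syntaxes each kind of associated function accepts.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    // Associated function without `self`: not a method.
    fn new() -> Counter {
        Counter { count: 0 }
    }

    // Associated function whose first parameter is `self`: a method.
    fn incremented(&self) -> u32 {
        self.count + 1
    }
}

fn demo() -> (u32, u32) {
    // Path syntax is the only way to call `new`.
    let c = Counter::new();
    // c.new(); // does not compile: `new` has no `self` parameter

    // A method accepts both method call syntax and path syntax.
    let via_method = c.incremented();
    let via_path = Counter::incremented(&c);
    (via_method, via_path)
}
```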

Member

@RalfJung RalfJung Dec 1, 2023

It's curious that you had nothing against newtypes, which are also never mentioned in the official docs :D

That's a standard term across many language communities. It also doesn't involve any unrelated keywords or so that would make it unsuited as a Rust term.

Copy link


The term "static method" also exists in many languages… probably in most OOP languages. Some of those languages also support functions that are not methods, such as C++, Python, and JavaScript.

In those three languages, a method is a function associated with a type, and a static method is a method that does not have a self/this argument.

None of those languages have anything like Rust traits, so the analogy is not precise.

Copy link


No idea, it's just my mental model. I can't otherwise explain why these would be called static methods.

Historically speaking, Java got the term static from C++, and in C++ dynamic dispatch is not the default, requiring a separate keyword (virtual); so that can't have been the original motivation for the term. I have no idea why they picked static, but it probably helped that static was already a keyword in C, albeit with an unrelated meaning.

(In truth, both C and C++ are guilty of overloading static to mean several different things depending on context.)

@joshtriplett
Member

joshtriplett commented Dec 5, 2023

If we end up doing data analysis of broader crates.io, I wonder how many of the ArgsPreproc.Other cases involve Pin? Delegating traits like AsyncRead/AsyncWrite currently sometimes requires invocations of Pin::new.

Those aren't likely to come up in your current data set, though.
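To illustrate the kind of pre-processing meant here, the sketch below uses `std::future::Future` as a stand-in for `AsyncRead`/`AsyncWrite`, since those traits live in external crates. `Wrapper`, `NoopWake`, and `poll_once` are hypothetical helpers; the relevant line is the `Pin::new` call, which is more than a plain `self.inner` projection.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Wrapper<F> {
    inner: F,
}

impl<F: Future + Unpin> Future for Wrapper<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        // Forwarding needs the field to be re-pinned first; `self.inner`
        // alone does not have the `Pin<&mut F>` type that `poll` expects.
        Pin::new(&mut self.inner).poll(cx)
    }
}

// Waker that does nothing, only needed to build a `Context` for polling.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls a future exactly once with the no-op waker.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

For example, `poll_once(&mut Wrapper { inner: std::future::ready(7) })` forwards to the inner future through the re-pinned field.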

@petrochenkov
Contributor Author

Design question: how exactly the body template is duplicated for list or glob delegation

Typically the target expression (i.e. to what we delegate) in a delegation item is very simple, like:

reuse a::b { self.field }
// or
reuse b::c { self.getter() }

However, it's possible that it may contain some entities with identity, like items or closures.
I would expect anonymous constants, closures and imports to be most useful in this context.

reuse prefix::{a, b, c} {
    use some::import; // import
    (
        self.field.map(|x| x.y), // closure
        [10; SIZE], // Anonymous constant
    )
}

This poses a question: how are all these entities duplicated when we desugar the list delegation
prefix::{a, b, c} or glob delegation prefix::* into multiple actual functions?

fn a { ... }
fn b { ... }
fn c { ... }

Will there be 3 separate imports, closures and anonymous constants with their own identities,
or will there be only one instance that all the generated functions refer to?

Due to implementation details of rustc (how DefId tree is built), the answer dictates the
compilation stage at which the desugaring happens.

Alternative 1: Duplication at token / AST level

Every generated function will contain its own version of items/closures/constants/etc.

This is closest to what would happen if we wrote the function bodies manually instead of delegating.

Each item clone will have its own identity - its own DefId.
That means the desugaring of list and glob delegations into single delegations must happen before
the definition collector runs, i.e. during macro expansion and import resolution, not even during
AST -> HIR lowering or right before it.
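Token-level duplication can be imitated today with `macro_rules!`, which gives a rough feel for this alternative. In the sketch below, the hypothetical macro `reuse_list!` pastes the same target-expression tokens into each generated function, so each function ends up with its own copy of the closure. This is only an analogy under assumed names (`Inner`, `Holder`, `Wrapper`), not how rustc implements list delegation.

```rust
struct Inner;

impl Inner {
    fn a(&self) -> u32 {
        1
    }
    fn b(&self) -> u32 {
        2
    }
}

struct Holder {
    y: Inner,
}

struct Wrapper {
    field: Option<Holder>,
}

// Hypothetical stand-in for `reuse Inner::{a, b} { target }`.
// `|this|` names the receiver so the target expression can refer to it.
macro_rules! reuse_list {
    ($($name:ident),+ => |$this:ident| $target:expr) => {
        $(
            fn $name(&self) -> u32 {
                let $this = self;
                // The tokens of the target expression are pasted here once
                // per generated function, closure included.
                ($target).$name()
            }
        )+
    };
}

impl Wrapper {
    reuse_list!(a, b => |this| this.field.as_ref().map(|x| &x.y).unwrap());
}
```

After expansion, `Wrapper::a` and `Wrapper::b` each contain a separate closure, just as if both bodies had been written manually.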

For glob delegation (reuse prefix::*) it means that prefix must be resolvable early, but that
seems fine because we are only going to realistically support glob delegation to traits
(reuse Trait::*).

Both list and glob delegation, including target expression cloning, essentially become macro
features.
That means we don't even need to parse the target block contents as Rust code if the delegation
list is empty (prefix::{} { random garbage tokens }), or if the glob refers to a trait without
items (EmptyTrait::* { random garbage tokens }).
It still probably makes sense to parse it (but not name-resolve it), then such code will be
equivalent to code under #[cfg(FALSE)] or any other code dropped by a macro.

Unfortunate detail: prefix still needs to be resolved for empty delegations, and therefore
stability checked as well. It means "delegation stems" will need to be preserved in HIR to reach
stability checking, similarly to import stems.

Drawback of this approach: name resolution will run multiple times on every copy of the target
block, even if there are no items inside it and all the results are the same.
Possible solution: better name resolution caching, not just for delegation, but for everything.

Alternative 2: "Semantic" duplication

Every generated function will contain references to items/closures/constants/etc canonically defined
in the single delegation item block.

Item definitions will be parented under DefId of the delegation item itself, not under DefIds
of functions that it produces.

I suppose this can work for imports, and other item definitions (e.g. structs).

However, I'm not sure how this is going to work for e.g. closures.
Type checking results for different functions generated from a delegation item may be different,
that's a significant design point.
Even in the simplest cases different bodies may differ in their use of Deref or DerefMut, for example.
Closure signatures are also produced by type inference, so in theory they may differ too, but that's
impossible if all the closures have a single identity.
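The Deref/DerefMut point can be made observable with a small sketch. `Tracked` is a hypothetical smart pointer that counts how it is dereferenced; the two hand-written bodies use the same target expression `self.inner`, yet type checking picks a different adjustment for each.

```rust
use std::cell::Cell;
use std::ops::{Deref, DerefMut};

// Hypothetical smart pointer that counts shared and unique dereferences.
struct Tracked<T> {
    value: T,
    shared: Cell<u32>,
    unique: Cell<u32>,
}

impl<T> Deref for Tracked<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.shared.set(self.shared.get() + 1);
        &self.value
    }
}

impl<T> DerefMut for Tracked<T> {
    fn deref_mut(&mut self) -> &mut T {
        self.unique.set(self.unique.get() + 1);
        &mut self.value
    }
}

struct Wrapper {
    inner: Tracked<Vec<u32>>,
}

impl Wrapper {
    // Same target expression in both bodies: `self.inner`.
    fn len(&self) -> usize {
        self.inner.len() // goes through `Deref`
    }
    fn push(&mut self, v: u32) {
        self.inner.push(v) // goes through `DerefMut`
    }
}
```

A single shared, already type-checked copy of the target expression could not represent both adjustments at once.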

With this approach the block body will be parsed and name resolved once, even if the delegation
item is "empty" and produces no functions.

This approach also seems just difficult to implement, for no good reason, especially for things
like closures that are actually a part of the executable code, can capture local variables, etc.

Alternative 3: Prohibit everything with identity in target blocks

This is actually combinable with both Alternative 1 and Alternative 2, we can emit a hard error or
a feature gate for this case until a practical case requiring it arises.

However, we still need to make a choice between 1 and 2 because the implementation strategy depends
on it.
We also still need to decide how much checking is performed for target blocks or "empty"
delegations.

The choice

I suggest selecting Alternative 1.

It does make delegation sort of a macro feature to a larger degree, but makes life easier in all
other regards, and keeps the "desugaring is identical to manually written code" property.

bors added a commit to rust-lang-ci/rust that referenced this pull request May 15, 2024
delegation: Implement list delegation

```rust
reuse prefix::{a, b, c};
```

Using design described in rust-lang/rfcs#3530 (comment) (the lists are desugared at macro expansion time).
List delegations are expanded eagerly when encountered, similarly to `#[cfg]`s, and not enqueued for later resolution/expansion like regular macros or glob delegation (rust-lang#124135).

Part of rust-lang#118212.
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Jun 19, 2024
delegation: Implement glob delegation

Support delegating to all trait methods in one go.
Overriding globs with explicit definitions is also supported.

The implementation is generally based on the design from rust-lang/rfcs#3530 (comment), but unlike with list delegation in rust-lang#123413 we cannot expand glob delegation eagerly.
We have to enqueue it into the queue of unexpanded macros (most other macros are processed this way too), and then a glob delegation waits in that queue until its trait path is resolved, and enough code expands to generate the identifier list produced from the glob.

Glob delegation is only allowed in impls, and can only point to traits.
Supporting it in other places gives very little practical benefit, but significantly raises the implementation complexity.

Part of rust-lang#118212.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jun 19, 2024
Rollup merge of rust-lang#124135 - petrochenkov:deleglob, r=fmease

delegation: Implement glob delegation

Support delegating to all trait methods in one go.
Overriding globs with explicit definitions is also supported.

The implementation is generally based on the design from rust-lang/rfcs#3530 (comment), but unlike with list delegation in rust-lang#123413 we cannot expand glob delegation eagerly.
We have to enqueue it into the queue of unexpanded macros (most other macros are processed this way too), and then a glob delegation waits in that queue until its trait path is resolved, and enough code expands to generate the identifier list produced from the glob.

Glob delegation is only allowed in impls, and can only point to traits.
Supporting it in other places gives very little practical benefit, but significantly raises the implementation complexity.

Part of rust-lang#118212.
@petrochenkov
Contributor Author

petrochenkov commented Jun 28, 2024

Design question: what does the block around the target expression mean?

Suppose we have two delegation items

reuse just_expr { self.0 }

reuse multi_statements {
    let x = something;
    self.get(x)
}

What is the meaning of the curly braces around self.0: is it actually a block expression, or just syntax that looks like a block expression?
In the just_expr case it's likely the latter, and in the multi_statements case it's likely the former (but there are nuances).

Single expression

It's clear that if we implemented some different syntax, e.g. reuse PATH from EXPR, then the just_expr example would be written as

reuse just_expr from self.0;

and not as

reuse just_expr from { self.0 };

The block expression here would be not just noisy, but harmful in the common case.
For example, the body generated for this code could fail borrow checking (autoref/deref is assumed).

just_expr(&{ self.0 }) // ERROR cannot move out of `self`
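The difference can be reproduced in stable Rust with hand-written bodies. In this sketch `Inner`, its `name` field, and `Wrapper` are hypothetical; the free function `just_expr` plays the role of the delegation target, and the rejected variant is left commented out.

```rust
struct Inner {
    name: String,
}

// Hypothetical delegation target taking its first argument by reference.
fn just_expr(inner: &Inner) -> usize {
    inner.name.len()
}

struct Wrapper(Inner);

impl Wrapper {
    // Target expression used directly: `&self.0` borrows the field in place.
    fn just_expr(&self) -> usize {
        just_expr(&self.0)
    }

    // Target wrapped in a block expression: `{ self.0 }` is a value
    // expression, so it tries to move the non-`Copy` field out of `&self`
    // and is rejected by the borrow checker.
    // fn just_expr_block(&self) -> usize {
    //     just_expr(&{ self.0 })
    // }
}
```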

Multiple statements (alternative 1)

The second example with the alternative syntax would look like this.

reuse multi_statements from {
    let x = something;
    self.get(x)
}

And the generated body would look like this.

multi_statements({ let x = something; self.get(x)})

The block expression is clearly necessary here.

Multiple statements (alternative 2)

An alternative body desugaring for the multi-statement case could be

let x = something;
multi_statements(self.get(x))

This could potentially be better for borrow checking, but it needs to be proven with some practical cases.

The catch is that it no longer fits into an alternative expression-only syntax like reuse PATH from EXPR, because the block here is clearly not a block expression.
If we need to actually delegate to a block expression we will need "double blocking" reuse foo { { bar } }.
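One contrived case where the two desugarings differ for borrow checking is sketched below, with both bodies written by hand. The names `Wrapper`, `name`, `alt1`, and `alt2` are hypothetical, and whether cases like this come up in practice is exactly the open question; the sketch only shows that a local borrowed by the final expression outlives the call in one desugaring but not in the other.

```rust
struct Wrapper {
    name: String,
}

// Hypothetical delegation target: counts the uppercase characters.
fn multi_statements(s: &str) -> usize {
    s.chars().filter(|c| c.is_uppercase()).count()
}

impl Wrapper {
    // "Alternative 2" desugaring: the statements are hoisted in front of the
    // call, so `upper` lives until the end of the function body.
    fn alt2(&self) -> usize {
        let upper = self.name.to_uppercase();
        multi_statements(upper.as_str())
    }

    // "Alternative 1" desugaring: the whole block is the argument. `upper`
    // is dropped at the end of the block while still borrowed, so this
    // version does not compile.
    // fn alt1(&self) -> usize {
    //     multi_statements({
    //         let upper = self.name.to_uppercase();
    //         upper.as_str()
    //     })
    // }
}
```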

The choice

Not clear yet.
The current implementation uses the "alternative 2" desugaring, but strips the block if it contains only a single expression, which is a bit hacky.

@A1-Triard

A1-Triard commented Sep 20, 2024

We need a way to delegate everything that doesn't have a default implementation. Right now, we have a wildcard delegation syntax that delegates every function, whether it has a default implementation or not. For one of the main purposes of delegation — namely, to simulate OOP — this behavior is not very useful.
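A sketch of why this matters, with the forwarding written by hand. The trait `Shape` and the types `Base`, `ForwardAll`, and `ForwardRequired` are hypothetical, and this is one reading of the request: when the provided method is forwarded too, it runs against the inner value and never sees the wrapper's override.

```rust
trait Shape {
    // Required methods.
    fn name(&self) -> String;
    fn sides(&self) -> u32;

    // Provided method with a default implementation.
    fn describe(&self) -> String {
        format!("{} with {} sides", self.name(), self.sides())
    }
}

struct Base;

impl Shape for Base {
    fn name(&self) -> String {
        String::from("base")
    }
    fn sides(&self) -> u32 {
        4
    }
}

// Forwards everything except the overridden `name`, the way a glob
// delegation with one explicit override would.
struct ForwardAll(Base);

impl Shape for ForwardAll {
    fn name(&self) -> String {
        String::from("derived")
    }
    fn sides(&self) -> u32 {
        self.0.sides()
    }
    fn describe(&self) -> String {
        self.0.describe() // calls `Base::name`, the override is bypassed
    }
}

// Forwards only required methods and keeps the default `describe`,
// which then calls the wrapper's own `name`.
struct ForwardRequired(Base);

impl Shape for ForwardRequired {
    fn name(&self) -> String {
        String::from("derived")
    }
    fn sides(&self) -> u32 {
        self.0.sides()
    }
}
```

With these definitions, `ForwardAll(Base).describe()` reports the base name, while `ForwardRequired(Base).describe()` reports the overriding one.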
