[RFC] externally implementable functions #3632

Open
wants to merge 17 commits into
base: master
184 changes: 184 additions & 0 deletions text/0000-externally-implementable-functions.md
@@ -0,0 +1,184 @@
- Feature Name: `extern_impl_fn`
- Start Date: 2024-05-10
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary

A mechanism for defining a function whose implementation can be defined (or overridden) in another crate.

Example 1:

```rust
// core::panic:

extern impl fn panic_handler(_: &PanicInfo) -> !;

// user:

impl fn core::panic::panic_handler(panic_info: &PanicInfo) -> ! {
    eprintln!("panic: {panic_info:?}");
    loop {}
}
```

Example 2:

```rust
// log crate:

extern impl fn logger() -> Logger {
    Logger::default()
}

// user:

impl fn log::logger() -> Logger {
    Logger::to_stdout().with_colors()
}
```

# Motivation

We have several items in the standard library that are overridable/definable by the user crate.
For example, the (no_std) `panic_handler`, the global allocator for `alloc`, and so on.

Each of those is a special lang item with its own special handling.
Having a general mechanism simplifies the language and makes this functionality available for other crates, and potentially for more use cases in core/alloc/std.

# Explanation

A function can be defined as "externally implementable" using `extern impl` as follows:

```rust
// In crate `x`:

// Without a body:
extern impl fn a();

// With a body:
extern impl fn b() {
    println!("default impl");
}
```

Another crate can then provide (or override) the implementation of these functions using `impl fn` syntax (using their path) as follows:

```rust
// In another crate:

impl fn x::a() {
    println!("my implementation of a");
}

impl fn x::b() {
    println!("my implementation of b");
}
```
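For comparison, a body-less function like `x::a` can be roughly approximated in today's Rust with an `extern "Rust"` declaration that the linker resolves against a `#[no_mangle]` definition. The sketch below is not part of this RFC: it puts both sides in one file for brevity, the symbol name `x_a_impl` and the `&'static str` return type are illustrative assumptions, and it uses the `unsafe extern` / `#[unsafe(no_mangle)]` spellings available since Rust 1.82.

```rust
// "Declaring" side (stands in for crate `x`).
mod x {
    // Declaration without a body. The linker must find a symbol with this
    // name somewhere in the final binary. (`x_a_impl` is a hypothetical name.)
    unsafe extern "Rust" {
        fn x_a_impl() -> &'static str;
    }

    // Safe wrapper that callers use.
    pub fn a() -> &'static str {
        // SAFETY: sound only if the definition has exactly this signature.
        // Nothing checks that here, unlike the proposed `impl fn`.
        unsafe { x_a_impl() }
    }
}

// "Implementing" side (stands in for another crate).
#[unsafe(no_mangle)]
pub fn x_a_impl() -> &'static str {
    "my implementation of a"
}
```

Calling `x::a()` then returns the string supplied by the definition. A missing or duplicated definition only surfaces as a linker error, and a mismatched signature is not diagnosed at all, which is the gap the proposed syntax is meant to close.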

# Details

## Signature

It is an error to have a different signature for the `impl fn` item.

(Whether `#[track_caller]` is used or not is considered part of the signature here.)

## One impl

It is an error to have no `impl fn` item (in any crate) for an `extern impl fn` item without a body.

It is an error to have multiple `impl fn` items (across all crates) for the same `extern impl fn` item.

Note: This means that adding or removing an `impl fn` item is a semver-incompatible change.

## Visibility

`extern impl fn` items can have a visibility specifier (like `pub`), which determines who can *call* the function (or create pointers to it, etc.).

*Implementing* the function can be done by any crate that can name the item.
Member

Could you clarify whether this is intended to also tie into visibility? For instance, a pub(crate) extern impl fn can only have an implementation provided by the crate, right?

Member Author

@m-ou-se, May 10, 2024

The idea is that if you do:

```rust
pub mod a {
    pub(crate) extern impl fn x();
}
```

Then, other crates can provide an implementation (because `a` is pub, allowing them to name `a::x`), but they cannot call it (because the function is not public itself).

Member

@m-ou-se That seems potentially confusing. Is there some way to set the visibility of being able to implement it? Is there value in being able to do so?

Member Author

No, because the point of this feature is allowing other crates to implement it. So unless you want to propose a kind of visibility that includes some crates but not others, that just implies full public.

Member

So if the extern impl fn is in a private module, it is impossible to implement?

Member

@m-ou-se

> The idea is that if you do:
>
>     pub mod a {
>         pub(crate) extern impl fn x();
>     }
>
> Then, other crates can provide an implementation (because `a` is pub, allowing them to name `a::x`), but they cannot call it (because the function is not public itself).

Yeah this is very confusing. Could you make it respect the normal privacy rule and reuse e.g. #3323 to explicitly deny call/ref permission from dependencies?

```rust
pub mod a {
    pub restrict_use(crate) extern impl fn x();
}
```

Member

> So if the extern impl fn is in a private module, it is impossible to implement?

It could be re-exported, but that re-export then presumably allows both calling and implementing the item. The magic behavior where these two permissions are different can only be obtained via the original definition of the item, IIUC.

(The `impl fn` item will need to name the item to implement, which could be directly or through an alias/re-export.)

# Implementation

The implementation will be based on the same mechanisms as used today for the `panic_handler` and `#[global_allocator]` features.
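For reference, this is roughly what one of those existing mechanisms, `#[global_allocator]`, looks like from the user's side today. The sketch uses only stable standard library APIs; `CountingAlloc`, `ALLOCATIONS`, and `allocation_count` are illustrative names, not part of this RFC.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Number of allocations observed so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

// Counts every allocation, then defers to the system allocator.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds the `GlobalAlloc::alloc` contract.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this layout.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// The special-cased "global implementation": only one such static may
// exist across the whole crate graph.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many allocations have gone through `CountingAlloc`.
pub fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

After any heap allocation (for example `Vec::<u8>::with_capacity(128)`), `allocation_count()` is larger than before, showing that the single registered implementation is used program-wide.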

The compiler of the root crate will find the implementation of all externally implementable functions and give an error
if more than one implementation is found for any of them.
If none are found, the result is either an error, or—if the `extern impl fn` has a default body—an implementation
is generated that calls that default body.
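The resolution rule described above can be modeled as a small pure function. This is only a sketch of the decision logic, not compiler code: the `Resolution` and `ResolveError` types, the `resolve` function, and the use of crate names to identify implementations are all assumptions made for illustration.

```rust
/// Outcome of resolving one externally implementable function.
#[derive(Debug, PartialEq, Eq)]
pub enum Resolution {
    /// Exactly one `impl fn` was found (identified here by crate name).
    Provided(String),
    /// No `impl fn` was found, so the default body is used.
    DefaultBody,
}

/// Errors the root crate's compilation would report.
#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    /// No `impl fn` and no default body.
    Missing,
    /// More than one `impl fn` across all crates.
    Multiple(Vec<String>),
}

/// Applies the "one impl" rule to the list of crates providing an `impl fn`.
pub fn resolve(
    impl_crates: &[&str],
    has_default_body: bool,
) -> Result<Resolution, ResolveError> {
    match impl_crates {
        [] if has_default_body => Ok(Resolution::DefaultBody),
        [] => Err(ResolveError::Missing),
        [only] => Ok(Resolution::Provided(only.to_string())),
        many => Err(ResolveError::Multiple(
            many.iter().map(|c| c.to_string()).collect(),
        )),
    }
}
```

Note that in this model a default body does not rescue the multiple-implementation case: two `impl fn` items are an error whether or not a default exists.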

# Drawbacks

- It encourages globally defined behaviour.
  - Counterargument: We are already doing this anyway, both inside the standard library (e.g. panic_handler, allocator)
    and outside (e.g. global logger). This just makes it much easier (and safer) to get right.
- This will invite the addition of many hooks to the standard library to modify existing behavior.
  While we should consider such possibilities, this RFC does not propose that every piece of standard library behavior should be replaceable.
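To make the "global logger" counterargument concrete, here is a minimal sketch of the run-time registration pattern that crates typically use today for a default that can be overridden. It is written against `std::sync::OnceLock` only; the `Logger` struct, `set_logger`, `logger`, and `default_logger` are hypothetical names loosely modeled on Example 2, not the API of any real crate.

```rust
use std::sync::OnceLock;

/// Hypothetical logger configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Logger {
    pub colors: bool,
}

// Holds the override, if one has been registered at run time.
static LOGGER_OVERRIDE: OnceLock<fn() -> Logger> = OnceLock::new();

// Plays the role of the default body of `extern impl fn logger()`.
fn default_logger() -> Logger {
    Logger { colors: false }
}

/// Registers an override. Fails (returning the rejected function) if an
/// override was already registered.
pub fn set_logger(f: fn() -> Logger) -> Result<(), fn() -> Logger> {
    LOGGER_OVERRIDE.set(f)
}

/// Returns the overriding logger if one is registered, else the default.
pub fn logger() -> Logger {
    let f = LOGGER_OVERRIDE
        .get()
        .copied()
        .unwrap_or(default_logger as fn() -> Logger);
    f()
}
```

With this pattern, a second registration is only detected at run time, and any call made before registration silently sees the default. Resolving the implementation at build time, as this RFC proposes, would avoid both hazards.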

# Rationale and alternatives

Member

Should we allow grouping multiple functions together like global_allocator in this RFC? Or should that be left as future potential improvement?

Member

you could work around that with a TAIT:

```rust
pub trait MyFunctions {
    fn fn1() -> String;
    fn fn2(a: String, b: u32);
}

pub type MyFunctionsImpl = impl MyFunctions;

fn f(v: Infallible) -> MyFunctionsImpl {
    my_functions(v)
}

pub extern impl fn my_functions(v: Infallible) -> impl MyFunctions;

pub fn fn3() -> String {
    MyFunctionsImpl::fn1()
}
```

Member Author

I think that'd be part of a potential future (more complicated) RFC, such as #2492

Member

@RalfJung, May 10, 2024

Possibly `#[global_allocator]` could at least use the same internal mechanism, even if it's not visible to the user?

```rust
#[global_allocator]
static ALLOC: MyAlloc = ...;
```

could expand to something like

```rust
static ALLOC: MyAlloc = ...;

impl fn alloc::alloc::alloc(layout: Layout) -> *mut u8 {
    ALLOC.alloc(layout)
}
impl fn alloc::alloc::dealloc(ptr: *mut u8, layout: Layout) {
    ALLOC.dealloc(ptr, layout)
}
// ...
```

Then codegen and Miri would only have to support one such mechanism. :)

## Syntax

The syntax re-uses existing keywords. Alternatively, we could:
- Use the `override` reserved keyword.
- Add a new (contextual) keyword (e.g. `existential fn`).
- Use an attribute (e.g. `#[extern_impl]`) instead.
Contributor

should the following alternative be mentioned / discussed:

- multiple impls are allowed
- the root crate must import the impl they want
- the normal-default impl is imported via the prelude

also just had the thought: does `use crateA::different_name as panic_handler` work similar to how (I believe) it works for `main`?


## Functions or statics

This RFC only proposes externally implementable *functions*.

An alternative is to only provide externally definable *statics* instead.

That would be equivalent in power: one can store a function pointer in a static, and one can return a reference to a static from a function ([RFC 3635](https://github.com/rust-lang/rfcs/pull/3635)).

(Another alternative, of course, is to provide both. See future possibilities.)
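The equivalence claimed above can be seen in ordinary Rust today. The sketch below shows both directions with illustrative names (`answer`, `ANSWER_FN`, `Config`, `CONFIG`, and `config` are assumptions for this example only).

```rust
// Function -> static: a static can store a function pointer, so a
// definable static is enough to express a definable function.
fn answer() -> u32 {
    42
}

pub static ANSWER_FN: fn() -> u32 = answer;

// Static -> function: a function can return a reference to a static, so a
// definable function is enough to expose a definable static.
pub struct Config {
    pub verbose: bool,
}

static CONFIG: Config = Config { verbose: true };

pub fn config() -> &'static Config {
    &CONFIG
}
```

Here `ANSWER_FN()` calls through the stored pointer and yields `42`, while `config().verbose` reads the static through the accessor function.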

## Visibility

There are two kinds of visibilities to be considered for externally implementable functions:
who can *implement* the function, and who can *call* the function.

Not allowing the function to be implemented by other crates nullifies the functionality, as the entire point of externally implementable functions is that they can be implemented in another crate. This visibility is therefore always (implicitly) "pub".

Allowing a more restricted (that is, not `pub`) visibility for *calling* the function can be useful. For example, today's `#[panic_handler]` can be defined by any crate, but cannot be called directly. (It can only be called indirectly through `panic!()` and friends.)

A downside is that it is not possible to allow this "only implementable but not publicly callable" visibility through an alias.

An alternative could be to use the same visibility for both implementing and calling, which would simply mean that the function (or an alias to it) will always have to be `pub`.

## Configuration

An `extern impl fn` may have `#[cfg(...)]` attributes applied to it as usual. For instance, a crate may only provide an `extern impl fn` with a given feature flag enabled, and might then use the same feature flag to conditionally provide other functions that depend on that `extern impl fn`. This is a useful pattern for crates that don't want to provide a default implementation but want to avoid producing a compilation error unless the function is needed.
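A rough illustration of this pattern in today's Rust, using an `extern "Rust"` declaration as a stand-in for the proposed `extern impl fn`. The feature name `hook` and the items `report_impl`, `report`, and `format_message` are hypothetical, and the `unsafe extern` spelling assumes Rust 1.82 or later.

```rust
// Only declared when the (hypothetical) `hook` feature is enabled, so a
// build without the feature never requires anyone to provide the symbol.
#[cfg(feature = "hook")]
unsafe extern "Rust" {
    fn report_impl(msg: &str) -> usize;
}

// Depends on the hook, so it is gated behind the same feature.
#[cfg(feature = "hook")]
pub fn report(msg: &str) -> usize {
    // SAFETY: sound only if the provided definition matches this signature.
    unsafe { report_impl(msg) }
}

// Does not depend on the hook, so it is always available.
pub fn format_message(code: u32) -> String {
    format!("error code {code}")
}
```

Without the feature, only `format_message` exists and the crate builds with no implementation present. Enabling the feature adds `report` and, with it, the requirement that some crate supplies `report_impl`.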

# Prior art

[RFC 2492 "Existential types with external definition"](https://github.com/rust-lang/rfcs/pull/2492)
has been proposed before, which basically does this for *types*. Doing this for functions (as a start) saves a lot of complexity.

# Unresolved questions

- Should we provide a mechanism to set an `extern impl fn` using `=` from an existing `fn` value, rather than writing a body? For instance, `impl fn x::y = a::b;`
- Should we allow some form of subtyping, similarly to how traits allow trait impls to do subtyping?
- What should the syntax be once we stabilize this?
- How should this work in dynamic libraries?
- Should there be a way to specify that implementing the function is unsafe, separately from whether the function itself is unsafe?
- Should not having an implementation be an error when the function is never called?
- If we do end up designing and providing an `extern impl Trait` feature in addition to `extern impl fn`, should we *only* provide `extern impl Trait`, or is there value in still providing `extern impl fn` as well? This RFC proposes that we should still have `extern impl fn`, for the simpler case, rather than forcing such functions to be wrapped in traits.
- An `extern impl fn` that's marked as `pub(crate)` but is nonetheless pub to *implement* could surprise people. Is there some way we can make this less surprising? Should we require that all `extern impl fn` have `pub` visibility?

# Future possibilities

- Adding a syntax to specify an existing function as the impl. E.g. `impl fn core::panic::panic_handler = my_panic_handler;`
- Doing this for `static` items too. (Perhaps all items that can appear in an `extern "Rust" { … }` block.)
- Using this for existing overridable global behavior in the standard library, like the panic handler, global allocator, etc.

Not a problem for this RFC since it's just a future possibility, but I wanted to write this up somewhere before I forget it again since it's non-obvious and I haven't seen it spelled out before: the "only one impl allowed, except that the declaration can include a default body" semantics work fine for the panic handler but not for the global allocator. The panic handler is declared and used in core; std supplies a definition that can't be overridden once it's pulled in. In contrast, the global allocator is declared and used in alloc, but std effectively adds a "default" implementation that can be overridden.

- We could add a mechanism for arbitrating between multiple provided implementations. For instance, if a crate A depended on B and C, and both B and C provide implementations of an `extern impl fn`, rather than an error, A could provide its own implementation overriding both.
- Using this mechanism in the standard library to make more parts overridable. For example:
  - Allowing custom implementations of `panic_out_of_bounds` and `panic_overflowing_add`, etc.
    (The Rust for Linux project would make use of this.)
  - Allowing overriding `write_to_stdout` and `write_to_stderr`.
    (This enables custom testing frameworks to capture output. It is also extremely useful on targets like wasm.)
- This could possibly be extended to groups of functions in the form of a `trait` that can be globally implemented.
  (E.g. `extern impl AsyncRuntime`, to say that there must be a global implementation of that trait.)
- Given an `extern impl Trait` feature, could we provide a compatibility mechanism so that a crate providing an `extern impl fn` can migrate to an `extern impl Trait` in a compatible way, such that crates doing `impl fn` will still be compatible with the new `extern impl Trait`?
Member

One approach to compatibility would be to start with this syntax already, and just artificially restrict the trait used to only have Self-less associated functions. I believe that is isomorphic to this proposal of defining global functions, and it could then be extended to support more trait features in the future (eventually I assume any object-safe trait could be used).