Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] externally implementable functions #3632

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented May 10, 2024

@m-ou-se m-ou-se added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 10, 2024
@m-ou-se m-ou-se changed the title Add RFC for externally implementable functions. [RFC] externally implementable functions. May 10, 2024
@m-ou-se m-ou-se changed the title [RFC] externally implementable functions. [RFC] externally implementable functions May 10, 2024
@jdonszelmann
Copy link

jdonszelmann commented May 10, 2024

How do you deal with the following case?

crate A defines an extern function f.

crate B imports A and implements f.
crate C imports A and implements f.

crate D imports B and C but cannot because both of them implement f and they conflict.

I agree conflicting implementations are a compiler error, but if libraries do it then you make some libraries mutually exclusive.

Edit: as I noted below, you might make it so B and C can both be imported as long as D also implements f

and outside (e.g. global logger). This just makes it much easier (and safer) to get right.

# Rationale and alternatives

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we allow grouping multiple functions together like global_allocator in this RFC? Or should that be left as future potential improvement?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could work around that with a TAIT:

pub trait MyFunctions {
    fn fn1() -> String;
    fn fn2(a: String, b: u32);
}

pub type MyFunctionsImpl = impl MyFunctions;

fn f(v: Infallible) -> MyFunctionsImpl {
    my_functions(v)
}

pub extern impl fn my_functions(v: Infallible) -> impl MyFunctions;

pub fn fn3() -> String {
    MyFunctionsImpl::fn1()
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that'd be part of a potential future (more compplicated) RFC, such as #2492

Copy link
Member

@RalfJung RalfJung May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly global_alloc could at least use the same internal mechanism, even if it's not visible to the user?

#[global_alloc]
static ALLOC: MyAlloc = ...;

could expand to something like

static ALLOC: MyAlloc = ...;

impl fn alloc::alloc::alloc(layout: Layout) -> *mut u8 {
  ALLOC.alloc(layout)
}
impl fn alloc::alloc::dealloc(layout: Layout) -> *mut u8 {
  ALLOC.dealloc(layout)
}
// ...

Then codegen and Miri would only have to support one such mechanism. :)

@m-ou-se
Copy link
Member Author

m-ou-se commented May 10, 2024

I agree conflicting implementations are a compiler error, but if libraries do it then you make some libraries mutually exclusive.

Yes, those would be mutually exclusive. Exactly how today panic-halt, panic-semihosting, and panic-reset are all mutually exclusive.

@jdonszelmann
Copy link

jdonszelmann commented May 10, 2024

Can libraries define a new default implementation of an extern function? Something like

// crate A:
extern impl fn logger() -> Logger {
    Logger::to_stdout().with_colors()
}

// crate B imports A:
extern impl fn a::logger() -> Logger {
    Logger::to_stderr().with_colors()
}

// crate C imports B:
impl fn b::logger() -> Logger {
    Logger::to_file("log.txt")
}

@m-ou-se
Copy link
Member Author

m-ou-se commented May 10, 2024

Can libraries define a new default implementation of an extern function?

No, I don't think we should do that. It'd be hard to define the priority of the multiple defaults.

@juntyr
Copy link
Contributor

juntyr commented May 10, 2024

How do you deal with the following case?

crate A defines an extern function f.

crate B imports A and implements f. crate C imports A and implements f.

crate D imports B and C but cannot because both of them implement f and they conflict.

I agree conflicting implementations are a compiler error, but if libraries do it then you make some libraries mutually exclusive.

Edit: as I noted below, you might make it so B and C can both be imported as long as D also implements f

On the one hand, I would hope that crates that implement externally implementable functions would be minimal, i.e. only have this implementation and nothing else, so that including both in the crate graph would always be a clear and understandable error (and not a mistake because you'd want functionality from both).

On the other hand, the example of logging shows that there are clear cases where you might want to combine different loggers into some super-logger that is specific for your use case. Logging implementation crates could then either ship a companion crate that only implements the function, or could have a non-default feature to enable the implementation.

@jdonszelmann
Copy link

Simply feature gating these implementations solves a lot of the problems I think.

Comment on lines 19 to 21
impl fn core::panic::panic_handler(_: &PanicInfo) -> ! {
loop {}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this particular syntax very much. It is too close to existing impl $t:ty { } syntax when $t is an fn type.

#![feature(rustc_attrs)]
impl fn(_: &core::panic::PanicInfo) -> ! {
    #[rustc_allow_incoherent_impl]
    pub fn what() {}
}

Granted, there is no ambiguity at the moment for the actual syntax proposed here since an fn type can't specify a name (right?) plus you aren't really allowed to impl on an fn type anyway unless you are the standard library.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that users aren't allowed to have an impl block for fn types anyway, and that the syntax is unambiguous regardless, I'm not too worried about this.

But I also don't care that much about the syntax. We can consider other syntaxes before stabilization of course.

@m-ou-se m-ou-se added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label May 10, 2024
@Diggsey
Copy link
Contributor

Diggsey commented May 10, 2024

If none are found, the result is either an error, or, if the extern impl fn has a default body, an implementation is generated that calls that default body.

I think the RFC needs to clarify how this works with different crate types. Presumably this check is not required when building a library crate (or else the feature would be useless). What about when building a dylib or cdylib?

Also, I think it would be good if the root (binary) crate could resolve the issue with multiple implementations by providing its own implementation.

@m-ou-se
Copy link
Member Author

m-ou-se commented May 10, 2024

I think it would be good if the root (binary) crate could resolve the issue with multiple implementations by providing its own implementation.

That's a possible extension. But that's not what we currently have for the panic handler or global allocator, right?

- The syntax re-uses existing keywords. Alternatively, we could:
- Use the `override` reserved keyword.
- Add a new (contextual) keyword (e.g. `existential fn`).
- Use an attribute (e.g. `#[extern_impl]`) instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the following alternative be mentioned / discussed:

  • multiple impl's are allowed
  • the root crate must import the impl they want
  • the normal-default impl is imported via the prelude

also just had the thought: does use crateA::different_name as panic_handler work similar to how (I believe) it works for main?

@burdges
Copy link

burdges commented May 10, 2024

Very nice RFC.

It's unlikely the root binary crate should provide these or micro-manage them too much.

It's most likely library crate features control these, which enables many options. If you do want micro-managment by the root binary crate then these could live in micro-crates too.

I'd think pub const and pub static would be the most likely future extensions here. pub fn groups could probably wait for more existential types discussions ala #2492

@m-ou-se
Copy link
Member Author

m-ou-se commented May 15, 2024

Process question for @rust-lang/lang: should this be an experimental feature gate instead of an RFC?

@joshtriplett
Copy link
Member

@m-ou-se Definitely not "instead of", but possibly "in addition to". This feature absolutely needs an RFC in order to be a future stable feature, but it could also be an experimental feature gate in order to implement it without waiting for the RFC to be merged.

That said, I really hope we accept and merge this soon, and I'm hoping we get to it in the next lang meeting. I would like to prioritize it.

@petrochenkov
Copy link
Contributor

At high level, I would expect something like this to be implemented using traits.

Basically we have

  • An interface defined in one place - possibly including multiple functions like with allocators, possibly having default implementaitons.
  • An implementation of that interface defined in a different place.

That screams "use traits for this", because all kinds of interfaces and implementations are always done through traits in Rust.
But the RFC defines something that is similar to trait impls, has its own set of surface rules similar to trait impls, but is not trait impls.

@m-ou-se
Copy link
Member Author

m-ou-se commented May 17, 2024

That screams "use traits for this"

Yeah, I agree that would be a good fit, but then we basically end up with #2492, which was not accepted at the time because it involved too much complexity.

I'd be happy if we could pick that route and make that all work. This RFC is just an attempt to do something much more basic to start with, since a much more complicated change like #2492 seems unlikely to work out any time soon.

@burdges
Copy link

burdges commented May 17, 2024

As written, linkers can do this resolution, no? Isn't that enough reason this should still exist, even if some non-linker-friendly trait based scheme emerges in 5 years or whatever?

@jdonszelmann
Copy link

Just like global registration I don't think you want to implement this feature through the linker, at the very least initially. Personally, I'd be worried about the errors that are generated and which are hard to bring to the same standard as other rust compiler errors. I also like this comment about it.

@bjorn3
Copy link
Member

bjorn3 commented May 18, 2024

I'd be worried about the errors that are generated and which are hard to bring to the same standard as other rust compiler errors.

My suggestion is to have the check for a downstream crate implementing it be in rustc, but the doing the actual tying up with the linker, like is currently done for #[panic_handler] and #[global_allocator] (for the latter the default is handled using a compiler generated shim, but when you actually use #[global_allocator] no shim is used at all.). In the end some form of tying up by the linker is necessary anyway unless you want to have a global constructor which at runtime sets a static to point to the external implementation of the function.

Co-authored-by: Josh Triplett <josh@joshtriplett.org>
@joshtriplett
Copy link
Member

@rfcbot resolve statically-unused-extern-impl-fn

Resolved by the use of cfg.

@petrochenkov
Copy link
Contributor

petrochenkov commented May 22, 2024

@m-ou-se

That screams "use traits for this"

Yeah, I agree that would be a good fit, but then we basically end up with #2492, which was not accepted at the time because it involved too much complexity.

I'd be happy if we could pick that route and make that all work. This RFC is just an attempt to do something much more basic to start with, since a much more complicated change like #2492 seems unlikely to work out any time soon.

I don't see why traits/impls used for this feature cannot be restricted to prohibit, for example, associated types or generics, if they complicate the minimal design.

@joshtriplett
Copy link
Member

I'd be happy to see an implementation using traits that is initially restricted in what can appear in the trait, if that makes it simple enough to ship.

I do hope eventually we can ship a version that allows, for instance, associated constants, and that we can use those associated constants in generic bounds on functions in the standard library. (For instance, conditionally providing impl From<u64> for usize.) However, I know that'd be more complex, and I think we should ship something without that support first if we can do that more easily.

@m-ou-se
Copy link
Member Author

m-ou-se commented May 22, 2024

I don't see why traits/impls used for this feature cannot be restricted to prohibit, for example, associated types or generics, if they complicate the minimal design.

So you're proposing basically #2492 with some restrictions to make the implementation much simpler?

If you have time to write down a more concrete proposal (not necessariliy an RFC, but some clear examples or something), that would be valuable.

@petrochenkov
Copy link
Contributor

petrochenkov commented May 22, 2024

I don't have a concrete proposal, just a general suggestion to resyntax this

// log crate:

extern impl fn logger() -> Logger {
    Logger::default()
}

// user:

impl fn log::logger() -> Logger {
    Logger::to_stdout().with_colors()
}

into something like

// log crate:

#[some_extern_impl_attribute]
trait LoggerInterface {
  fn create_logger() -> Logger {
    Logger::default() // either default body for the default
  }
}

// or a separate impl for the default
struct DefaultLoggerCreator;
#[maybe_some_other_extern_impl_attribute_if_really_necessary]
impl LoggerInterface for DefaultLoggerCreator {
  fn create_logger() -> Logger {
      Logger::default()
  }
}

// user:

struct MyLoggerCreator;
#[maybe_some_third_extern_impl_attribute_if_really_necessary]
impl log::LoggerInterface for MyLoggerCreator {
  fn create_logger() -> Logger {
      Logger::to_stdout().with_colors()
  }
}

EDIT: Not just resyntax, this should have all the usual trait semantics (safety, visibility, signature subtyping, etc) until we reach the codegen stage.

@joshtriplett
Copy link
Member

joshtriplett commented May 22, 2024

(Reiterating that all of this is speculation on a different proposal, not a blocker on this proposal.) I would generally expect that a trait-based proposal should separate the concept of implementing a trait from the concept of setting a specific implementer of that trait as a global. Or, in other words, something more like:

/// log

trait LoggerInterface {
  fn create_logger() -> Logger;
}

impl LoggerInterface for DefaultLoggerCreator {
    fn create_logger() -> Logger { ... }
}

pub extern type LoggerCreator: LoggerInterface = DefaultLoggerCreator;

// user code

impl log::LoggerInterface for MyLoggerCreator {
    fn create_logger() -> Logger { ... }
}

extern type log::LoggerCreator = MyLoggerCreator;

(With appropriate restrictions on what trait LoggerInterface can contain to make this implementable.)

@traviscross
Copy link
Contributor

We discussed this in the lang meeting today. We developed a consensus that this is addressing an important problem and one that we would like to solve.

In the meeting, there were various alterations and alternatives put forward, including by @Amanieu and @tmandry. We also wanted to cross-check this against the recenty-accepted RFC:

That RFC adopts a conceptual separation between the unsafety of declaring an extern block (and verifying that the signatures within are correct) and the unsafety of calling (or otherwise using) an extern item that may have other invariants that may need to be upheld. We just need to check that whatever we do here is consistent with that conceptually (maybe it already is).

While we allow some time for these things, in the interest of not blocking experimentation, we've decided to approve this to go forward as a lang experiment under our process for that. @joshtriplett has offered to be the liaison.

We've opened a tracking issue for this experiment here:

@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated

Since the next step here is to discuss the full set of options, let's nominate the tracking issue in place of this RFC.

@rustbot rustbot removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label May 22, 2024
@Amanieu
Copy link
Member

Amanieu commented May 22, 2024

I wrote up my alternative proposal in #3645, which is heavily based on this one. It keeps the basic idea of just having functions that defined downstream and resolved by the linker, but changes the syntax to look more like traits. This works better for things like the global allocator which consists of multiple functions and allows safety to be defined separately on the trait (unsafe to implement) and its functions (unsafe to call).

@tmandry
Copy link
Member

tmandry commented May 23, 2024

Move extern impls to blocks

My proposal is to move both the declarations and implementations into blocks. That would let us differentiate between functions that are unsafe to implement and functions that are unsafe to call. It would look something like this:1

// alloc::global:

extern unsafe impl {
    fn allocate(layout: Layout) -> Result<NonNull<[u8]>, AllocError>;
    unsafe fn deallocate(ptr: NonNull<u8>, layout: Layout);
}

// user:

// Note the use of a path here – already allowed, except it's a module not a type!
// This means we won't have to add a new case to the syntax, and keeps things nicely grouped together.
unsafe impl alloc::global {
    fn allocate(layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        todo!()
    }
    
    unsafe fn deallocate(ptr: NonNull<u8>, layout: Layout) {
        todo!()
    }
}

From here we can, optionally, do the following:

Unify declarations with extern "Rust" {} blocks

We can make extern "Rust" {} work exactly like extern impl above. This includes preserving namespacing of function names, unlike the "C" ABI.

If we do this we should transition the default ABI inside extern {} blocks to be "Rust". This can be done over an edition.

Also as a delta to #3484, extern "Rust" {} blocks would not need unsafe. In fact, it wouldn't make sense to mark them as such, because the compiler checks the signatures for you. We would not want to use unsafe extern to mean "unsafe to override"; instead, we should keep the extern unsafe impl {} syntax.

The first proposal is forward-compatible with this one.

Footnotes

  1. As I prepare to post this I see that it's quite close to a proposal for extern mod in the other thread, though it isn't clear where to hang the unsafe for impls in that proposal.

@Jules-Bertholet
Copy link
Contributor

Unify declarations with extern "Rust" {} blocks

No, just because you happen to be using the Rust ABI for some extern declarations, you should not thereby be forced to adopt the new compiler checks? (For example, maybe you are linking to an object file that was compiled with the same version of rustc, but is otherwise completely opaque to you.

Also, this "checked extern" feature would IIUC be able to support fns with generic type and const parameters, which is not something today's extern blocks can support; which suggests to my mind that these are distinct and non-unifiable features, despite their similarity.

@tmandry
Copy link
Member

tmandry commented May 23, 2024

For the record I don't think extern "Rust" {} is useful for anything today (possibly due to a bug); it seems to rename the function according to the C ABI, but a matching definition does not. All my attempts to use it resulted in linker errors and the only results on Github are uses of the cxx crate.

For example, maybe you are linking to an object file that was compiled with the same version of rustc, but is otherwise completely opaque to you.

Given what I said above, that's a new feature. I would argue that the default Rust ABI should check externs and require an rmeta file at minimum. But we could add unsafe "unchecked" Rust externs in the future that work across, e.g., staticlib/cdylib boundaries.

Also, this "checked extern" feature would IIUC be able to support fns with generic type and const parameters, which is not something today's extern blocks can support; which suggests to my mind that these are distinct and non-unifiable features, despite their similarity.

I can see your argument here. Ultimately it is just a question of how much we decide to semantically group the features, despite various differences in their capabilities. It would be helpful to imagine how we might represent stable ABI boundaries and see how these would fit in (cc @Amanieu).

As I said in my comment above though, it would be fine to defer the question of unifying with extern "Rust" until later, since the first part of what I propose is forward compatible with it.

@Jules-Bertholet
Copy link
Contributor

For the record I don't think extern "Rust" {} is useful for anything today (possibly due to a bug); it seems to rename the function according to the C ABI, but a matching definition does not.

Even with #[no_mangle]? That sounds like a bug

@programmerjake
Copy link
Member

programmerjake commented May 23, 2024

Even with #[no_mangle]? That sounds like a bug

I think he's complaining that #[no_mangle] is required rather than that #[no_mangle] doesn't work...

@lolbinarycat
Copy link
Contributor

to conditionally provide make other functions

i assume this is a typo?

also, if i'm not mistaken, this same functionality can be provided with no_mangle and extern "Rust", cam it not?

@programmerjake
Copy link
Member

also, if i'm not mistaken, this same functionality can be provided with no_mangle and extern "Rust", cam it not?

if there's a default body: only if your platform happens to support weak symbols, and even then it's unsafe.


- Adding a syntax to specify an existing function as the impl. E.g. `impl core::panic_handler = my_panic_handler;`
- Doing this for `static` items too. (Perhaps all items that can appear in an `extern "Rust" { … }` block.)
- Using this for existing overridable global behavior in the standard library, like the panic handler, global allocator, etc.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a problem for this RFC since it's just a future possibility, but I wanted to write this up somewhere before I forget it again since it's non-obvious and I haven't seen it spelled out before: the "only one impl allowed, except that the declaration can include a default body" semantics work fine for the panic handler but not for the global allocator. The panic handler is declared and used in core, std supplies a definition that can't be overridden once it's pulled in. In contrast, the global allocator is declared and used in alloc, but std effectively adds a "default" implementation that can be overridden.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.