Consider changing the representation for fn-item types based on their ABI #40744

Closed
nikomatsakis opened this issue Mar 22, 2017 · 26 comments
Labels
A-ffi Area: Foreign Function Interface (FFI) C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Contributor

nikomatsakis commented Mar 22, 2017

Currently, the types for fn-items are always zero-sized. As described in #19925, this was defined in RFC 401, but the implementation lagged quite a bit from 1.0, and as a result a number of crates had to be converted because they were relying on transmutes from fn-item types into fn pointers -- generally speaking, these pointers were used in FFI situations. We only recently ended the "future incompatibility" warning period and made this into a hard error.

Towards the end of that FCP, @wycats floated the interesting idea that we could make the representation of an fn item dependent on its declared ABI. In particular, while the type of a standard Rust fn foo() would still be a zero-sized type, the type of an extern "C" fn foo() (or, presumably, any ABI intended for FFI, which basically means everything but Rust) would change from being zero-sized to being a function pointer:

fn foo() { } // sizeof::<typeof(foo)>() == 0

extern "C" fn bar() { } // sizeof::<typeof(bar)>() == sizeof::<usize>()

Note that the fn item would still have a unique type, it would just not be zero-sized. Its value would be the same as an equivalent function pointer (in this case, extern "C" fn()). Because each fn item still has a unique type, it means that if you have a higher-order fn like fn map<T: Fn()>() and you invoke it with map(bar), we will still monomorphize a version of map that statically dispatches to bar. This was the primary motivation for RFC 401.

We could indeed go further, and just say that the type for any fn-item has the same representation as a fn-pointer, even if it is not precisely the same type. (This was the case before we made the changes that led to #19925, in fact.) This would mean that it is always safe to transmute a fn item to its corresponding fn() type.

Strictly speaking, changing anything here would be a breaking change from the current behavior, and we would want to test its impact. My hypothesis, though, is that it would only make some crates work that are currently broken. It seems unlikely -- but not impossible -- that people are relying on the fact that fn types are currently zero-sized.

Nominating for @rust-lang/lang discussion. I'm not sure of the best way to proceed with this idea and want to talk it over. My motivation here is both to "unbreak" a certain number of crates and to potentially ease an ergonomic sore spot with FFI. I know that there is vociferous opposition from others, however, and @eddyb in particular doesn't want to encourage the use of transmute. This comment from @wycats is also worth reading.

@nikomatsakis nikomatsakis added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Mar 22, 2017
@eddyb
Member

eddyb commented Mar 22, 2017

I was going to suggest replacing the cases where you'd cast function pointers in C with varargs coercions.
That is, when the signature isn't the same on every call, which is the case for e.g. some callbacks, you would take a regular Rust extern fn and pass it where, e.g. extern fn(i32, ...) is expected.
But then I was told that might be UB, so that's less appealing now, if that is indeed the case.

@nagisa
Member

nagisa commented Mar 22, 2017

I would rather we kept uniformity here (between extern "Rust" and extern everything else).


C itself has a different "syntax" for function pointers (or more correctly its types):

void (*bar)(); // declaration of the bar function pointer

The equivalent of C’s

void (*bar)() = banana;

works just fine in rust:

let x: extern fn() = banana;

due to coercions. A common use case of passing a "callback" function should work just as well:

extern {
    fn qsort(base: *const (), nmemb: usize, size: usize, compar: extern fn(*const (), *const ()) -> i32);
}
extern fn comp(x: *const (), y: *const ()) -> i32 { 0 }
fn main() {
    // Calling a foreign function requires `unsafe`; `comp` coerces to the fn pointer type.
    unsafe { qsort(::std::ptr::null(), 0, 0, comp) }
}

So, I find @wycats’s argument, or the need to transmute, mostly unconvincing.

@eddyb
Member

eddyb commented Mar 22, 2017

@nagisa There are a few C APIs which need some sort of "opaque-yet-still-function" pointer.
If it's just the pointee type of one argument, that's easy, you cast that argument in the Rust function.

@petrochenkov
Contributor

petrochenkov commented Mar 22, 2017

A possibly better alternative to using transmute is to add a new intrinsic fn reify<T, U>(T) -> U, where T is required to be a function item type and U a corresponding fn pointer type.
Pros:

  • type inference for the output type still works
  • transmute is not encouraged
  • fn type sizes are still uniform, reify doesn't care about size equality
  • reify, unlike transmute, can be potentially marked as safe one day
  • there's a chance the "magic" requirements of reify like "T is a function item" will be expressible with traits one day

@eddyb
Member

eddyb commented Mar 22, 2017

@petrochenkov I don't get the usecase though. If you unify (if-else, match or arrays) two different functions with the same signature, they will both reify-coerce, and if you're passing a function to something expecting a function pointer, it will reify-coerce yet again (this is actually older than the unify coercions).

The only real usecase of transmute on function pointers that I know of is when C expects something more vague than the function you have, which is where varargs, when they're not UB, would be perfect.
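A small sketch of the coercions described here, with made-up function names (`add_one`, `double`, `pick`, `apply`). Each fn item has its own type, yet all three sites below end up with a plain `fn(i32) -> i32`:

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

fn double(x: i32) -> i32 {
    x * 2
}

fn pick(use_double: bool) -> i32 {
    // Two different fn item types meet in one `if`/`else`, so both
    // branches are reified to the common type `fn(i32) -> i32`.
    let f = if use_double { double } else { add_one };
    f(4)
}

// Expects a fn pointer; a fn item passed here is coerced at the call site.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn main() {
    // Array elements unify the same way.
    let table = [add_one, double];
    assert_eq!(table[0](1), 2);
    assert_eq!(table[1](5), 10);

    assert_eq!(pick(true), 8);
    assert_eq!(pick(false), 5);
    assert_eq!(apply(add_one, 41), 42);
}
```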

@petrochenkov
Contributor

@eddyb
My assumption was that people actually want transmute for some reasons, so I proposed a replacement.
If transmute or equivalent is not wanted, then this whole issue can be closed as not-an-issue.

@eddyb
Member

eddyb commented Mar 22, 2017

@petrochenkov The only use I've seen in the wild that can't be replaced with simply letting functions coerce to function pointers and possibly casting some pointers in the bodies of those functions, is that of incomplete function types in ABIs, i.e. where C more or less takes void(*)(), not void(*)(void).
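To make that remaining case concrete, here is a hedged sketch of storing a function behind an "opaque" function pointer type. `OpaqueFn`, `erase`, and `restore` are hypothetical names, not part of any real API. The transmute is between two fn pointer types, so the explicit coercion to a fn pointer happens first, and the opaque value is only called after being converted back to its real signature:

```rust
use std::mem::transmute;

// Stand-in for C's `void (*)()`: a function pointer of unspecified signature.
type OpaqueFn = unsafe extern "C" fn();

extern "C" fn add_one(x: i32) -> i32 {
    x + 1
}

// Forget the real signature. Taking a fn pointer as the parameter means a
// fn item passed here is reified by coercion before the transmute.
fn erase(f: extern "C" fn(i32) -> i32) -> OpaqueFn {
    unsafe { transmute::<extern "C" fn(i32) -> i32, OpaqueFn>(f) }
}

// Recover the real signature. The caller must guarantee that `f` was
// produced by `erase`; calling through a wrong signature is undefined behavior.
unsafe fn restore(f: OpaqueFn) -> extern "C" fn(i32) -> i32 {
    unsafe { transmute::<OpaqueFn, extern "C" fn(i32) -> i32>(f) }
}

fn main() {
    let opaque = erase(add_one);
    let typed = unsafe { restore(opaque) };
    assert_eq!(typed(41), 42);
}
```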

@nagisa
Member

nagisa commented Mar 23, 2017

So I've thought of "benefits" to keep all functions zero sized. I think these reasons are pretty convincing:

  • Passing sized functions will "use" the 8 bytes/register where none is necessary. This is at odds with Rust's zero-cost philosophy;
  • Calling these functions from the Rust side will require indirect calls in all cases or, alternatively, with rigorous optimisation, indirect calls in more non-obvious cases... also at odds with the philosophy.

Let me expand on the 2nd point. Provided you have a named extern function (currently 0 sized) on hand and cast some valid function pointer to said type with code like this:

// Pardon the typos, phone mode

fn c<U, T>(p: *mut *const U, _witness: T) -> *mut T { p as *mut T }
extern fn banana(){}
let myfn = unsafe { *c(someptr, banana) };
myfn()

Currently you cannot pretend that this value holds some other pointer, because it is zero-sized. But if it were a pointer, one could argue that Rust still calling banana instead of the stored pointer is wrong. And to make that possible, indirect calls would have to happen.

@nikomatsakis
Contributor Author

We discussed this some in the @rust-lang/lang meeting. I want to lay out first a bit of where I am coming from. I don't actually care all that much about whether the fn type is zero-sized or not: I care about how ergonomic it is to pass fn ptrs into C code.

I feel that right now it can often be kind of a pain, but I'd like to come up with some convincing examples. I think part of the problem is that you wind up having to write out a lot of detail in your cast (e.g., x as fn(T1, T2)). The reify() idea that @petrochenkov proposed might be an improvement here.

But even a bit larger, I guess, I really feel like the "unique type for a fn item" thing is a kind of confusion point that I'd prefer to avoid. That is, I think that was a good decision, and it fits neatly into the system, but the more we can make it so that you never have to know about it, the better. To that end, the reify() solution doesn't seem to get us as far as I'd hope to go.

Anyway, we also talked in the @rust-lang/lang meeting about how this is in the context of @wycats trying to "push down" the Rust types as close to the FFI as they can get them -- which I think is a laudable goal, in that it largely reduces boilerplate and improves finding errors. However, it does hit some nasty complications about things like painful ABI details and so forth.

This got us to wondering if we could improve the FFI ergonomics here specifically not through changing the language but rather through making more progress on the "bindgen-like" tool that has been floated from time to time. Specifically, a tool that helps you to auto-generate boilerplate wrappers and the like. So in this case I could imagine a wrapper that lets you give the "C type" of a function and also a Rust type, and kind of "auto-coerces" between the two to make it work. That's pretty vague, obviously, but seems interesting.

@nikomatsakis
Contributor Author

@nagisa Ah, I remember now the reason I thought zero-sized types were a slam dunk -- at least for Rust functions. If you want (today) to store a closure in a struct, and you want to use a trait object, then you have to have the type Box<FnMut()>. If this is a true closure, that will require an allocation to store the environment, and there's not much you can do about it. But if it is a free function, since they are zero-sized, it will not. And that seemed like a really nice thing -- particularly since the pointer will not be used. And moreover iirc that same "won't allocate" property was true in the olden, olden days when coercing a fn item to a proc() (or however that type was written).
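A rough way to observe this, assuming the current `Box<dyn FnMut()>` spelling of the trait object type. `env_size` and `hello` are hypothetical names; the helper reports how many bytes of environment sit behind the trait object:

```rust
use std::mem::size_of_val;

fn hello() {}

// Size of the value behind the trait object, read from its vtable.
fn env_size(f: &dyn FnMut()) -> usize {
    size_of_val(f)
}

fn main() {
    // A free function: the boxed value is zero-sized, and `Box::new`
    // does not allocate for zero-sized values.
    let free: Box<dyn FnMut()> = Box::new(hello);
    assert_eq!(env_size(&*free), 0);

    // A capturing closure carries its environment (here one `u64`),
    // which has to live on the heap.
    let mut count = 0u64;
    let capturing: Box<dyn FnMut()> = Box::new(move || count += 1);
    assert_eq!(env_size(&*capturing), 8);
}
```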

@joshtriplett
Member

One other possibility, to simplify passing function pointers without using transmute or writing out the complete type: we could introduce a new trait implemented on all appropriate function types, providing a function that returns a sized function pointer. Then you could just call .fn_ptr() or similar in an FFI argument.

@solson
Member

solson commented Mar 24, 2017

Whatever our solution here ends up being, I think it would be the wrong approach if it's specific to function item types. Passing a non-capturing closure (which on nightly is allowed to coerce to a function pointer) should be just as ergonomic.
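As a short illustration of that coercion, which has since been stabilized, the sketch below passes a named function and a non-capturing closure to the same fn-pointer parameter. `apply` and `named` are made-up names:

```rust
// Takes a plain Rust-ABI fn pointer.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn named(x: i32) -> i32 {
    x + 1
}

fn main() {
    // A fn item coerces to the fn pointer type...
    assert_eq!(apply(named, 1), 2);
    // ...and so does a closure that captures nothing.
    assert_eq!(apply(|x| x * 3, 2), 6);
}
```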

@solson
Member

solson commented Mar 24, 2017

@joshtriplett If I understand correctly, that's equivalent to @petrochenkov's reify idea.

@eddyb
Member

eddyb commented Mar 24, 2017

I don't get the reify idea, functions do coerce to function pointer types.

I wish we hadn't allowed casts from function types to pointers and integers, because then we really wouldn't need ABI strings on anything except functions exported through #[no_mangle] / #[link_name = "..."], everything else could be done at coercion time.
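For context, these are the casts in question, sketched with a placeholder function `foo` and a hypothetical helper `address_of_foo`. Because the address is observable this way, the compiler has to commit to one concrete entry point for each function:

```rust
fn foo() {}

fn address_of_foo() -> usize {
    // Direct cast from the fn item type to an integer.
    foo as usize
}

fn main() {
    // Direct cast from the fn item type to a raw pointer.
    let raw = foo as *const ();
    // Same idea with the reification to a fn pointer spelled out first.
    let via_ptr = (foo as fn()) as usize;

    assert!(!raw.is_null());
    assert_ne!(address_of_foo(), 0);
    assert_ne!(via_ptr, 0);
}
```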

@nikomatsakis
Contributor Author

@eddyb

I wish we hadn't allowed casts from function types to pointers and integers, because then we really wouldn't need ABI strings on anything except functions exported through #[no_mangle] / #[link_name = "..."], everything else could be done at coercion time.

Can you elaborate?

@joshtriplett
Member

@solson I assumed that such a trait would work for closures as well.

@retep998
Member

@nikomatsakis

Can you elaborate?

It means I'd be able to define a single fn foo() { ... } and then pass it to anything that accepts a function pointer with those arguments regardless of calling convention. It could coerce to extern "cdecl" fn, extern "system" fn, extern "C" fn, and so on, where rustc would monomorphize the function to the given calling convention. Essentially functions would be implicitly generic over the calling convention.
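Nothing like this exists in the language today. As a rough sketch, the wrappers such a coercion would presumably have to generate can be written by hand, one per calling convention (`foo_c` and `foo_system` are hypothetical names):

```rust
// The single function the author actually wants to write.
fn foo(x: i32) -> i32 {
    x * 2
}

// Hand-written shims: same body, different calling convention.
extern "C" fn foo_c(x: i32) -> i32 {
    foo(x)
}

extern "system" fn foo_system(x: i32) -> i32 {
    foo(x)
}

fn main() {
    let c: extern "C" fn(i32) -> i32 = foo_c;
    let s: extern "system" fn(i32) -> i32 = foo_system;
    assert_eq!(c(21), 42);
    assert_eq!(s(21), 42);
}
```

With the suggested behavior, `foo` itself could be passed in both places and the shims would be synthesized at coercion time.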

@nikomatsakis
Contributor Author

Just to clear up something. @nagisa wrote:

Calling these functions from the Rust side will require indirect calls in all cases or, alternatively, with rigorous optimisation, indirect calls in more non-obvious cases... also at odds with the philosophy.

But I don't think this is true. Just because the representation is a pointer, doesn't mean we need to do an indirect call -- after all, it's still a unique type, and hence it can only have one value.

@nikomatsakis
Contributor Author

nikomatsakis commented Mar 24, 2017

@retep998 @eddyb

It means I'd be able to define a single fn foo() { ... } and then pass it to anything that accepts a function pointer with those arguments regardless of calling convention.

OK, then can you elaborate on how that would be possible? Would we be synthesizing all the wrappers we need statically, you're saying?

And then @eddyb is saying that, because you can coerce to an integer, this implies that it would be observable? If so, I don't see why that's such a big deal. We can say that when you coerce to an integer, you get the "Rust" ABI "wrapper" or something. (Or: the "default" wrapper, whichever that is; i.e., we reinterpret extern "C" as making "C" the default.)

@solson
Member

solson commented Mar 24, 2017

(We should also deprecate coercing functions to integers with a warning and remove it in Rust 2.0. 😉)

@arielb1
Contributor

arielb1 commented Mar 24, 2017

@eddyb

I am quite sure extern "C" function pointers are necessary because they can be returned from C APIs (or present in structs returned from C APIs etc), and therefore have to be types at the Rust level anyway. And if we have function pointer ABIs, there's no real reason not to have function item ABIs (we could make all fn items extern "Rust" and have reification create a shim if needed, but I don't see any real reason - and this makes externs more complicated because you can't use type-based dispatch).

@eddyb
Member

eddyb commented Mar 25, 2017

@arielb1 @nikomatsakis Others have explained it already but I missed a crucial point:
Right now the coercion from a closure to a function pointer only works with the Rust ABI.
I see no reason why that should be the case, and uniformity with function items led to my comment.

@alexcrichton
Member

A recent cargobomb report had four regressions (here's one) related to the warning turning into a hard error in Rust 1.17.

@Mark-Simulacrum Mark-Simulacrum added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Jul 27, 2017
@steveklabnik
Member

Triage: I remember this being a situation way back when, but it's been a year and a half. Should we do anything here?

@jonas-schievink jonas-schievink added the A-ffi Area: Foreign Function Interface (FFI) label Jun 13, 2020
@jonas-schievink
Contributor

Triage: It doesn't seem like much is going to happen here, so maybe this should be closed?

@nikomatsakis
Contributor Author

Agreed, we're not changing anything here.
