
fn_cast! macro #140803


Open
Darksonn opened this issue May 8, 2025 · 11 comments
Labels

  • A-control-flow-integrity: Area: Control Flow Integrity (CFI) security mitigation
  • A-rust-for-linux: Relevant for the Rust-for-Linux project
  • A-sanitizers: Area: Sanitizers for correctness and code quality
  • C-discussion: Category: Discussion or questions that doesn't represent real issues.
  • I-lang-nominated: Nominated for discussion during a lang team meeting.
  • P-lang-drag-2: Lang team prioritization drag level 2. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang.
  • PG-exploit-mitigations: Project group: Exploit mitigations
  • T-lang: Relevant to the language team, which will review and decide on the PR/issue.

Comments

@Darksonn
Contributor

Darksonn commented May 8, 2025

Since Rust 1.76 we document that it's valid to transmute function pointers from one signature to another as long as their signatures are ABI-compatible. However, we have since learned that these rules may be too broad and allow some transmutes that it is undesirable to permit. Specifically, transmutes that change the pointee type or constness of a pointer argument are considered ABI-compatible, but they are rejected by the CFI sanitizer as incompatible. See rust-lang/unsafe-code-guidelines#489 for additional details and #128728 for a concrete issue.

This issue tracks a proposed solution to the above: Introduce a new macro called fn_cast! that allows you to change the signature of a function pointer. Under most circumstances, this is equivalent to simply transmuting the function pointer, but in some cases it will generate a new "trampoline" function that transmutes all arguments and calls the original function. This allows you to perform such function casts safely without paying the cost of a trampoline when it's not needed.
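To make the two strategies concrete, here is a minimal sketch of what they look like when written by hand today. The names `read_value`, `cast_by_transmute`, `read_value_trampoline`, and `cast_by_trampoline` are hypothetical and chosen only for illustration; the proposed `fn_cast!` would pick between the two approaches automatically.

```rust
use std::mem::transmute;

/// Original function: takes a shared reference.
pub fn read_value(x: &u32) -> u32 {
    *x
}

/// Strategy 1: reinterpret the function pointer under a different signature.
/// No extra code is generated, but CFI may reject calls through the result.
pub fn cast_by_transmute() -> unsafe fn(*const u32) -> u32 {
    // SAFETY: `&u32` and `*const u32` are documented as ABI-compatible
    // argument types, so the two signatures are ABI-compatible.
    unsafe { transmute::<fn(&u32) -> u32, unsafe fn(*const u32) -> u32>(read_value) }
}

/// Strategy 2: a trampoline whose real signature is the target signature.
/// It converts the argument and then calls the original function.
unsafe fn read_value_trampoline(x: *const u32) -> u32 {
    // SAFETY: the caller promises that `x` points to a live, aligned `u32`.
    read_value(unsafe { &*x })
}

pub fn cast_by_trampoline() -> unsafe fn(*const u32) -> u32 {
    read_value_trampoline
}
```

Both pointers behave the same when called with a valid pointer; the difference is only whether an extra function exists in the binary and whether a CFI check at the call site sees a matching signature.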

The argument to fn_cast!() must be an expression that evaluates to a function item or a non-capturing closure. This ensures that the compiler knows which function is being called at monomorphization time.

As a sketch, you can implement a simple version of the macro like this:

macro_rules! fn_cast {
    ($f:expr) => {
        #[cfg(not(any(sanitize = "cfi", sanitize = "kcfi")))]
        {
            // we need $f coerced to a function pointer
            core::mem::transmute::<fn(_) -> _, _>($f)
        }
        
        #[cfg(any(sanitize = "cfi", sanitize = "kcfi"))]
        {
            |arg| {
                let arg = core::mem::transmute(arg);
                let ret = $f(arg);
                core::mem::transmute(ret)
            }
        }
    };
}

This implementation should get the point across, but it is incomplete for a few reasons:

  • It assumes that the function takes one argument, but a real fn_cast! should be improved to work with functions of any arity.
  • With CFI, it always generates a trampoline using a closure. However, if this were a compiler built-in, then it could modify the list of signatures allowed by the target function so that CFI does not reject the call. The trampoline would only be needed if the function is in a different compilation unit.
  • With KCFI, we can't add signatures to the target function, but we still don't always need a trampoline. For example, changing fn(&T) to fn(*const T) is allowed because &T and *const T are treated the same by KCFI. The compiler could detect such cases and emit a transmute instead of a trampoline.
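As a rough illustration of the first limitation, the following sketch shows how a user-level macro could generate a trampoline for any number of arguments if the caller spells out the target signature. The macro name `trampoline_cast!`, its syntax, and the helper `scaled_sum` are all hypothetical; this is not the proposed `fn_cast!`, it always emits a trampoline, and it only works for paths to non-generic function items.

```rust
/// Hypothetical helper: builds a trampoline with the requested signature
/// that transmutes every argument and the return value.
macro_rules! trampoline_cast {
    ($f:path as unsafe fn($($arg:ident : $ty:ty),*) -> $ret:ty) => {{
        unsafe fn trampoline($($arg: $ty),*) -> $ret {
            // SAFETY: the user of the macro promises that each argument type
            // and the return type can be transmuted to the types of `$f`.
            unsafe { ::core::mem::transmute($f($(::core::mem::transmute($arg)),*)) }
        }
        // Coerce the trampoline item to a function pointer.
        trampoline as unsafe fn($($ty),*) -> $ret
    }};
}

/// Example target with two parameters.
pub fn scaled_sum(base: &u32, scale: u8) -> u32 {
    *base * u32::from(scale)
}

/// Calls `scaled_sum` through a trampoline that takes a raw pointer.
pub fn call_through_trampoline(value: &u32, scale: u8) -> u32 {
    let f = trampoline_cast!(scaled_sum as unsafe fn(base: *const u32, scale: u8) -> u32);
    // SAFETY: the pointer comes from a live reference.
    unsafe { f(value, scale) }
}
```

Because the trampoline's own signature is exactly the target signature, a CFI check at the call site should see a match; the cost is one extra function per cast.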

By adding this macro, it becomes feasible to make the following breaking change to the spec:

When you make a function call, then the caller and callee must agree on what the function signature is exactly. Otherwise:

  • If the signatures are ABI-compatible, then it is EB (erroneous behavior). That is, similarly to integer overflow, sanitizers such as cfi, kcfi, or miri could trigger an error when it happens. But otherwise the call is allowed through by transmuting each argument.
  • Otherwise, it is UB (undefined behavior).

Here, the change is that ABI-compatible calls are considered EB. However, even without the spec change the macro is useful because it would allow for a more efficient implementation of #139632 than what is possible today.

This proposal was originally made as a comment. I'm filing a new issue because T-lang requested that I do so during the RfL meeting 2025-05-07.

@rustbot rustbot added needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. A-control-flow-integrity Area: Control Flow Integrity (CFI) security mitigation A-rust-for-linux Relevant for the Rust-for-Linux project A-sanitizers Area: Sanitizers for correctness and code quality C-discussion Category: Discussion or questions that doesn't represent real issues. I-lang-nominated Nominated for discussion during a lang team meeting. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels May 8, 2025
@RalfJung
Member

RalfJung commented May 8, 2025

When you make a function call, then the caller and callee must agree on what the function signature is exactly.

So in such a world, the docs for the macro would say that this generates a new function? Because otherwise it seems like this list here has to account for the macro as well.

The macro needs to be unsafe of course, since function arguments are still being transmuted. We could have the macro ensure that the signatures are ABI-compatible -- but this can only be fully checked during monomorphization.

@Darksonn
Contributor Author

Darksonn commented May 8, 2025

Well, yes, it semantically creates a new function even if it has the same address. How exactly we word that is up for debate. I guess we might not want provenance for function pointers (?), so if fn_cast! returns a fn pointer with the same address, then we probably have to say that this function is valid to call with those two signatures.

@RalfJung
Member

RalfJung commented May 8, 2025

I guess we might not want provenance for function pointers (?)

I mean, we could.^^ But yeah it's probably better if we avoid using provenance wherever possible.

@rcvalle rcvalle added the PG-exploit-mitigations Project group: Exploit mitigations label May 9, 2025
@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label May 14, 2025
@traviscross traviscross added the P-lang-drag-2 Lang team prioritization drag level 2.https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang. label May 16, 2025
@traviscross
Contributor

The macro needs to be unsafe of course, since function arguments are still being transmuted.

We have unsafe function pointers. So I wonder, should the macro call be unsafe, or should we be returning an unsafe fn(..) -> _?

We could have the macro ensure that the signatures are ABI-compatible -- but this can only be fully checked during monomorphization.

If we were to do these checks, I wonder whether we might want to support this as a coercion or as a cast. E.g., we of course allow:

let _: *const () = &();

Does it make any sense to also allow the following?

let _: unsafe fn(*const ()) = |&()| ();
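For comparison, this sketch shows the coercions that stable Rust already performs in this area: a reference coerces to a raw pointer, and a non-capturing closure coerces to an `unsafe fn` pointer, but only when the parameter types match exactly. The function name `existing_coercions` is hypothetical. The line asked about above would additionally change a parameter type from a reference to a raw pointer, which is not accepted today.

```rust
/// Both coercions below are accepted by stable Rust today.
pub fn existing_coercions() -> u32 {
    let value = 41u32;

    // Reference-to-raw-pointer coercion, as in `let _: *const () = &();`.
    let p: *const u32 = &value;

    // A non-capturing closure coerces to an `unsafe fn` pointer when the
    // closure's parameter types already match the pointer's signature.
    let bump: unsafe fn(&u32) -> u32 = |x: &u32| *x + 1;

    // SAFETY: `p` was derived from a reference to `value`, which is still alive.
    unsafe { bump(&*p) }
}
```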

@RalfJung
Member

I don't think I'd like to make as do even more things...

@Darksonn
Contributor Author

The macro itself needs to be unsafe. Otherwise, how do people get a non-unsafe fn pointer? By transmuting the output of fn_cast!? The point of the macro is to get rid of the transmute.

@hanna-kruppe
Contributor

hanna-kruppe commented May 17, 2025

The macro needs to be unsafe of course, since function arguments are still being transmuted.

We have unsafe function pointers. So I wonder, should the macro call be unsafe, or should we be returning an unsafe fn(..) -> _?

While the function signature change itself can’t cause UB without the function being called, asking the call sites to justify the safety of the implied arg/result transmutes leads to somewhat silly consequences:

  1. The reason why the type punning is sound is usually the same at every call site, and competes for attention with other safety preconditions the underlying function may have, so you’ll basically always want to introduce a safe(r) wrapper if possible.
  2. Call sites don’t automatically know the underlying function, so they don’t even know what the type punning they have to justify is.
  3. In contrast, the code using fn_cast generally knows the original signature so it could document this (but then unsafe code elsewhere has to rely on this documentation!) or it could immediately create the safe wrapper itself (which is basically the same as if the fn_cast was unsafe to begin with).
  4. Having to define a wrapper function at all is undesirable - it produces more code and indirections that fn_cast was specifically designed to avoid when compiling without CFI.

It’s tempting to say: fn_cast is most useful for function pointers and you can just transmute from unsafe fn(..) -> _ to fn(..) -> _ if it’s still safe to call. But transmuting function pointers is precisely what fn_cast is supposed to replace! Of course it’s unlikely that some CFI scheme wants to consider safe/unsafe variants of the same signature to be incompatible. But it still sends a less consistent message to users and is easier to get wrong (accidentally change more about the signature than just the safety).

At the same time, there are cases where a safe function is type-punned into something that creates significant extra safety conditions for callers (e.g., type erasing fmt methods into fn(*const(), &mut Formatter) -> fmt::Result). For these cases, producing unsafe functions even from safe source functions is useful, and if it’s not done by fn_cast then the code using that macro again has to add wrappers.

So I think it’s probably most useful to consider “safe fn <-> unsafe fn” to be part of the type punning that fn_cast can do, and require unsafe for any invocation of fn_cast regardless of whether it produces a safe or unsafe function.
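The fmt type-erasure case mentioned above can be sketched with a hand-written generic trampoline, which is roughly what code has to do today without `fn_cast!`. The names `ErasedDisplay` and `display_trampoline` are hypothetical and not taken from the standard library; the point is only that the erased function pointer is unsafe to call even though `Display::fmt` itself is safe.

```rust
use std::fmt;
use std::marker::PhantomData;

/// A value whose concrete type has been erased, kept together with a
/// function pointer that knows how to format it.
pub struct ErasedDisplay<'a> {
    data: *const (),
    fmt_fn: unsafe fn(*const (), &mut fmt::Formatter<'_>) -> fmt::Result,
    _borrow: PhantomData<&'a ()>,
}

/// Trampoline: recovers the concrete type and forwards to `Display::fmt`.
unsafe fn display_trampoline<T: fmt::Display>(
    data: *const (),
    f: &mut fmt::Formatter<'_>,
) -> fmt::Result {
    // SAFETY: the caller promises that `data` points to a live `T`.
    let value = unsafe { &*(data as *const T) };
    fmt::Display::fmt(value, f)
}

impl<'a> ErasedDisplay<'a> {
    pub fn new<T: fmt::Display>(value: &'a T) -> Self {
        Self {
            data: value as *const T as *const (),
            fmt_fn: display_trampoline::<T>,
            _borrow: PhantomData,
        }
    }
}

impl fmt::Display for ErasedDisplay<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // SAFETY: `data` and `fmt_fn` were created together in `new` for the
        // same `T`, and the borrow in `_borrow` keeps the value alive.
        unsafe { (self.fmt_fn)(self.data, f) }
    }
}
```

With a built-in `fn_cast!`, the trampoline could in principle be replaced by a cast of the `fmt` method itself when no CFI scheme requires the extra function.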

@traviscross
Contributor

The point of the macro is to get rid of the transmute.

Interesting. That's not how I think about it. I think of the point of the built-in as being to do something a lot smarter than what's otherwise possible so as to support CFI, in terms of modifying the list of signatures when it can, generating and using a trampoline only when needed, etc. If it were just about getting rid of a transmute, I don't think we'd do this.

...and you can just transmute from unsafe fn(..) -> _ to fn(..) -> _ if it’s still safe to call.

Perhaps you could describe the use case you have in mind for when the cast function pointer will be safe to call. What's coming to my mind, in terms of practical use cases, are all ones where it would not be.

@hanna-kruppe
Contributor

hanna-kruppe commented May 17, 2025

Let me adjust my phrasing: yes, CFI compatibility is ultimately "the point" but I don't think this can be usefully separated from removing function pointer transmutes. To make CFI work, you need an intentional marker for "this specific function can also be called with this specific signature different from what its definition said" (which then enables e.g. generating the right trampoline if one is needed) rather than transmutes that leave you guessing whether the signature mismatch may be unintentional. Carving out a subset of such transmutes that are "still okay" after the introduction of fn_cast sounds like a bad idea: you'll still have calls not matching the callee signature, you're just hoping that this subset won't cause any problems.

Perhaps you could describe the use case you have in mind for when the cast function pointer will be safe to call. What's coming to my mind, in terms of practical use cases, are all ones where it would not be.

This example is a bit speculative for several reasons, but it's inspired by real code I'm working on. Consider a library that defines a trait for "fieldless #[repr(u8)] enum with consecutively numbered discriminants" as well as arrays/bitsets/maps generic over such enum types as array index, set element, or map key (there are several libraries like this on crates.io, mine isn't (yet)). Virtually all of the code in such a library could type-erase the enum and often also the number of variants (cf. core::array::from_fn and friends internally erasing the length), and this leads to a bunch of type-punning — including some type punning of function signatures that's safe without further consideration about the range of integers that will be passed through it. For example, if Array<I, T> is a glorified wrapper around [T; I::VARIANT_COUNT], then we might have something like the following:

impl<I: /* ... */, T> Array<I, T> {
    fn for_each_with_index_erased<F: FnMut(I, &T)>(&self, f: F) {
        // SAFETY: `I` is a `repr(u8)` enum, so it's sound to transmute into u8
        for_each_with_index_raw(&self.0, unsafe { fn_cast!(f) })
    }
}

// Type-erased worker: knows neither the enum type nor the array length.
fn for_each_with_index_raw<T>(elems: &[T], f: fn(u8, &T)) {
    for (i, elem) in elems.iter().enumerate() {
        f(i as u8, elem);
    }
}
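For reference, a version of this pattern that compiles today can be sketched with a wrapper closure and dynamic dispatch instead of `fn_cast!`. The enum `Color`, the fixed length of 3, and the function names are assumptions made for this sketch. The wrapper is exactly the extra indirection that `fn_cast!` is meant to avoid when CFI is off, and in this version the transmute from `u8` back to the enum relies on the index staying below the variant count.

```rust
/// Fieldless enum with consecutive discriminants starting at zero.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// Type-erased worker: sees only `u8` indices.
fn for_each_with_index_raw<T>(elems: &[T], f: &mut dyn FnMut(u8, &T)) {
    for (i, elem) in elems.iter().enumerate() {
        f(i as u8, elem);
    }
}

/// Typed front end: wraps the callback in a closure that converts the index.
pub fn for_each_with_index<T>(elems: &[T; 3], mut f: impl FnMut(Color, &T)) {
    for_each_with_index_raw(elems, &mut |i: u8, elem: &T| {
        // SAFETY: `elems` has exactly 3 entries, so `i` is 0, 1, or 2, and
        // each of those is a valid discriminant of the `repr(u8)` enum.
        let index = unsafe { core::mem::transmute::<u8, Color>(i) };
        f(index, elem)
    });
}
```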

One reason this is speculative is that I don't think there's a bound I could put on the type parameter F to make sure fn_cast can handle it (e.g., no captures and it's not a function pointer already). But this might be possible in the future, and even if not, it can be worked around by making the API much less ergonomic while keeping the safety-relevant aspects intact.

@hanna-kruppe
Contributor

hanna-kruppe commented May 17, 2025

Another reason why the above example is speculative: the proposal says that the function being cast can’t have any captures if it’s a closure. It’s not clear to me why that restriction would be needed. Couldn’t the compiler generate another closure type that has the same captures, is ABI-compatible with the original closure type, and implement the appropriate Fn* traits for it by fn_cast-ing away the difference in receiver type and other parameters? The resulting closure type won’t be convertible to a function pointer either, but it could be used as trait object in the same way a fn_cast’d function pointer can be used for a manually constructed vtable.

Of course this doesn’t have to be part of the initial feature but I’d like to know if it’s possible in principle or if there’s a fundamental problem I’m missing.

@RalfJung
Member

RalfJung commented May 18, 2025

In discussion with @Darksonn I toyed with the idea that repr(transparent) could still be allowed to differ across caller and callee even without using the macro (i.e. Wrapper<T> on one side and T on the other) -- that is apparently trivial for CFI to handle, and it'd reduce the amount of churn needed in the ecosystem to adjust to this change.
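A minimal sketch of the repr(transparent) case, using a hypothetical wrapper type `Meters` around `f64`: the callee is declared in terms of the wrapper while the caller uses the bare field type. Under the ABI-compatibility rules as currently documented this transmute is permitted; whether a given CFI scheme accepts the call is the open point discussed here.

```rust
use std::mem::transmute;

/// Hypothetical newtype with the same ABI as its single field.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Meters(pub f64);

/// Callee: written in terms of the wrapper type.
pub fn double(m: Meters) -> Meters {
    Meters(m.0 * 2.0)
}

/// Caller side: views the same function as operating on plain `f64`.
pub fn double_as_raw() -> fn(f64) -> f64 {
    // SAFETY: `Meters` is `repr(transparent)` over `f64`, so the two
    // signatures are ABI-compatible, and every `f64` is a valid `Meters`.
    unsafe { transmute::<fn(Meters) -> Meters, fn(f64) -> f64>(double) }
}
```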


7 participants