Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split_closure: Improve unsafe-correctness + minor improvements #3

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

danielhenrymantilla
Copy link

split_closure does not need to be marked unsafe: one cannot cause unsoundness with the obtained raw pointer and unsafe fn function pointer, unless further unsafe is used.

Hence removing the unsafe-ness (to call) of split_closure.

That being said, the pattern unsafe { unsafe_fn_ptr(void_ptr, <args...) } is, on the other hand, expected to be sound, at least if done within the lifetime of the borrow over C.

This is "problematic" because Split itself is not marked unsafe, so once could go and write a troll / evil implementation of Split, with which the aformentioned pattenr would be unsound.

The obvious solution would be to have Split become an unsafe trait, but after thinking about it, I didn't like that approach: given the lack of variadic generics, it is almost impossible to explicitly name / describe within the type system that the Trampoline is expected to be some unsafe fn pointer, which ought to be extracted out of an FnMut closure, and which must be sound to call with an unaliased raw pointer to that closure. Because of that, the other approach is to simply Seal the trait within the privacy boundary, so that all the logic related to the trait is guaranteed to be self-contained within this one file.


Also, I've removed the unneeded Ret param of Split, and made obtaining the trampoline const.

`split_closure` does not need to be marked `unsafe`: one cannot cause unsoundness with the obtained raw pointer and `unsafe fn` function pointer, unless further `unsafe` is used.

Hence removing the `unsafe`-ness (to call) of `split_closure`.

That being said, the pattern `unsafe { unsafe_fn_ptr(void_ptr, <args...) }` is, on the other hand, expected to be sound, at least if done within the lifetime of the borrow over `C`.

This is "problematic" because `Split` itself is not marked `unsafe`, so once could go and write a troll / evil implementation of `Split`, with which the aformentioned pattenr would be unsound.

The obvious solution would be to have `Split` become an `unsafe trait`, but after thinking about it, I didn't like that approach: given the lack of variadic generics, it is almost impossible to explicitly name / describe within the type system that the `Trampoline` is expected to be some `unsafe fn` pointer, which ought to be extracted out of an `FnMut` closure, and which must be sound to call with an unaliased raw pointer to that closure. Because of that, the other approach is to simply `Seal` the trait within the privacy boundary, so that all the logic related to the trait is guaranteed to be self-contained within this one file.

___

Also, I've removed the unneeded `Ret` param of `Split`, and made obtaining the `trampoline` `const`.
///
/// - The call must be performed within the same thread, unless `C` (_i.e._,
/// the environment captured by the closure) is `Send`.
pub fn split_closure<'lifetime, C, Args>(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back, I don't think we can actually do anything to make sure the *mut c_voidreturned by this function doesn't outlive the&'lifetime mut Cpassed in. Is'lifetime` just meant as a hint to the caller that they need to worry about lifetimes?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, kind of. It's so that the documention that mentions 'lifetime leads to no ambiguity. I guess a wrapper pointer-with-lifetime type here would make sense indeed.

///
/// - The call must be performed within the same thread, unless `C` (_i.e._,
/// the environment captured by the closure) is `Send`.
pub fn split_closure<'lifetime, C, Args>(
Copy link
Owner

@Michael-F-Bryan Michael-F-Bryan May 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one cannot cause unsoundness with the obtained raw pointer and unsafe fn function pointer, unless further unsafe is used

If a function has a # Safety section stating several invariants that must be upheld, shouldn't it be unsafe in turn, even if the function's body doesn't do anything unsafe?

Another thing... I think we've actually created aliased mutable pointers here because the borrow checker will think the &'lifetime mut C borrow is over by the time split_closure() returns. There's nothing linking 'lifetime and the *mut c_void... Is that an unsafe operation, or just plain UB?

Copy link
Author

@danielhenrymantilla danielhenrymantilla May 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not UB, but misusage can cause UB: lifetimes are "just" a tool for borrowck to ensure a lack of such misusage (aliasing int his case). Indeed, the current pattern cannot prevent closure from being used in between split_closure and the usage of the pointer.

let mut closure = |...| { ... };
let (env, cb) = split_closure(&mut closure);
// env ~ &mut closure, so it is sound to use, but `closure` must not be touche in the meantime
unsafe { cb(env, ...); } // OK
stuff(&closure); // `&closure` aliases `env`, so `env` is no longer unique and gets invalidated;
unsafe { cb(env, ...); } // UB

A way to "soothe" this would be through a pointer-with-lifetime wrapper:

mod boundary {
    use ::core::{marker::PhantomData, ptr};

    /// Imagine this is an UB-free `&'lifetime mut c_void`.
    pub
    struct ErasedMutRef<'lifetime> {
        ptr: ptr::NonNull<c_void>,
        _lifetime: PhantomData<&'lifetime mut ()>,
    }

    impl ErasedMutRef<'_> {
        pub
        fn as_ptr (self: &'_ Self) -> *mut c_void
        {
            self.ptr.as_ptr()
        }
    }

    impl<'lifetime, T> From<&'lifetime mut T> for ErasedMutRef<'lifetime> {
    {
        fn from (it: &'lifetime mut T) -> ErasedMutRef<'lifetime>
        {
            ErasedMutRef { ptr: ptr::NonNull::from(it).cast(), _lifetime: PhantomData }
        }
    }
}
pub use boundary::ErasedMutRef;

And then make split_closure<'lifetime, C> return
(ErasedMutRef<'lifetime>, unsafe extern fn ...),
to lead to the following pattern:

let mut closure = |...| { ... };
let (env, cb) = split_closure(&mut closure); // <- borrow starts here...
// env ~ &mut closure, so it is sound to use, but `closure` must not be touche in the meantime
unsafe { cb(env.as_ptr(), ...); } // OK
stuff(&closure); // Error, `closure` accessed while borrowed.
unsafe { cb(env.as_ptr(), ...); } // <- ... borrow later used here

If a function has a # Safety section stating several invariants that must be upheld, shouldn't it be unsafe in turn, even if the function's body doesn't do anything unsafe?

🤷
Rust made the choice to have raw pointers be safe to create, and unsafe to use. Really, if a function returns a raw pointer or an unsafe fn (funnily enough your function does both), I think the documentation has to explain when it is not unsound to use them (otherwise one must assume they unusable). I have named such explanation # Safety, but feel free to rename it, e.g., # Sound usage of the returned values / pointers, especially if it can make all this easier to understand by a reader of the documentation.

The other option, is to mark it unsafe, in the same way that raw pointers are !Sync and !Send, as a lint (with some comment saying that this function is not unsafe per se, but the usage of the obtained poitners is, and the correctness of such usage is linked to this call site (c.f., exampel above)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not UB, but misusage can cause UB: lifetimes are "just" a tool for borrowck to ensure a lack of such misusage (aliasing int his case). Indeed, the current pattern cannot prevent closure from being used in between split_closure and the usage of the pointer.

Rust made the choice to have raw pointers be safe to create, and unsafe to use. Really, if a function returns a raw pointer or an unsafe fn (funnily enough your function does both), I think the documentation has to explain when it is not unsound to use them (otherwise one must assume they unusable).

You raise a good point here. It's the responsibility of whatever uses these raw pointers to make sure they aren't aliased, so it's perfectly safe for split_closure() to return a raw pointer.

Incidentally, I've started using the following pattern to side-step the problem of misuse triggering aliasing.

fn get_trampoline<F>(_closure: &F) -> pcmcat_api_callback_t
where
    F: Fn(CallbackData<'_>) + Sync,
{
    unsafe extern "C" fn trampoline<T>(
        user_data: *mut c_void,
        data: *mut pcmcat_api_callback_data_t,
    ) where
        T: Fn(CallbackData<'_>),
    {
        debug_assert!(!user_data.is_null());
        debug_assert!(!data.is_null());
        let data = CallbackData::from(&*data);

        log::debug!("Received an event, {:?}", data);

        let closure = &mut *(user_data as *mut T);
        closure(data);
    }

    trampoline::<F>
}

And it's used like this:

// first we move the closure to the heap (Box<C>)
let mut on_event = Box::new(callback);
// then we get a function that can be used to call the callback
let cb = get_trampoline(&*on_event);
// and pass that raw pointer and callback to the open() call
let ret = pc_mcat_sys::pcmcat_api_open(
    cb,
    &mut *on_event as *mut _ as *mut c_void,
);

The gist of it is that we take an &F to get the closure's type, then just return a trampoline instantiated for the closure. The &mut F is only converted to a *mut c_void when we make the call.

I feel like this pattern would be better than the original split_closure() because it's up to the caller to create the *mut c_void pointer.

That ErasedMutRef<'lifetime> is also quite an elegant approach. It would make use the borrow checker to ensure the closure pointer isn't used between split_closure() and the final call site, and borrow checker errors are somewhat reassuring to have when you're playing around with unsafe... It also avoids needing to type out the double pointer cast, which is annoying and often forces a function call to go over multiple lines 😛

type Trampoline;

fn trampoline() -> Self::Trampoline;
const TRAMPOLINE: Self::Trampoline;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using an associated const instead of a function to get the trampoline. It makes it obvious that determining the corresponding trampoline function is something that can be done at compile time 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants