Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split_closure: Improve unsafe-correctness + minor improvements #3

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 27 additions & 11 deletions src/split.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,37 +38,53 @@ use std::ffi::c_void;
///
/// # Safety
///
/// The returned function can only be called with the returned pointer, or a
/// pointer to another `C` closure.
pub unsafe fn split_closure<C, Args, Ret>(
closure: &mut C,
/// - The returned function can only be called with the returned pointer, or a
/// pointer to another `C` closure.
///
/// - Such call is only valid within the `'lifetime` of the borrow over the
/// `closure`, during which such pointer cannot be aliased (_e.g._, no
/// concurrent calls to the closure).
///
/// - The call must be performed within the same thread, unless `C` (_i.e._,
/// the environment captured by the closure) is `Send`.
pub fn split_closure<'lifetime, C, Args>(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back, I don't think we can actually do anything to make sure the *mut c_voidreturned by this function doesn't outlive the&'lifetime mut Cpassed in. Is'lifetime` just meant as a hint to the caller that they need to worry about lifetimes?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, kind of. It's so that the documention that mentions 'lifetime leads to no ambiguity. I guess a wrapper pointer-with-lifetime type here would make sense indeed.

Copy link
Owner

@Michael-F-Bryan Michael-F-Bryan May 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one cannot cause unsoundness with the obtained raw pointer and unsafe fn function pointer, unless further unsafe is used

If a function has a # Safety section stating several invariants that must be upheld, shouldn't it be unsafe in turn, even if the function's body doesn't do anything unsafe?

Another thing... I think we've actually created aliased mutable pointers here because the borrow checker will think the &'lifetime mut C borrow is over by the time split_closure() returns. There's nothing linking 'lifetime and the *mut c_void... Is that an unsafe operation, or just plain UB?

Copy link
Author

@danielhenrymantilla danielhenrymantilla May 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not UB, but misusage can cause UB: lifetimes are "just" a tool for borrowck to ensure a lack of such misusage (aliasing int his case). Indeed, the current pattern cannot prevent closure from being used in between split_closure and the usage of the pointer.

let mut closure = |...| { ... };
let (env, cb) = split_closure(&mut closure);
// env ~ &mut closure, so it is sound to use, but `closure` must not be touche in the meantime
unsafe { cb(env, ...); } // OK
stuff(&closure); // `&closure` aliases `env`, so `env` is no longer unique and gets invalidated;
unsafe { cb(env, ...); } // UB

A way to "soothe" this would be through a pointer-with-lifetime wrapper:

mod boundary {
    use ::core::{marker::PhantomData, ptr};

    /// Imagine this is an UB-free `&'lifetime mut c_void`.
    pub
    struct ErasedMutRef<'lifetime> {
        ptr: ptr::NonNull<c_void>,
        _lifetime: PhantomData<&'lifetime mut ()>,
    }

    impl ErasedMutRef<'_> {
        pub
        fn as_ptr (self: &'_ Self) -> *mut c_void
        {
            self.ptr.as_ptr()
        }
    }

    impl<'lifetime, T> From<&'lifetime mut T> for ErasedMutRef<'lifetime> {
    {
        fn from (it: &'lifetime mut T) -> ErasedMutRef<'lifetime>
        {
            ErasedMutRef { ptr: ptr::NonNull::from(it).cast(), _lifetime: PhantomData }
        }
    }
}
pub use boundary::ErasedMutRef;

And then make split_closure<'lifetime, C> return
(ErasedMutRef<'lifetime>, unsafe extern fn ...),
to lead to the following pattern:

let mut closure = |...| { ... };
let (env, cb) = split_closure(&mut closure); // <- borrow starts here...
// env ~ &mut closure, so it is sound to use, but `closure` must not be touche in the meantime
unsafe { cb(env.as_ptr(), ...); } // OK
stuff(&closure); // Error, `closure` accessed while borrowed.
unsafe { cb(env.as_ptr(), ...); } // <- ... borrow later used here

If a function has a # Safety section stating several invariants that must be upheld, shouldn't it be unsafe in turn, even if the function's body doesn't do anything unsafe?

🤷
Rust made the choice to have raw pointers be safe to create, and unsafe to use. Really, if a function returns a raw pointer or an unsafe fn (funnily enough your function does both), I think the documentation has to explain when it is not unsound to use them (otherwise one must assume they unusable). I have named such explanation # Safety, but feel free to rename it, e.g., # Sound usage of the returned values / pointers, especially if it can make all this easier to understand by a reader of the documentation.

The other option, is to mark it unsafe, in the same way that raw pointers are !Sync and !Send, as a lint (with some comment saying that this function is not unsafe per se, but the usage of the obtained poitners is, and the correctness of such usage is linked to this call site (c.f., exampel above)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not UB, but misusage can cause UB: lifetimes are "just" a tool for borrowck to ensure a lack of such misusage (aliasing int his case). Indeed, the current pattern cannot prevent closure from being used in between split_closure and the usage of the pointer.

Rust made the choice to have raw pointers be safe to create, and unsafe to use. Really, if a function returns a raw pointer or an unsafe fn (funnily enough your function does both), I think the documentation has to explain when it is not unsound to use them (otherwise one must assume they unusable).

You raise a good point here. It's the responsibility of whatever uses these raw pointers to make sure they aren't aliased, so it's perfectly safe for split_closure() to return a raw pointer.

Incidentally, I've started using the following pattern to side-step the problem of misuse triggering aliasing.

fn get_trampoline<F>(_closure: &F) -> pcmcat_api_callback_t
where
    F: Fn(CallbackData<'_>) + Sync,
{
    unsafe extern "C" fn trampoline<T>(
        user_data: *mut c_void,
        data: *mut pcmcat_api_callback_data_t,
    ) where
        T: Fn(CallbackData<'_>),
    {
        debug_assert!(!user_data.is_null());
        debug_assert!(!data.is_null());
        let data = CallbackData::from(&*data);

        log::debug!("Received an event, {:?}", data);

        let closure = &mut *(user_data as *mut T);
        closure(data);
    }

    trampoline::<F>
}

And it's used like this:

// first we move the closure to the heap (Box<C>)
let mut on_event = Box::new(callback);
// then we get a function that can be used to call the callback
let cb = get_trampoline(&*on_event);
// and pass that raw pointer and callback to the open() call
let ret = pc_mcat_sys::pcmcat_api_open(
    cb,
    &mut *on_event as *mut _ as *mut c_void,
);

The gist of it is that we take an &F to get the closure's type, then just return a trampoline instantiated for the closure. The &mut F is only converted to a *mut c_void when we make the call.

I feel like this pattern would be better than the original split_closure() because it's up to the caller to create the *mut c_void pointer.

That ErasedMutRef<'lifetime> is also quite an elegant approach. It would make use the borrow checker to ensure the closure pointer isn't used between split_closure() and the final call site, and borrow checker errors are somewhat reassuring to have when you're playing around with unsafe... It also avoids needing to type out the double pointer cast, which is annoying and often forces a function call to go over multiple lines 😛

closure: &'lifetime mut C,
) -> (*mut c_void, C::Trampoline)
where
C: Split<Args, Ret>,
C: Split<Args>,
{
(closure as *mut C as *mut c_void, C::trampoline())
(closure as *mut C as *mut c_void, C::TRAMPOLINE)
}

use private::Sealed;
mod private {
pub trait Sealed<Args> {}
}

/// A helper trait used by [`split_closure()`] to get a trampoline function
/// which will invoke the closure.
///
/// This trait is automatically implemented for any `FnMut()` callable, you
/// shouldn't implement it yourself.
pub trait Split<Args, Ret> {
pub trait Split<Args> : Sealed<Args> {
type Trampoline;

fn trampoline() -> Self::Trampoline;
const TRAMPOLINE: Self::Trampoline;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using an associated const instead of a function to get the trampoline. It makes it obvious that determining the corresponding trampoline function is something that can be done at compile time 👍

}

macro_rules! impl_split {
($( $outer:ident ),* ; $( $inner:ident ),*) => {
impl<Func, Ret, $($outer),*> Split<($( $outer, )*), Ret> for Func
impl<Func, Ret, $($outer),*> Sealed<($( $outer, )*)> for Func
where
Func: FnMut($($outer),*) -> Ret,
{}
impl<Func, Ret, $($outer),*> Split<($( $outer, )*)> for Func
where
Func: FnMut($($outer),*) -> Ret,
{
type Trampoline = unsafe extern "C" fn(*mut c_void, $($outer),*) -> Ret;

fn trampoline() -> Self::Trampoline {
const TRAMPOLINE: Self::Trampoline = {
// declare a trampoline function which will turn our pointer
// back into an `F` and invoke it

Expand All @@ -87,7 +103,7 @@ macro_rules! impl_split {
}

trampoline::<Func, Ret, $($outer,)*>
}
};
}
};
}
Expand Down