-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
split_closure:
Improve unsafe
-correctness + minor improvements
#3
base: master
Are you sure you want to change the base?
split_closure:
Improve unsafe
-correctness + minor improvements
#3
Conversation
`split_closure` does not need to be marked `unsafe`: one cannot cause unsoundness with the obtained raw pointer and `unsafe fn` function pointer, unless further `unsafe` is used. Hence removing the `unsafe`-ness (to call) of `split_closure`. That being said, the pattern `unsafe { unsafe_fn_ptr(void_ptr, <args...) }` is, on the other hand, expected to be sound, at least if done within the lifetime of the borrow over `C`. This is "problematic" because `Split` itself is not marked `unsafe`, so once could go and write a troll / evil implementation of `Split`, with which the aformentioned pattenr would be unsound. The obvious solution would be to have `Split` become an `unsafe trait`, but after thinking about it, I didn't like that approach: given the lack of variadic generics, it is almost impossible to explicitly name / describe within the type system that the `Trampoline` is expected to be some `unsafe fn` pointer, which ought to be extracted out of an `FnMut` closure, and which must be sound to call with an unaliased raw pointer to that closure. Because of that, the other approach is to simply `Seal` the trait within the privacy boundary, so that all the logic related to the trait is guaranteed to be self-contained within this one file. ___ Also, I've removed the unneeded `Ret` param of `Split`, and made obtaining the `trampoline` `const`.
/// | ||
/// - The call must be performed within the same thread, unless `C` (_i.e._, | ||
/// the environment captured by the closure) is `Send`. | ||
pub fn split_closure<'lifetime, C, Args>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back, I don't think we can actually do anything to make sure the *mut c_voidreturned by this function doesn't outlive the
&'lifetime mut Cpassed in. Is
'lifetime` just meant as a hint to the caller that they need to worry about lifetimes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, kind of. It's so that the documention that mentions 'lifetime
leads to no ambiguity. I guess a wrapper pointer-with-lifetime type here would make sense indeed.
/// | ||
/// - The call must be performed within the same thread, unless `C` (_i.e._, | ||
/// the environment captured by the closure) is `Send`. | ||
pub fn split_closure<'lifetime, C, Args>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one cannot cause unsoundness with the obtained raw pointer and unsafe fn function pointer, unless further unsafe is used
If a function has a # Safety
section stating several invariants that must be upheld, shouldn't it be unsafe
in turn, even if the function's body doesn't do anything unsafe
?
Another thing... I think we've actually created aliased mutable pointers here because the borrow checker will think the &'lifetime mut C
borrow is over by the time split_closure()
returns. There's nothing linking 'lifetime
and the *mut c_void
... Is that an unsafe
operation, or just plain UB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not UB, but misusage can cause UB: lifetimes are "just" a tool for borrowck
to ensure a lack of such misusage (aliasing int his case). Indeed, the current pattern cannot prevent closure
from being used in between split_closure
and the usage of the pointer.
let mut closure = |...| { ... };
let (env, cb) = split_closure(&mut closure);
// env ~ &mut closure, so it is sound to use, but `closure` must not be touche in the meantime
unsafe { cb(env, ...); } // OK
stuff(&closure); // `&closure` aliases `env`, so `env` is no longer unique and gets invalidated;
unsafe { cb(env, ...); } // UB
A way to "soothe" this would be through a pointer-with-lifetime wrapper:
mod boundary {
use ::core::{marker::PhantomData, ptr};
/// Imagine this is an UB-free `&'lifetime mut c_void`.
pub
struct ErasedMutRef<'lifetime> {
ptr: ptr::NonNull<c_void>,
_lifetime: PhantomData<&'lifetime mut ()>,
}
impl ErasedMutRef<'_> {
pub
fn as_ptr (self: &'_ Self) -> *mut c_void
{
self.ptr.as_ptr()
}
}
impl<'lifetime, T> From<&'lifetime mut T> for ErasedMutRef<'lifetime> {
{
fn from (it: &'lifetime mut T) -> ErasedMutRef<'lifetime>
{
ErasedMutRef { ptr: ptr::NonNull::from(it).cast(), _lifetime: PhantomData }
}
}
}
pub use boundary::ErasedMutRef;
And then make split_closure<'lifetime, C>
return
(ErasedMutRef<'lifetime>, unsafe extern fn ...)
,
to lead to the following pattern:
let mut closure = |...| { ... };
let (env, cb) = split_closure(&mut closure); // <- borrow starts here...
// env ~ &mut closure, so it is sound to use, but `closure` must not be touche in the meantime
unsafe { cb(env.as_ptr(), ...); } // OK
stuff(&closure); // Error, `closure` accessed while borrowed.
unsafe { cb(env.as_ptr(), ...); } // <- ... borrow later used here
If a function has a # Safety section stating several invariants that must be upheld, shouldn't it be unsafe in turn, even if the function's body doesn't do anything unsafe?
🤷
Rust made the choice to have raw pointers be safe to create, and unsafe
to use. Really, if a function returns a raw pointer or an unsafe fn
(funnily enough your function does both), I think the documentation has to explain when it is not unsound to use them (otherwise one must assume they unusable). I have named such explanation # Safety
, but feel free to rename it, e.g., # Sound usage of the returned values / pointers
, especially if it can make all this easier to understand by a reader of the documentation.
The other option, is to mark it unsafe
, in the same way that raw pointers are !Sync
and !Send
, as a lint (with some comment saying that this function is not unsafe
per se, but the usage of the obtained poitners is, and the correctness of such usage is linked to this call site (c.f., exampel above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not UB, but misusage can cause UB: lifetimes are "just" a tool for borrowck to ensure a lack of such misusage (aliasing int his case). Indeed, the current pattern cannot prevent closure from being used in between split_closure and the usage of the pointer.
Rust made the choice to have raw pointers be safe to create, and unsafe to use. Really, if a function returns a raw pointer or an unsafe fn (funnily enough your function does both), I think the documentation has to explain when it is not unsound to use them (otherwise one must assume they unusable).
You raise a good point here. It's the responsibility of whatever uses these raw pointers to make sure they aren't aliased, so it's perfectly safe for split_closure()
to return a raw pointer.
Incidentally, I've started using the following pattern to side-step the problem of misuse triggering aliasing.
fn get_trampoline<F>(_closure: &F) -> pcmcat_api_callback_t
where
F: Fn(CallbackData<'_>) + Sync,
{
unsafe extern "C" fn trampoline<T>(
user_data: *mut c_void,
data: *mut pcmcat_api_callback_data_t,
) where
T: Fn(CallbackData<'_>),
{
debug_assert!(!user_data.is_null());
debug_assert!(!data.is_null());
let data = CallbackData::from(&*data);
log::debug!("Received an event, {:?}", data);
let closure = &mut *(user_data as *mut T);
closure(data);
}
trampoline::<F>
}
And it's used like this:
// first we move the closure to the heap (Box<C>)
let mut on_event = Box::new(callback);
// then we get a function that can be used to call the callback
let cb = get_trampoline(&*on_event);
// and pass that raw pointer and callback to the open() call
let ret = pc_mcat_sys::pcmcat_api_open(
cb,
&mut *on_event as *mut _ as *mut c_void,
);
The gist of it is that we take an &F
to get the closure's type, then just return a trampoline instantiated for the closure. The &mut F
is only converted to a *mut c_void
when we make the call.
I feel like this pattern would be better than the original split_closure()
because it's up to the caller to create the *mut c_void
pointer.
That ErasedMutRef<'lifetime>
is also quite an elegant approach. It would make use the borrow checker to ensure the closure pointer isn't used between split_closure()
and the final call site, and borrow checker errors are somewhat reassuring to have when you're playing around with unsafe
... It also avoids needing to type out the double pointer cast, which is annoying and often forces a function call to go over multiple lines 😛
type Trampoline; | ||
|
||
fn trampoline() -> Self::Trampoline; | ||
const TRAMPOLINE: Self::Trampoline; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of using an associated const instead of a function to get the trampoline. It makes it obvious that determining the corresponding trampoline function is something that can be done at compile time 👍
split_closure
does not need to be markedunsafe
: one cannot cause unsoundness with the obtained raw pointer andunsafe fn
function pointer, unless furtherunsafe
is used.Hence removing the
unsafe
-ness (to call) ofsplit_closure
.That being said, the pattern
unsafe { unsafe_fn_ptr(void_ptr, <args...) }
is, on the other hand, expected to be sound, at least if done within the lifetime of the borrow overC
.This is "problematic" because
Split
itself is not markedunsafe
, so once could go and write a troll / evil implementation ofSplit
, with which the aformentioned pattenr would be unsound.The obvious solution would be to have
Split
become anunsafe trait
, but after thinking about it, I didn't like that approach: given the lack of variadic generics, it is almost impossible to explicitly name / describe within the type system that theTrampoline
is expected to be someunsafe fn
pointer, which ought to be extracted out of anFnMut
closure, and which must be sound to call with an unaliased raw pointer to that closure. Because of that, the other approach is to simplySeal
the trait within the privacy boundary, so that all the logic related to the trait is guaranteed to be self-contained within this one file.Also, I've removed the unneeded
Ret
param ofSplit
, and made obtaining thetrampoline
const
.