POC for step 1 of mozilla#419 - replace Handlemaps with Arcs.
The context for this is Ryan's [proposal for passing around object instances](mozilla#419).

By implementing `ViaFfi` for `Arc<T>` we get a huge amount of functionality
for free - `Arc` wrappers around objects can suddenly appear in dictionaries
and function params, and be returned from functions, once we remove the explicit
code that disallows it.

I kept support for both threadsafe and non-threadsafe interfaces, primarily
to avoid touching all the examples. This turned out to be quite easy in
general, because support for `Arc<T>` gives us support for `Arc<Mutex<T>>`
for free. The mutex semantics for using these as params etc. are almost certainly
going to do the wrong thing - e.g., acquiring a mutex when the object is passed
as a param is going to end in tears at some stage.

It's also incomplete (needs docs, Kotlin tests for the new test cases,
Swift hasn't been touched at all, etc.)
mhammond committed May 17, 2021
1 parent 9aee208 commit f43e4b8
Showing 12 changed files with 286 additions and 114 deletions.
145 changes: 89 additions & 56 deletions docs/manual/src/internals/object_references.md
@@ -1,7 +1,7 @@
# Managing Object References

Uniffi [interfaces](../udl/interfaces.md) represent instances of objects
that have methods and contain shared mutable state. One of Rust's core innovations
that have methods and contain state. One of Rust's core innovations
is its ability to provide compile-time guarantees about working with such instances,
including:

@@ -10,21 +10,33 @@ including:
active at any point in the program.
* Guarding against data races.

Uniffi aims to maintain these guarantees even when the Rust code is being invoked
from a foreign language, at the cost of turning them into run-time checks rather
than compile-time guarantees.
The very nature of the problems Uniffi tries to solve is that calls may come
from foreign languages on any thread. Uniffi tries to take a hands-off
approach as much as possible and lets the Rust compiler ensure these
guarantees are met. In practice this means all instances exposed by uniffi
must be, in Rust terminology, `Send+Sync`, and methods taking `&mut self`
typically can't be supported.

## Handle Maps
Typically this will mean your implementation uses some data structures
explicitly designed for this purpose, such as a `Mutex` or `RwLock`, but this
detail is completely up to you. As much as possible, uniffi tries to stay out
of your way, so the Rust compiler is the ultimate arbiter.
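
As a rough illustration of what this asks of an implementation, here is a minimal sketch of a struct that keeps its state behind a `Mutex` so every method can take `&self`. The `TodoList` fields and the `add_from_threads` helper are assumptions made for this example, not code from this commit.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a struct exposed through an interface.
pub struct TodoList {
    // Interior mutability: the lock, not `&mut self`, guards mutation.
    items: Mutex<Vec<String>>,
}

impl TodoList {
    pub fn new() -> Self {
        TodoList {
            items: Mutex::new(Vec::new()),
        }
    }

    // `&self` rather than `&mut self`, so the method can be called
    // through a shared `Arc<TodoList>` from any thread.
    pub fn add_item(&self, todo: String) {
        self.items.lock().unwrap().push(todo);
    }

    pub fn len(&self) -> usize {
        self.items.lock().unwrap().len()
    }
}

/// Adds one item from each of `threads` threads and returns the final count.
pub fn add_from_threads(threads: usize) -> usize {
    let list = Arc::new(TodoList::new());
    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let list = Arc::clone(&list);
            // `thread::spawn` only compiles because `TodoList` is `Send + Sync`.
            thread::spawn(move || list.add_item(format!("item {}", i)))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    list.len()
}
```

An `RwLock` would work the same way; the choice of lock is left to the implementation.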

We achieve this by indirecting all object access through a
[handle map](https://docs.rs/ffi-support/0.4.0/ffi_support/handle_map/index.html),
a mapping from opaque integer handles to object instances. This indirection
imposes a small runtime cost but helps us guard against errors or oversights
in the generated bindings.
## Arcs

For each interface declared in the UDL, the uniffi-generated Rust scaffolding
will create a global handlemap that is responsible for owning all instances
of that interface, and handing out references to them when methods are called.
In order to allow for instances to be used as flexibly as possible, uniffi
works with `Arc`s holding a pointer to your instances and leverages their
reference-count based lifetimes, allowing uniffi to largely stay out
of handling lifetimes for these objects.

However, this does come at a cost - when you want to return instances from
your dictionaries or methods, you must return an `Arc<>` directly. When
accepting instances as arguments, you can choose to accept it as an `Arc<>` or
as the underlying struct - there are different use-cases for each scenario.

The exception to the above is constructors - these are expected to just provide
the instance and uniffi will wrap it in the `Arc<>`.
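
A minimal sketch of what these rules might look like in hand-written Rust (this is not generated scaffolding). `TodoStore`, `get_default`, `set_default` and `count_items` are hypothetical names invented for the example, and taking the underlying struct as `&TodoList` is an assumption about how "the underlying struct" is accepted.

```rust
use std::sync::{Arc, Mutex};

pub struct TodoList {
    items: Mutex<Vec<String>>,
}

impl TodoList {
    // Constructors return the plain instance; per the text above, uniffi
    // is expected to wrap it in an `Arc<>`.
    pub fn new() -> Self {
        TodoList {
            items: Mutex::new(Vec::new()),
        }
    }

    pub fn add_item(&self, todo: &str) {
        self.items.lock().unwrap().push(todo.to_string());
    }

    pub fn item_count(&self) -> usize {
        self.items.lock().unwrap().len()
    }
}

/// Hypothetical second interface that hands out and accepts `TodoList`s.
pub struct TodoStore {
    default_list: Mutex<Arc<TodoList>>,
}

impl TodoStore {
    pub fn new() -> Self {
        TodoStore {
            default_list: Mutex::new(Arc::new(TodoList::new())),
        }
    }

    // Returning an instance: the return type is an `Arc<>`.
    pub fn get_default(&self) -> Arc<TodoList> {
        let guard = self.default_list.lock().unwrap();
        Arc::clone(&*guard)
    }

    // Accepting an `Arc<>`: useful when the callee wants to keep the instance.
    pub fn set_default(&self, list: Arc<TodoList>) {
        *self.default_list.lock().unwrap() = list;
    }

    // Accepting the underlying struct: enough when it is only used during the call.
    pub fn count_items(&self, list: &TodoList) -> usize {
        list.item_count()
    }
}

/// Exercises both argument styles; returns the item count and whether the
/// stored default is the same allocation as the list passed in.
pub fn arc_arguments_demo() -> (usize, bool) {
    let store = TodoStore::new();
    let other = Arc::new(TodoList::new());
    other.add_item("write docs");
    let counted = store.count_items(&other);
    store.set_default(Arc::clone(&other));
    let same = Arc::ptr_eq(&store.get_default(), &other);
    (counted, same)
}
```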

For example, given an interface definition like this:

@@ -36,70 +48,91 @@ interface TodoList {
};
```

The Rust scaffolding would define a lazyily-initialized global static like:
On the Rust side of the generated bindings, the instance constructor will create an instance of the
corresponding `TodoList` Rust struct, wrap it in an `Arc<>` and return a raw
pointer to the foreign language code:

```rust
lazy_static! {
static ref UNIFFI_HANDLE_MAP_TODOLIST: ConcurrentHandleMap<TodoList> = ConcurrentHandleMap::new();
pub extern "C" fn todolist_12ba_TodoList_new(
err: &mut uniffi::deps::ffi_support::ExternError,
) -> *const std::os::raw::c_void /* *const TodoList */ {
uniffi::deps::ffi_support::call_with_output(err, || {
let _new = TodoList::new();
let _arc = std::sync::Arc::new(_new);
uniffi::UniffiVoidPtr(<std::sync::Arc<TodoList> as uniffi::ViaFfi>::lower(_arc))
})
}
```

On the Rust side of the generated bindings, the instance constructor will create an instance of the
corresponding `TodoList` Rust struct, insert it into the handlemap, and return the resulting integer
handle to the foreign language code:
and the uniffi runtime defines:

```rust
pub extern "C" fn todolist_TodoList_new(err: &mut ExternError) -> u64 {
// Give ownership of the new instance to the handlemap.
// We will only ever operate on borrowed references to it.
UNIFFI_HANDLE_MAP_TODOLIST.insert_with_output(err, || TodoList::new())
unsafe impl<T: Sync + Send> ViaFfi for std::sync::Arc<T> {
type FfiType = *const std::os::raw::c_void;
fn lower(self) -> Self::FfiType {
std::sync::Arc::into_raw(self) as Self::FfiType
}
}
```

When invoking a method on the instance, the foreign-language code passes the integer handle back
to the Rust code, which borrows a mutable reference to the instance from the handlemap for the duration
of the method call:
which does the "arc to pointer" dance for us. Note that this has "leaked" the
`Arc<>` reference - if we never see that pointer again, our instance will leak.

When invoking a method on the instance, the foreign-language code passes the
raw pointer back to the Rust code, which turns it back into a cloned `Arc<>` which
lives for the duration of the method call:

```rust
pub extern "C" fn todolist_TodoList_add_item(handle: u64, todo: RustBuffer, err: &mut ExternError) -> () {
let todo = <String as uniffi::ViaFfi>::try_lift(todo).unwrap()
// Borrow a reference to the instance so that we can call a method on it.
UNIFFI_HANDLE_MAP_TODOLIST.call_with_result_mut(err, handle, |obj| -> Result<(), TodoError> {
TodoList::add_item(obj, todo)
pub extern "C" fn todolist_12ba_TodoList_add_item(
ptr: *const std::os::raw::c_void,
todo: uniffi::RustBuffer,
err: &mut uniffi::deps::ffi_support::ExternError,
) -> () {
uniffi::deps::ffi_support::call_with_result(err, || -> Result<_, TodoError> {
let _obj = <std::sync::Arc<TodoList> as uniffi::ViaFfi>::try_lift(ptr).unwrap();
let _retval =
TodoList::add_item(&_obj, <String as uniffi::ViaFfi>::try_lift(todo).unwrap())?;
Ok(_retval)
})
}
```

Finally, when the foreign-language code frees the instance, it passes the integer handle to
a special destructor function so that the Rust code can delete it from the handlemap:
where the uniffi runtime defines:

```rust
pub extern "C" fn ffi_todolist_TodoList_object_free(handle: u64) {
UNIFFI_HANDLE_MAP_TODOLIST.delete_u64(handle);
}
unsafe impl<T: Sync + Send> ViaFfi for std::sync::Arc<T> {
type FfiType = *const std::os::raw::c_void;
fn try_lift(v: Self::FfiType) -> Result<Self> {
let v = v as *const T;
// We mustn't drop the `Arc<T>` that is owned by the foreign-language code.
let foreign_arc = std::mem::ManuallyDrop::new(unsafe { Self::from_raw(v) });
// Take a clone for our own use.
Ok(std::sync::Arc::clone(&*foreign_arc))
}
}
```

This indirection gives us some important safety properties:
Notice that we take care to ensure the reference added by the constructor
remains alive. Finally, when the foreign-language code frees the instance, it
passes the raw pointer to a special destructor function so that the Rust code can
drop that initial reference (and if that happens to be the final reference,
the Rust object will be dropped).

* If the generated bindings incorrectly pass an invalid handle, or a handle for a different type of object,
then the handlemap will throw an error with high probability, providing some amount of run-time typechecking
for correctness of the generated bindings.
* The handlemap can ensure we uphold Rust's requirements around unique mutable references and threadsafey,
using a combination of compile-time checks and runtime locking depending on the details of the underlying
Rust struct that implements the interface.
```rust
pub extern "C" fn ffi_todolist_12ba_TodoList_object_free(ptr: *const std::os::raw::c_void) {
if let Err(e) = std::panic::catch_unwind(|| {
assert!(!ptr.is_null());
unsafe { std::sync::Arc::from_raw(ptr as *const TodoList) };
}) {
uniffi::deps::log::error!("ffi_todolist_12ba_TodoList_object_free panicked: {:?}", e);
}
}
```
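
To make the reference counting in the three steps above easier to follow, here is a standalone sketch that mimics them with free functions and observes the strong count through a `Weak`. The `lower`, `lift` and `free` helpers and the `TodoList` field are simplified stand-ins written for this example; they are not the generated scaffolding or the `ViaFfi` impl itself.

```rust
use std::mem::ManuallyDrop;
use std::os::raw::c_void;
use std::sync::Arc;

pub struct TodoList {
    pub items: Vec<String>,
}

/// Constructor step: hand ownership of one reference to the foreign side.
fn lower(arc: Arc<TodoList>) -> *const c_void {
    Arc::into_raw(arc) as *const c_void
}

/// Method-call step: borrow the foreign-owned reference and clone it.
/// Safety: `ptr` must come from `lower` and must not have been freed.
unsafe fn lift(ptr: *const c_void) -> Arc<TodoList> {
    // `ManuallyDrop` keeps the foreign-owned reference alive when this
    // temporary `Arc` goes out of scope.
    let foreign = ManuallyDrop::new(unsafe { Arc::from_raw(ptr as *const TodoList) });
    Arc::clone(&*foreign)
}

/// Destructor step: reclaim and drop the reference created by `lower`.
/// Safety: `ptr` must come from `lower` and be freed at most once.
unsafe fn free(ptr: *const c_void) {
    drop(unsafe { Arc::from_raw(ptr as *const TodoList) });
}

/// Returns the strong count after construction, during a call,
/// after the call, and after the free.
pub fn lifecycle_counts() -> [usize; 4] {
    let arc = Arc::new(TodoList { items: Vec::new() });
    // A `Weak` lets us watch the count without holding a strong reference.
    let witness = Arc::downgrade(&arc);

    let ptr = lower(arc);
    let after_new = witness.strong_count();

    let obj = unsafe { lift(ptr) };
    let during_call = witness.strong_count();
    drop(obj);
    let after_call = witness.strong_count();

    unsafe { free(ptr) };
    let after_free = witness.strong_count();

    [after_new, during_call, after_call, after_free]
}
```

Under these assumptions the counts come out as `[1, 2, 1, 0]`: the foreign side holds one reference for the whole lifetime, each call briefly adds a second, and the free drops the last one.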

## Managing Concurrency

By default, uniffi uses the [ffi_support::ConcurrentHandleMap](https://docs.rs/ffi-support/0.4.0/ffi_support/handle_map/struct.ConcurrentHandleMap.html) struct as the handlemap for each declared instance. This class
wraps each instance with a `Mutex`, which serializes access to the instance and upholds Rust's guarantees
against shared mutable access. This approach is simple and safe, but it means that all method calls
on an instance are run in a strictly sequential fashion, limiting concurrency.

For instances that are explicited tagged with the `[Threadsafe]` attribute, uniffi instead uses
a custom `ArcHandleMap` struct. This replaces the run-time `Mutex` with compile-time assertions
about the safety of the underlying Rust struct. Specifically:

* The `ArcHandleMap` will never give out a mutable reference to an instance, forcing the
underlying struct to use interior mutability and manage its own locking.
* The `ArcHandleMap` can only contain structs that are `Sync` and `Send`, ensuring that
shared references can safely be accessed from multiple threads.
You might be noticing a distinct lack of concurrency management, and this is
by design - it means that concurrency management is the responsibility of the
Rust implementations. The `T` in an `Arc<T>` is supplied by the Rust code
being wrapped and the Rust compiler will complain if that isn't `Send+Sync`.
This means that uniffi can take a hands-off approach, letting the Rust compiler
guide the component author.
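
A small sketch of how that compiler guidance can show up in practice. `require_send_sync`, `SafeCounter` and `UnsafeCounter` are hypothetical names for this example; the helper's bound simply mirrors the `T: Sync + Send` bound on the `ViaFfi` impl shown earlier.

```rust
use std::cell::RefCell;
use std::sync::{Arc, RwLock};

/// Compiles only when `T` is `Send + Sync`, like the bound on the `Arc<T>` impl.
fn require_send_sync<T: Send + Sync>(value: Arc<T>) -> Arc<T> {
    value
}

/// State behind an `RwLock` is `Send + Sync`, so this type is accepted.
pub struct SafeCounter {
    value: RwLock<u64>,
}

impl SafeCounter {
    pub fn increment(&self) -> u64 {
        let mut guard = self.value.write().unwrap();
        *guard += 1;
        *guard
    }
}

/// `RefCell` is not `Sync`, so neither is this type.
#[allow(dead_code)]
pub struct UnsafeCounter {
    value: RefCell<u64>,
}

pub fn send_sync_demo() -> u64 {
    let counter = require_send_sync(Arc::new(SafeCounter {
        value: RwLock::new(0),
    }));
    counter.increment();
    counter.increment()

    // Uncommenting the next line is rejected at compile time, because
    // `UnsafeCounter` contains a `RefCell` and is therefore not `Sync`:
    // require_send_sync(Arc::new(UnsafeCounter { value: RefCell::new(0) }));
}
```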
69 changes: 68 additions & 1 deletion uniffi/src/lib.rs
@@ -45,7 +45,6 @@ pub mod deps {
pub use anyhow;
pub use bytes;
pub use ffi_support;
pub use lazy_static;
pub use log;
pub use static_assertions;
}
@@ -476,6 +475,74 @@ unsafe impl<V: ViaFfi> ViaFfi for HashMap<String, V> {
}
}

/// Support for passing reference-counted shared objects via the FFI.
///
/// To avoid dealing with complex lifetime semantics over the FFI, any data passed
/// by reference must be encapsulated in an `Arc`, and must be safe to share
/// across threads.
unsafe impl<T: Sync + Send> ViaFfi for std::sync::Arc<T> {
// Don't use a pointer to <T> as that requires a `pub <T>`
type FfiType = *const std::os::raw::c_void;

/// When lowering, we have an owned `Arc<T>` and we transfer that ownership
/// to the foreign-language code, "leaking" it out of Rust's ownership system
/// as a raw pointer. The foreign-language code is responsible for freeing
/// this via TODO calling a special destructor for `T`.
///
/// (This works safely because we have unique ownership of `self`).
fn lower(self) -> Self::FfiType {
std::sync::Arc::into_raw(self) as Self::FfiType
}

/// When lifting, we receive a "borrow" of the `Arc<T>` that is owned by
/// the foreign-language code, and make a clone of it for our own use.
///
/// Safety: the provided value must be a pointer previously obtained by calling
/// the `lower()` method of this impl.
fn try_lift(v: Self::FfiType) -> Result<Self> {
let v = v as *const T;
// We mustn't drop the `Arc<T>` that is owned by the foreign-language code.
let foreign_arc = std::mem::ManuallyDrop::new(unsafe { Self::from_raw(v) });
// Take a clone for our own use.
Ok(std::sync::Arc::clone(&*foreign_arc))
}

/// When writing as a field of a complex structure, make a clone and transfer ownership
/// of it to the foreign-language code by writing its pointer into the buffer.
fn write<B: BufMut>(&self, buf: &mut B) {
let ptr = std::sync::Arc::clone(self).lower();
buf.put_u64(ptr as u64); // TODO: assertions about pointer size
}

/// When reading as a field of a complex structure, we receive a "borrow" of the `Arc<T>`
/// that is owned by the foreign-language code, and make a clone for our own use.
///
/// Safety: the buffer must contain a pointer previously obtained by calling
/// the `lower()` method of this impl.
fn try_read<B: Buf>(buf: &mut B) -> Result<Self> {
check_remaining(buf, 8)?;
Self::try_lift(buf.get_u64() as Self::FfiType)
}
}

// This type exists only because `IntoFfi` is not implemented for
// `*const std::os::raw::c_void`. It seems reasonable that it should be, so
// if we update ffi_support to do that, this can die entirely.
pub struct UniffiVoidPtr(pub *const std::os::raw::c_void);

use ffi_support::IntoFfi;
unsafe impl IntoFfi for UniffiVoidPtr {
type Value = *const std::os::raw::c_void;
#[inline]
fn ffi_default() -> Self::Value {
std::ptr::null_mut()
}
#[inline]
fn into_ffi_value(self) -> Self::Value {
self.0
}
}

#[cfg(test)]
mod test {
#[test]
1 change: 1 addition & 0 deletions uniffi_bindgen/src/bindings/gecko_js/gen_gecko_js.rs
@@ -332,6 +332,7 @@ mod filters {
FFIType::Float32 => "float".into(),
FFIType::Float64 => "double".into(),
FFIType::RustCString => "const char*".into(),
FFIType::RustArcPtr => unimplemented!("object pointers are not implemented"),
FFIType::RustBuffer => context.ffi_rustbuffer_type(),
FFIType::RustError => context.ffi_rusterror_type(),
FFIType::ForeignBytes => context.ffi_foreignbytes_type(),
2 changes: 2 additions & 0 deletions uniffi_bindgen/src/bindings/kotlin/gen_kotlin.rs
@@ -112,6 +112,8 @@ mod filters {
FFIType::Float32 => "Float".to_string(),
FFIType::Float64 => "Double".to_string(),
FFIType::RustCString => "Pointer".to_string(),
// XXX - make this `Pointer` (but things like Helpers.kt need upgrading in non-obvious ways. )
FFIType::RustArcPtr => "Long".to_string(),
FFIType::RustBuffer => "RustBuffer.ByValue".to_string(),
FFIType::RustError => "RustError".to_string(),
FFIType::ForeignBytes => "ForeignBytes.ByValue".to_string(),
1 change: 1 addition & 0 deletions uniffi_bindgen/src/bindings/python/gen_python.rs
@@ -73,6 +73,7 @@ mod filters {
FFIType::Float32 => "ctypes.c_float".to_string(),
FFIType::Float64 => "ctypes.c_double".to_string(),
FFIType::RustCString => "ctypes.c_voidp".to_string(),
FFIType::RustArcPtr => "ctypes.c_voidp".to_string(),
FFIType::RustBuffer => "RustBuffer".to_string(),
FFIType::RustError => "ctypes.POINTER(RustError)".to_string(),
FFIType::ForeignBytes => "ForeignBytes".to_string(),
1 change: 1 addition & 0 deletions uniffi_bindgen/src/bindings/swift/gen_swift.rs
@@ -164,6 +164,7 @@ mod filters {
FFIType::Float32 => "float".into(),
FFIType::Float64 => "double".into(),
FFIType::RustCString => "const char*_Nonnull".into(),
FFIType::RustArcPtr => "UnsafeRawPointer".into(), // ??
FFIType::RustBuffer => "RustBuffer".into(),
FFIType::RustError => "NativeRustError".into(),
FFIType::ForeignBytes => "ForeignBytes".into(),
4 changes: 4 additions & 0 deletions uniffi_bindgen/src/interface/ffi.rs
@@ -35,6 +35,10 @@ pub enum FFIType {
/// If you've got one of these, you must call the appropriate rust function to free it.
/// This is currently only used for error messages, and may go away in future.
RustCString,
/// A `*const c_void` pointer to a rust-owned `Arc<T>`.
/// If you've got one of these, you must call the appropriate rust function to free it.
/// The templates will generate a unique `free` function for each T.
RustArcPtr,
/// A byte buffer allocated by rust, and owned by whoever currently holds it.
/// If you've got one of these, you must either call the appropriate rust function to free it
/// or pass it to someone that will.
18 changes: 11 additions & 7 deletions uniffi_bindgen/src/interface/object.rs
@@ -95,6 +95,10 @@ impl Object {
&self.name
}

pub fn type_(&self) -> Type {
Type::Object(self.name.clone())
}

pub fn constructors(&self) -> Vec<&Constructor> {
self.constructors.iter().collect()
}
@@ -127,8 +131,8 @@ impl Object {
pub fn derive_ffi_funcs(&mut self, ci_prefix: &str) -> Result<()> {
self.ffi_func_free.name = format!("ffi_{}_{}_object_free", ci_prefix, self.name);
self.ffi_func_free.arguments = vec![FFIArgument {
name: "handle".to_string(),
type_: FFIType::UInt64,
name: "ptr".to_string(),
type_: FFIType::RustArcPtr,
}];
self.ffi_func_free.return_type = None;
for cons in self.constructors.iter_mut() {
@@ -198,7 +202,7 @@ impl APIConverter<Object> for weedle::InterfaceDefinition<'_> {

// Represents a constructor for an object type.
//
// In the FFI, this will be a function that returns a handle for an instance
// In the FFI, this will be a function that returns a pointer to an instance
// of the corresponding object type.
#[derive(Debug, Clone)]
pub struct Constructor {
@@ -232,7 +236,7 @@ impl Constructor {
self.ffi_func.name.push('_');
self.ffi_func.name.push_str(&self.name);
self.ffi_func.arguments = self.arguments.iter().map(Into::into).collect();
self.ffi_func.return_type = Some(FFIType::UInt64);
self.ffi_func.return_type = Some(FFIType::RustArcPtr);
}

fn is_primary_constructor(&self) -> bool {
@@ -282,8 +286,8 @@ impl APIConverter<Constructor> for weedle::interface::ConstructorInterfaceMember

// Represents an instance method for an object type.
//
// The in FFI, this will be a function whose first argument is a handle for an
// instance of the corresponding object type.
// The FFI will represent this as a function whose first/self argument is a
// `FFIType::RustArcPtr` to the instance.
#[derive(Debug, Clone)]
pub struct Method {
pub(super) name: String,
@@ -313,7 +317,7 @@ impl Method {

pub fn first_argument(&self) -> Argument {
Argument {
name: "handle".to_string(),
name: "ptr".to_string(),
// TODO: ideally we'd get this via `ci.resolve_type_expression` so that it
// is contained in the proper `TypeUniverse`, but this works for now.
type_: Type::Object(self.object_name.clone()),
4 changes: 2 additions & 2 deletions uniffi_bindgen/src/interface/types/mod.rs
@@ -129,8 +129,8 @@ impl Into<FFIType> for &Type {
// Strings are always owned rust values.
// We might add a separate type for borrowed strings in future.
Type::String => FFIType::RustBuffer,
// Objects are passed as opaque integer handles.
Type::Object(_) => FFIType::UInt64,
// Objects are pointers to an Arc<>
Type::Object(_) => FFIType::RustArcPtr,
// Callback interfaces are passed as opaque integer handles.
Type::CallbackInterface(_) => FFIType::UInt64,
// Errors have their own special type.