-
Notifications
You must be signed in to change notification settings - Fork 239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC for step 1 of #419 - replace Handlemaps with Arcs. #420
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty neat! I'm kind of strangely glad to see a SIGSEGV in the circle failures here, means we're definitely in raw pointer territory :-)
I'd like to consider whether we can embed the use of Arc here more thoroughly in the mechanics of how we work with object instances, so will share some rambling thoughts about how that might look architecturally.
First, observe that when we generate the support code for records, we generate an implementation of the ViaFfi
trait that knows how to read/write then in a buffer. We currently do not implement ViaFfi
for object types at all.
What if we did update ObjectTemplate.rs
to produce a ViaFfi
implementation for object types? It might look something like this:
// This won't work as written IIRC, because this code doesn't define `ViaFfi` and
// doesn't define `Arc`, so Rust's trait coherence rules will stop us from impling
// the trait on `Arc<ObjectName>`. But it's interesting to think about architecturally.
// Maybe we could define a newtype wrapper that would let this work?
unsafe impl uniffi::ViaFfi for Arc<ObjectName> {
// When passing object instances over the FFI, we pass a pointer
// that is the raw form of an Arc<ObjectName> owned by the foreign-language code.
type FfiType = *const ObjectName;
// If we return an `Arc<ObjectName>` to the foreign-language code, then it
// owns the reference and is responsible for freeing it.
fn lower(self) -> Self::FfiType {
self.into_raw()
}
// If we want to receive an `Arc<ObjectName>` from the foreign-language code,
// we have to make our own clone of it.
fn try_lift(v: Self::FfiType) -> Result<Self> {
// The received pointer is an Arc owned by foreign-language code, don't drop it.
let foreign_arc = ManuallyDrop::new(unsafe { Self::from_raw(v) });
// Take a (cheap) clone for our own use.
Ok(Arc::clone(foreign_arc))
}
// If we want to send a data structure with an `Arc<ObjectName>` in one of its
// fields, we pass ownership of a clone by passing the pointer.
fn write<B: bytes::BufMut>(&self, buf: &mut B) {
let foreign_arc = Arc::clone(self);
buf.put_u64(foreign_arc.into_raw() as u64) // Probably need some sizeof() checks here :-/
}
// If we receive a serialized data structure with an `Arc<ObjectName>` in one of its
// fields, we need to take a clone for our own purposes.
fn try_read<B: bytes::Buf>(buf: &mut B) -> Result<Self> {
Self::try_lift(buf.get_u64() as const* ObjectName)
}
}
I wonder if something like this would help make the generated code any cleaner?
@@ -232,6 +232,7 @@ impl Constructor { | |||
self.ffi_func.name.push_str("_"); | |||
self.ffi_func.name.push_str(&self.name); | |||
self.ffi_func.arguments = self.arguments.iter().map(Into::into).collect(); | |||
// XXX - we need this to be some kind of pointer abstraction - `usize`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to work with raw pointers, I think the right thing to do here is add an FFIType::Pointer
variant and then map it explicitly to the corresponding pointer abstractions in each foreign language. (The existing FFITypeRustCString
variant actually already is sort of this, but specifically for a pointer-to-char).
&*{{- arg_name }} | ||
{% else %} | ||
&mut *{{- arg_name }}.lock().unwrap() | ||
{% endif %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dereferencing/unwrapping logic...I wonder if it makes sense to move some or all of that into the definition of the lift_rs
filter. It seems pretty analogous to me - this is the stuff that you have to do to go from the value that you get over the FFI, to the value that the Rust code needs to use at runtime.
{% endmatch %} | ||
{% endmacro %} | ||
|
||
{% macro get_arc(obj) -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think calling this clone_arc
would be clearer about your intent.
let _arc = unsafe { std::sync::Arc::from_raw(ptr as *const std::sync::Mutex<{{ obj.name() }}>) }; | ||
{% endif %} | ||
// This arc now "owns" the reference but we need an outstanding reference still. | ||
std::sync::Arc::into_raw(std::sync::Arc::clone(&_arc)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest using ManuallyDrop
rather than leaking it back via into_raw
, as I think it better captures the intent. If I had to write this code out by hand with lots of comments it would look something like:
// The pointed-to Arc is owned by the foreign-language code, so we mustn't drop it.
let _foreign_arc = ManuallyDrop::new(unsafe { Arc::from_raw(ptr as *const {{ obj.name() }}) });
// We want an owned copy of the Arc to pass internally to the Rust code.
let _arc = Arc::clone(&*_foreign_arc);
uniffi::deps::ffi_support::call_with_result( | ||
err, | ||
|| -> Result<_, {{ e }}> { | ||
{% call get_arc(obj) %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using a macro for this, what if this was done via the existing "lift" mechanics, something like:
let _arc = {{meth.first_argument().name()|lift_rs(meth.first_argument().type_()) }}
Broadly, I think the more we can treat these object-typed arguments in the same way as we treat other arguments, the better.
let _new = std::sync::Mutex::new({{ obj.name() }}::{% call to_rs_call(cons) %}); | ||
{%- endif %} | ||
let _arc = std::sync::Arc::new(_new); | ||
std::sync::Arc::into_raw(_arc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I know this isn't intended for merge, but I'm commenting to share my mental model of how the final code might look).
This needs a comment along the lines of "This copy of the Arc is leaked, and is conceptually now owned by the foreign-language code. It will only be dropped when the foreign-language code calls the matching desctructor".
{% let ffi_free = obj.ffi_object_free() -%} | ||
#[doc(hidden)] | ||
#[no_mangle] | ||
pub extern "C" fn {{ ffi_free.name() }}(ptr: u64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I really think we want to update the FFIType
enum to let us use a pointer type in the argument here.
Actually, if this seems useful then we might be able to have uniffi provide a default |
b83d272
to
dde07a5
Compare
Callbacks are broken - possibly for many reasons, but at least one is due to the constructors creating a I'm going to put this down for a few days, so let me know if anyone is interested in picking any of this up. |
The context for this is Ryan's [proposal for passing around object instances]( mozilla#419) By ipmlementing `ViaFfi` for `Arc<T>` we get a huge amount of functionality for free - `Arc` wrappers around Objects can suddenly appear in dictionaries and function params, and be returned from functions once we remove the explicit code that disallows it. I kept support for both threadsafe and non-threadsafe interfaces, primarily to avoid touching all the examples. This turned out to be quite easy in general, because obviously `Arc<T>` ends up getting support for `Arc<Mutex<T>>` for free. The mutex semantics for using these as params etc is almost certainly going to do the wrong thing - eg, acquiring a mutex when the object is passed as a param is going to end in tears at some stage. It's also incomplete (needs docs, kotlin tests for the new test cases, swift hasn't been touched at all, etc)
This is just the rust side of the world, and functions must return `std::sync::Arc<Self>` rather than plain `Self` - not sure if that's a bug (not seamless and natural) or a "feature" (by closing existing `Arc`s offers a path to an "identity" concept) Note that it would be fairly simple to allow objects returned by Rust to not be `Arc<>` because `Arc<T>` automatically supplies `From<T>` - so we just need to generate a few `.into()` calls. Obviously though, this will create a new `Arc` and not a clone of an existing one, so identity will be lost. It also more difficult to fix the other direction (ie, when the FFI layer has a ptr to an `Arc<T>` but the rust code expects just `&T`) It's possibly doable, but as above, it's not clear whether we actually *want to* do that - asking the author to explicitly manage `Arc<>`s seems less likely to cause surprises down the track - we made that mistake before with Mutexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like how this is coming together, thanks for pushing on it! In particular I'm really happy with how small the diff seems to be, given the conceptual impact of the change. (Other language bindings still to come, obviously, but still...).
I'll try to find some time to try my hand at the Kotlin bindings to help move this forward and shake out any other issues.
How should our work here relate to the ADR in #430? Right now I can convince myself that we're just prototyping to help inform a decision in that ADR, but if we keep going we'll eventually get to a point where we're actually assuming the outcome of that discussion.
fn take_other(&self, other: Option<Arc<Self>>) { | ||
*self.other.lock().unwrap() = match other { | ||
None => None, | ||
Some(arc) => Some(Arc::clone(&arc)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, we already own the Arc<Self>
in other
at this point, so I don't think we need to clone here, and could just *self.other.lock().unwrap() = other
. The current setup will increment the refcount for the Arc::clone
, then immediately decrement it when other
is dropped at the end of the function.
# Any test not terminating with zero objects alive will cause others to | ||
# fail - this helps us work out which test kept things alive. | ||
def tearDown(self): | ||
self.assertEqual(get_num_alive(), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 nice!
{# Look in our weak map for an instance already associated with this | ||
pointer. If it exists we can return it, but the pointer we have | ||
is a clone of an `Arc<>` so has a reference we need to destroy. | ||
#} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we probably want a mutex or other concurrency-control here around the updates to the identity map.
/// When lowering, we have an owned `Arc<T>` and we transfer that ownership | ||
/// to the foreign-language code, "leaking" it out of Rust's ownership system | ||
/// as a raw pointer. The foreign-language code is responsible for freeing | ||
/// this via TODO calling a special destructor for `T`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah! We already have one of these in obj.ffi_object_free().name()
and it just needs to be updated, so there's not really anything "TODO" about this except for document it here.
@@ -108,10 +108,12 @@ def writeString(self, v): | |||
|
|||
{% when Type::Object with (object_name) -%} | |||
# The Object type {{ object_name }}. | |||
# Objects cannot currently be serialized, but we can produce a helpful error. | |||
# We write the pointer value directly - what could possibly go wrong? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤞🏻😅🤞🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"What could possibly go wrong?" they ask, tempting fate...#457
/// This supports the `[ByRef]` attribute for arguments that should be passed | ||
/// by reference in the generated Rust scaffolding. | ||
/// This supports the `[ByRef]` and `[ByArc]` attribute for arguments that | ||
/// should be passed by reference in the generated Rust scaffolding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the "passed by reference" part of this comment will need re-wording.
} | ||
|
||
impl TryFrom<&weedle::attribute::ExtendedAttributeList<'_>> for ArgumentAttributes { | ||
type Error = anyhow::Error; | ||
fn try_from( | ||
weedle_attributes: &weedle::attribute::ExtendedAttributeList<'_>, | ||
) -> Result<Self, Self::Error> { | ||
// XXX - how do we ensure both `ByRef` and `ByArc` aren't specified? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, the InterfaceAttributes
struct has an extremely dumb check to ensure that [Threadsafe]
and [Enum]
are not applied to the same interface.
@@ -184,10 +188,12 @@ impl APIConverter<Argument> for weedle::argument::SingleArgument<'_> { | |||
Some(v) => Some(convert_default_value(&v.value, &type_)?), | |||
}; | |||
let by_ref = ArgumentAttributes::try_from(self.attributes.as_ref())?.by_ref(); | |||
let by_arc = ArgumentAttributes::try_from(self.attributes.as_ref())?.by_arc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually wondering if [ByArc]
is really necessary on arguments, given that we've simplified the proposal down quite a bit. If you're passing something that's not an Object
instance then [ByArc]
is invalid. If you're passing something that is an Object
instance then you have to be passing it in an Arc
. So I'm not sure what value the annotation provides, except maybe for reminding the component author that they need to accept an Arc<T>
instead of a T
. (But even then, the Rust compiler will also remind them of this fact).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're passing something that is an Object instance then you have to be passing it in an Arc.
Hrm, I guess this isn't necessarily true - the bindings could take an &T
out of the Arc<T>
and pass it to the hand-written Rust code. But we could consider having f(T arg)
mean "receives an owned value in an Arc" and f([ByRef] T arg)
mean "receives a borrowed reference", which is how it works for non-Object types.
@@ -327,6 +332,10 @@ impl Method { | |||
self.attributes.get_throws_err() | |||
} | |||
|
|||
pub fn self_by_arc(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parsing center of my brain wants the name of this to be worded as a predicate about the object, e.g. takes_self_by_arc
or similar. But then I can't really articulate a cohesive style guide for the way things are named in here, so I'm just going to note it and move on.
Funny you should ask - I was just making note about that. When we chatted in person, I mentioned that I thought it would be possible to still use a Thinking more though, I think that would work OK right until the last commit in the set - the one that provides "identity" for Python. That relies on the exact same value for the same objects, and the existing handle maps do not offer that - it probably could with more complex hoops, but I don't think that's worthwhile. So assuming we want to lean in on offering identity for languages which make it possible, I'm inclined to suggest we use this justification to push forward with #430. |
Sold! I'll take another look at that ADR next, and if it's welcome, perhaps put up a PR to your PR with some more thoughts. |
Obviously welcome! We could collaborate on the same branch, but I guess PRs on PRs gives us some kind of audit-trail for reviews - I review your PRs into my PR, you review my PR into main :) |
InternalError, | ||
_UniFFILib.{{ obj.ffi_object_free().name() }}, | ||
handle | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I just understood what you're doing here, and it's neat.
In my head I was worried about the asymmetry where we return a fresh clone of the Arc<T>
on the Rust side but collapse them all into a single instance on the foreign-language side. I thought we might have to keep track of the number of Arc<T>
that had been merged into this instance, so that we could call ffi_object_free
the appropriate number of times. But immediately destroying the clones is much simpler (and really reinforces the "The Prestige" metaphor that @jhugman used to describe this identity map idea to me at one point...).
In the future we might be motivated to keep a count and destroy them all at once for performance reasons, but that seems likely to be a long ways off.
This is a nice little example of how to use object references, and also happens to help us test a way that we want to use them in a real consumer.
Add a singleton-style feature to the "todolist" example.
def test_arcs(self): | ||
coveralls = Coveralls("test_arcs") | ||
self.assertEqual(get_num_alive(), 1) | ||
self.assertEqual(coveralls.strong_count(), 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check my understanding: the strong count is 2 because 1 reference is held by the foreign-language code, and one reference is held by the Rust code for the duration of the call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in this specific example, 1 is in coveralls._handle
and the other is a new Arc created for self: Arc<Self>
I'm going to close this in favour of a rebased and squashed commit in #462, which incorporates some parts of this that have already been taken onto |
The context for this is Ryan's proposal for passing around object instances - which has steps 1 and 2 in a rough way - replacing Handlemaps with
Arc<>
(or to-be-deprecatedArc<Mutex<>>
) and allowing them to be args and results.This is turning out better than I hoped - for example, see this new test which exercises most of what this does.
I think the Kotlin generation works, but the Swift generation most certainly does not. @jhugman now you are back from PTO I'd love your feedback and to gauge your interest in collaborating on this? Note that my intent is to iterate slowly, just in time for Ryan to get back and review.