Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design and agree on a 1.0 FFI #2155

Open
bendk opened this issue Jun 17, 2024 · 7 comments
Open

Design and agree on a 1.0 FFI #2155

bendk opened this issue Jun 17, 2024 · 7 comments
Labels

Comments

@bendk
Copy link
Contributor

bendk commented Jun 17, 2024

Lately, we've been discussing the idea of UniFFI 1.0 and what it would entail. As part of this effort, we should think about how we could improve on our current FFI layer. There's basically 2 reasons for this:

  • Historically, UniFFI has favored shipping features quickly over performance, with the idea that we can revisit these decisions and improve on them later. This seems like a great time to revisit some of those decisions and improve upon them.
  • We have a lot of hindsight around implementing bindings generators which we can use to simplify things. This will make our current bindings generation code more maintainable and make it easier for new bindings to be written.

General framework for making these decisions:

  • This issue will be act as a meta issue for this work and track decisions around the overall direction. Please add your thoughts in the comments and we can update the plan as we go.
  • The goal is get all stakeholders to come to a consensus on an FFI that's "good enough". This includes UniFFI users, UniFFI devs, bindings authors, developers of the ecosystems that UniFFI is being introduced into, etc.
  • Issues for specific tasks will be labeled with FFI-1.0.
  • There's no rush to get this done and we should expect that some of these issues will require slow and steady discussion before we can come to consensus.
  • Changes should be justified based on improved performance, simplified binding generation code, or both.
  • If a change is claimed to improve performance, there should be a benchmark added that measures this.
@bendk bendk added the FFI-1.0 label Jun 17, 2024
@bendk
Copy link
Contributor Author

bendk commented Jun 17, 2024

As mentioned in the description, this issue is a working document. The above represents my understanding of the issue after talking it over with a couple other UniFFI devs. If you disagree with anything or feel like something's missing, please add a comment.

@skeet70
Copy link
Contributor

skeet70 commented Jun 20, 2024

I agree that performance is something that should be optimized more before 1.0. Once we have a uniffi-bindgen-java version of our library that already exists natively in Java, we'll be benchmarking ourselves and may come up with ideas for improvements. Right now we have Rust, Python, and Kotlin benchmarks for that library but don't have a good comparison point to find improvements there.

Naively it seems to me like 1.0 should have either one of proc-macros or UDL but not both. I think that will simplify usage and make it easier to come up with a cohesive set of documentation.

Even though I'm implementing Java bindings, that's primarily translating Kotlin bindings, so I don't grok the bindgen internals well enough to give good improvement feedback there. I do think there's a lot to be gained in good documentation of what templates/code types need to be implemented and handled for bindings to support each feature UniFFI supports, but that's not really a maintenance improvement.

@Sajjon
Copy link
Contributor

Sajjon commented Jul 7, 2024

Hey! Exciting with 1.0 🚀

I will re-iterate up some things I wrote here, I think bindings can be improved quite a bit - also with multi-crate setup in mind. Instead of generating huge binding files - one per UniFFI crate (in a possible multi-UniFFI-crate-workspace repo) - also with multiple copies various UniFFI FfiConverters, which at least is the case for Swift, we could generate many small files. Where meant to be internal symbols have appropriate access modifiers (internal in Swift), so that those symbols are visible across those files.

For Swift specifically we ought to re-design with Swift 6 language mode and strict-concurrency checking set to complete - because UniFFI generated Swift binding code does not compile, as I mentioned in #2046. This is not a huge issue right now, but in September when Xcode 16 goes from beta to prod, and developers wanna adopt Swift 6 language mode, this will be a growing problem.

I think it is excellent timing to redesign the Swift bindings part of 1.0 of UniFFI to be prepared for Swift 6 and complete strict concurrency checking. I have some ideas, but I think it would be better to first make the changes to how bindgen generates code, into many files rather than one huge, and for multiple crates, ensure each UniFFI FfiConverter is only generated once, with internal as access modifier.

@mhammond
Copy link
Member

mhammond commented Jul 8, 2024

but I think it would be better to first make the changes to how bindgen generates code, into many files rather than one huge, and for multiple crates, ensure each UniFFI FfiConverter is only generated once, with internal as access modifier.

That can be done today, all the components are passed to the swift generator.

All the rest sounds great too!

@skeet70
Copy link
Contributor

skeet70 commented Oct 3, 2024

@Sajjon for reference uniffi-bindgen-java also generates individual files already.

I found out today about JNR. I think @badboy had a branch already trying out JNA direct mapping for performance benefits. Looking at benchmarks it may be worth trying out JNR. This would only help Kotlin/Java performance, not a general library improvement.

@alexkirsz
Copy link
Contributor

I've already written up some of what we'd like to have in this comment and this issue, but I'll expand a bit more here.

We maintain a very large monorepo that contains code for multiple apps. Each app requires a particular set of features, but many of these features are shared between two or more apps. As such, we've split each feature into its own crates:

  1. the API crate, which only contains the UniFFI interfaces (as traits) and types specific to this feature;
  2. the implementation crate, which contains the actual implementation code for our traits; and
  3. shared type library crates, which contain shared types between multiple features (e.g. a Vec2 type, or a Path type).

In order to keep compilation times to a minimum and avoid scaling issues, we run UniFFI bindgen on each library that exposes a UniFFI interface independently. This means that API crates, implementation crates, and type library crates will each invoke the UniFFI bindgen binary once, instead of a single invocation on the output shared library.

This results in the following compilation graph:

┌───────┐  ┌───────┐  ┌───────┐
│TYPES 1│  │TYPES 2│  │TYPES 3│
└───────┘  └───────┘  └───────┘
    │          │          │    
    └────┬─────┘          │    
         ▼                │    
      ┌─────┐             │    
      │API A│             │    
      └─────┘             │    
         │                │    
         ├────────────────┤    
         ▼                ▼    
     ┌──────┐          ┌─────┐ 
     │IMPL A│          │API B│ 
     └──────┘          └─────┘ 
         │                │    
         │                ▼    
         │           ┌────────┐
         └───────┐   │ IMPL B │
                 │   └────────┘
                 │        │    
                 ├────────┘    
                 ▼             
            ┌────────┐         
            │ TARGET │         
            └────────┘         

Each library has its own two-step compilation process:

  1. Compile the library into a .rlib;
  2. Codegen the library's foreign binding;
  3. Compile the library's foreign binding.

And many such two-step compilation processes can be parallelized when there is no directed edge between their corresponding libraries.

This model also ensures that we can re-compile the foreign interface very fast, since API and types crates are guaranteed to only depend on each other, and not to include extraneous code that would otherwise increase the compilation type. As such, in most cases, iOS, Android, and Web engineers do not have to pay the cost of the implementation compilation times when working on features.

However, in order to make this work, we've made a small patch to UniFFI bindgen to allow it to read from multiple rlibs directly. For instance, in order to generate the foreign bindings for API A, we need to include the rlibs of TYPES 1, TYPES 2, and API A. This will properly generate the foreign code for API A, but will also re-generate the foreign bindings of TYPES 1 and TYPES 2, which we'll already have generated and compiled upstream.

Another issue is that all generated foreign bindings expect to be shipped as part of the same foreign library. For instance, in Swift, this means a single Swift library. As such, they all expect the other modules' symbols to already be imported in the current scope. However, in our case, these symbols need to be imported from the other Swift libraries: the foreign bindings for API A will need to import Types1FFI and import Types2FFI.

So in order to make this work, we:

  1. Run UniFFI bindgen on all rlibs necessary for a given library; then
  2. Select the generated binding file for the library, discarding the rest; then
  3. Prepend necessary imports to the generated file.

The drawbacks of this approach are:

  1. Some binding files will be generated many times, but only compiled once. For instance, in the graph above, the bindings of the TYPES 1 library will be generated 6 times: as part of its own compilation step, as well as part of any dependent library and all its transitive dependent libraries. However, it will only actually be compiled once, as part of its own compilation step. All the other generations are superfluous. This isn't too big of a problem for us at the moment since codegen is generally very fast, but as we scale to more features and more complex graphs, it might become one.
    → The bindgen binary should accept a parameter to select which module we want to generate bindings for.

  2. Similarly, the rlibs need to be parsed multiple times by each UniFFI bindgen invocation.
    → The bindgen binary could generate an intermediate representation for UDLs and proc macro metadata. This would avoid repeating some of the parsing work, as well as embed additional metadata like the library name (see point 4.)

  3. Each binding file will include its own RustBuffer implementation, as well as converter code for all the primitive types it uses.
    → The bindgen binary could expose a command to generate a library that contains all the boilerplate code, without any filtering. All generated binding files could then import this shared library.

  4. Imports must be prepended as an additional step after the codegen.
    → The intermediate representation mentioned in 2. could include the expected import name/namespace of the dependency library, which would allow the codegen to both add the import (in Swift) and/or properly namespace any reference to a type coming from another library to avoid conflicts.

Currently, our implementation is tailored to our needs and, as such, cuts many corners and breaks some features we don't use. We would love to see UniFFI support this natively.

@bobozaur
Copy link
Contributor

However, in order to make this work, we've made a small patch to UniFFI bindgen to allow it to read from multiple rlibs directly. For instance, in order to generate the foreign bindings for API A, we need to include the rlibs of TYPES 1, TYPES 2, and API A. This will properly generate the foreign code for API A, but will also re-generate the foreign bindings of TYPES 1 and TYPES 2, which we'll already have generated and compiled upstream.

Just wanted to add here that we were able to make use of multiple rlibs by creating our own bindgen binary using lower level API from vanilla uniffi. Pretty sure that it might be possible to cache/import bindings generated upstream too, but, as @alexkirsz pointed out, code gen is fast enough for this not to be a concern. At least for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants