Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate functional from ergonomic API (add cbindgen) #398

Closed
sffc opened this issue Nov 19, 2020 · 16 comments
Closed

Separate functional from ergonomic API (add cbindgen) #398

sffc opened this issue Nov 19, 2020 · 16 comments
Assignees
Labels
A-design Area: Architecture or design A-ffi Area: FFI, WebAssembly, Transpilation C-meta Component: Relating to ICU4X as a whole T-core Type: Required functionality
Milestone

Comments

@sffc
Copy link
Member

sffc commented Nov 19, 2020

No description provided.

@sffc sffc added T-core Type: Required functionality C-meta Component: Relating to ICU4X as a whole A-design Area: Architecture or design labels Nov 19, 2020
@sffc sffc added this to the 2020 Q4 milestone Nov 19, 2020
@sffc sffc self-assigned this Nov 19, 2020
@sffc sffc added the A-ffi Area: FFI, WebAssembly, Transpilation label Nov 24, 2020
@sffc
Copy link
Member Author

sffc commented Nov 24, 2020

I think the best first step here is to try to get an ICU4X component like PluralRules to compile with cbindgen.

It's not hard to get started. I annotated the try_new function with #[no_mangle] and extern as follows:

    #[no_mangle]
    pub extern fn try_new<'d, D: DataProvider<'d>>(
        langid: LanguageIdentifier,
        data_provider: &D,
        type_: PluralRuleType,
    ) -> Result<Self, PluralRulesError>

And the PluralRules struct with #[repr(C)]:

#[repr(C)]
pub struct PluralRules

Then I can run cbindgen --crate icu_plurals. I get:

$ cbindgen --crate icu_plurals
WARN: Can't find LanguageIdentifier. This usually means that this type was incompatible or not found.
WARN: Can't find LanguageIdentifier. This usually means that this type was incompatible or not found.
WARN: Can't find D. This usually means that this type was incompatible or not found.
#include <cstdarg>
#include <cstdint>
#include <cstdlib>
#include <ostream>
#include <new>

/// A type of a plural rule which can be associated with the [`PluralRules`] struct.
///
struct PluralRuleType;

/// A list of possible error outcomes for the [`PluralRules`](crate::PluralRules) struct.
///
struct PluralRulesError;

template<typename T = void, typename E = void>
struct Result;

/// An enum storing models of
/// handling plural rules selection.
struct RulesSelector;

/// `PluralRules` is a struct which provides an ability to retrieve an appropriate
/// [`Plural Category`] for a given number.
///
/// # Examples
///
/// ```
/// use icu_locid_macros::langid;
/// use icu_plurals::{PluralRules, PluralRuleType, PluralCategory};
/// use icu_provider::InvariantDataProvider;
///
/// let lid = langid!("en");
///
/// let dp = InvariantDataProvider;
///
/// let pr = PluralRules::try_new(lid, &dp, PluralRuleType::Cardinal)
///     .expect("Failed to construct a PluralRules struct.");
///
/// assert_eq!(pr.select(5_usize), PluralCategory::Other);
/// ```
///
/// [`ICU4X`]: ../icu/index.html
/// [`Plural Type`]: PluralRuleType
/// [`Plural Category`]: PluralCategory
struct PluralRules {
  LanguageIdentifier _langid;
  RulesSelector selector;
};

extern "C" {

/// Constructs a new `PluralRules` for a given locale, [`type`] and [`data provider`].
///
/// This constructor will fail if the [`Data Provider`] does not have the data.
///
/// # Examples
///
/// ```
/// use icu_locid_macros::langid;
/// use icu_plurals::{PluralRules, PluralRuleType};
/// use icu_provider::InvariantDataProvider;
///
/// let lid = langid!("en");
///
/// let dp = InvariantDataProvider;
///
/// let _ = PluralRules::try_new(lid, &dp, PluralRuleType::Cardinal);
/// ```
///
/// [`type`]: PluralRuleType
/// [`data provider`]: icu_provider::DataProvider
Result<PluralRules, PluralRulesError> try_new(LanguageIdentifier langid,
                                              const D *data_provider,
                                              PluralRuleType type_);

} // extern "C"

I guess that's a start! I'm not really sure where to go next. In particular, I don't know how to fix the errors on top:

WARN: Can't find LanguageIdentifier. This usually means that this type was incompatible or not found.
WARN: Can't find LanguageIdentifier. This usually means that this type was incompatible or not found.
WARN: Can't find D. This usually means that this type was incompatible or not found.

I annotated LanguageIdentifier with #[repr(C)], but that didn't seem to do anything.

Once we can get some sort of PluralRules compiling to FFI in ICU4X, the next step will be to work on how the FFI relates to the data provider. I think, at least, we should add a new API that builds a PluralRules directly from data structs and other elementary components, rather than abstract traits/interfaces.

CC @Manishearth @nciric

@sffc sffc added the discuss Discuss at a future ICU4X-SC meeting label Nov 24, 2020
@zbraniecki
Copy link
Member

Do we want cbindgen or cxx?

@Manishearth
Copy link
Member

I annotated LanguageIdentifier with #[repr(C)], but that didn't seem to do anything.

I think it needs to be all the way down. What we need to do is one of:

  • Declare this function as taking an &LanguageIdentifier so it can operate on opaque objects (we need a way to construct these that has the right ownership model
  • Declare a separate FFI-compat LanguageIdentifier type that can be appropriately converted in Rust
  • Make everything repr(C), including TinyStr. We could potentially have a cfg that does this so that normal rust consumers aren't affected by the loss of layout optimizations.

Do we want cbindgen or cxx?

cxx could work, but it has a different model. I figured cbindgen would be useful since you need to do a lot of that work for wasm anyway. Worth trying cxx!

@Manishearth
Copy link
Member

WARN: Can't find D. This usually means that this type was incompatible or not found.

For generics we'd have to do some kind of dynamic dispatch instead, sadly.

@filmil
Copy link
Contributor

filmil commented Nov 24, 2020

FWIW.

We can not expect that we can just run either cbindgen or cxx on arbitrary Rust code, and that we will get sensible output. #[repr(C)] is necessary, but not sufficient. And frankly I'm not sure that "repr(C) all the way" is what we want anyways.

The API for the code that you expect to be usable over FFI needs to be designed with FFI in mind. IIUC, cbindgen doesn't help us at all with that, while cxx might -- though it's not there yet.

Some types such as Result<_,_> don't really have a repr(C), so that needs to be taken into account too. It would be cool to have a library that handles the representation differences, and it looks like this is something that could be generated automatically. cxx goes some of the way.

@sffc
Copy link
Member Author

sffc commented Nov 24, 2020

Clearly the FFI layer needs to be written in a way to make it C- and WASM-compatible, with simpler types. In wrapper_layer.md, I proposed making the Rust API wrap over the FFI API. I'm wondering whether this is the right approach, or whether the FFI API should be a "standalone" wrapper over the Rust API, more like in ICU4C. I would prefer making the Rust API wrap FFI, but I don't want to do that if it incurs a significant cost (in safety, performance, maintenance, etc).

@Manishearth
Copy link
Member

So the typical way to do this is to have intermediate FFI structs that contain the same data and are cheaply convertible to the Rust format.

"Rust API wrap FFI" is rarely a good idea since you lose out on all of the benefits of Rust enums/etc.

@sffc sffc changed the title Separate functional from ergonomic API Separate functional from ergonomic API (add cbindgen) Dec 4, 2020
@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Dec 4, 2020
@sffc
Copy link
Member Author

sffc commented Dec 9, 2020

We should schedule a deep-dive session where we figure this out, similar to how we did with WASM.

@sffc sffc modified the milestones: 2020 Q4, 2021-Q1-m1 Jan 7, 2021
@Manishearth
Copy link
Member

I might also do some experiments with pluralrules

@Manishearth
Copy link
Member

FWIW, i've filed mozilla/cbindgen#646, which will eventually let us directly pass TinyStr over FFI, until then we'll have to make a shim type.

@Manishearth
Copy link
Member

cbindgen (master) is now compatible, opened zbraniecki/tinystr#30 for making tinystr work

@Manishearth
Copy link
Member

Got some experiments up at #453

@sffc sffc modified the milestones: 2021-Q1-m1, 2021-Q1-m2 Feb 3, 2021
@sffc sffc removed their assignment Feb 3, 2021
@eerhardt
Copy link

Any idea when this will be started/implemented?
As I stated in the discussion I opened yesterday, we are investigating using ICU4X in the .NET runtime. However, in order to invoke it, we will need to call it through a "C" API.

@Manishearth
Copy link
Member

We have a rough plan for this and I hope to start implementing it soonish. It should still be a few months out though.

@Manishearth
Copy link
Member

Currently blocked on mozilla/cbindgen#665

@sffc sffc modified the milestones: 2021-Q1-m2, ICU4X 0.2 Mar 12, 2021
@sffc sffc mentioned this issue Mar 12, 2021
@Manishearth
Copy link
Member

I think the design work here was mostly resolved in #552 (and this issue can be closed, or kept around as an FFI tracking issue)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-design Area: Architecture or design A-ffi Area: FFI, WebAssembly, Transpilation C-meta Component: Relating to ICU4X as a whole T-core Type: Required functionality
Projects
None yet
Development

No branches or pull requests

5 participants