-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ABI stability across major ICU4X versions #3790
Comments
Personally I'm moderately in favor of such a policy. People tend to use our FFI by checking in our bindings, so it's nice to be able to allow crate updates without requiring binding updates. We already are on board with breaking the API surface of the high-level FFI APIs across major versions; and we plan to do so in 2.0 in a major way for C++ and JS. A common mitigation strategy (available once we move to cpp2/js2) is using For example, I would like to sketch out all the cases where we would potentially want to break ABI beforehand before making this decision, and decide on mitigation strategies where necessary. Here are the cases I can think of:
|
I feel like we already decided on this. Diplomat ABI renames make it possible to retain stability whilst adding new APIs. I think we have a good path forward here. @sffc @zbraniecki @robertbastian does this seem acceptable? |
There's more to discuss here, including how to deal with options structs and things that take options structs, version suffixes like CollatorOptionsV1, ... |
Yes, I think we have a clear path for those as well in the face of Diplomat renames but I'll type it up later. |
@sffc , @robertbastian , and I came up with a set of principles for ABI stability. We have not yet fully figured out mechanisms (Notation: Principles: For a given major version, FFI bindings are stable for whatever stability means in that language. E.g. JS may avoid versioning options bags by just adding new fields over time, whereas everyone else may need to use ICU4XCollatorOptionsV1, ICU4XCollatorOptionsV2. ABI names are stable across versions, however they may be removed across versions. In other words, if you call the ABI function The C backend is somewhat special; it should never be confusing around its relationship with the ABI. It is fine for the C backend to diverge from the ABI: we may have In other words, we should never add a rename that would lead to a C backend name matching an ABI name that it does not map to. To a lesser extent, we want to make sure the C++ and C backends are not too confusingly different. Agreed on principles above: @sffc @Manishearth @robertbastian |
We also discussed having some global prefix/version suffix in the ABI, and did not come to consensus on this. This policy doesn't preclude doing this in the future. |
Thinking about this more, this only works in backends like JS. The policy above allows us to do this kind of thing. |
Potential mechanisms for achieving this policy: C aliases, Self-policing, limit non-blanket renamesWe treat the C backend as higher level than just "ABI", though the code may still retain the ability to generate ABI headers for CPP-C2. We use In ICU4X, we try to avoid having cases where the C backend name mismatches the ABI name. This does not preclude us from using renames: for example what we might do is have This means each rename involving C besides the default ICU4X-prefixing one should be carefully scrutinized. We use this scrutiny to avoid any clashes. When we are Obfuscated ABI namesWe obfuscate ABI names such that it is abundantly clear that they are not the same as the real name. Something like a global (This is a partial design but people can fill this in) |
I very much like an Obfuscated ABI Names approach. Advantages include
A few ways to do obfuscating: Apply name mangling with argument and return value typesBasically, the signature of a function is a product of its arguments. If we encode this into the ABI name, we can get a globally unique signature. Example: Pros:
Cons:
Explicit versionThis approach could also solve the struct versioning problem. We add an attribute Example: Pros:
Cons:
HashWe can pass the function signature through a hashing algorithm and then append the hash to the function name. Example: Pros:
Cons:
|
Another thing to note, since I don't think it was explicitly stated above: like with backwards-compatible data, we can build a DLL with both the old and the new symbols so that it can be dropped in-place to some binary that is expecting to find the old symbols. I'm not advocating that we go out of our way to figure out exactly how to do this in the 2.0 release (I'm okay simply dropping the old symbols in this release), but when we have someone who needs it, it is a tractable problem to solve. |
I would like the ABI and the C backend to be the same where possible and this is a divergence. That said, I'm not strongly opposed to this as long as it can be designed as a convenient alias for a pairing of (We can support a
This is still an argument-based name mangling scheme. This has the same "designing a good one is hard" con of the first name mangling approach. Ensuring stability for a hash based scheme where the set of hashables is growing over time and some of the things are equivalent (a bunch of our slice types are equivalent over FFI) is not easy and easy to get wrong. I am very opposed to any automatic mangling scheme: this makes it much easier for things to automatically change under our feet without us realizing it. The current design is quite clear about what actions can change the abi names.
I don't consider the first one to be too important. It's only a problem for one backend (C) and it's a backend where you probably want to be looking at the headers anyway. I'm also not convinced of the utility of the second one: I really prefer this being an explicit choice. I'm not convinced we can come up and maintain with something that is reliable, really I'm not convinced of my ability to not accidentally break this when making changes to Diplomat. |
I do not think a global |
That's not what I'm saying: I'm saying that we can have a Footnotes
|
Discussion: Concrete proposal: Types like CollatorOptions will be named CollatorOptionsV1 by default, but with The ABI type "CollatorOptions" should be the latest and greatest. May need defaults support in JS and Dart.
Achieved by:
The method version suffixes do not solve the options bag problem, but they are capable of solving the problem of replacing functions. LGTM: @Manishearth, @sffc, @robertbastian |
We may also want the default to be |
I think
|
On the diplomat side, this plan can be achieved more easily by:
On the plus side, this needs no new Diplomat work. However, ideally we get to 2.0 with:
|
In this scheme, are methods ever going to have |
@robertbastian yes, methods may change for reasons unrelated to struct versioning |
(I confirmed this with Shane yesterday, he wanted to keep method versioning. I vaguely recall this being discussed during the meeting that decided this, the set of use cases was broader than struct versioning) |
The overall description of what mv is for is that we're hoping to have cross-ICU4X versions stable ABI1, but not stable FFI API, which may mean updating methods and using renames with mv. Footnotes
|
For example, It prevents someone bringing an old dylib and using it with newer code that uses the same function names with different signatures. ICU4C does this, too, but it uses each major version in the ABI, like |
I do not recall us discussing this1 (correct me if wrong). We should come up with a policy.
Basically, are we allowed to break the C ABI across ICU4X versions? The practical effect of this would be that people using old ICU4X bindings with a new ICU4X crate can continue to stick to the bindings they use. The reverse is not necessarily guaranteed since new ICU4X bindings may use new APIs under the hood that aren't present in the binary.
We do have this policy for minor ICU4X versions: it is possible to use ICU4X 1.1 headers with a compiled
icu_capi 1.2
, in theory. We do not test this.In practical effects to ICU4X developers, this means
Discuss with:
Optional:
Footnotes
I recall us discussing what to do across minor versions and the answer was "definitely no breakage" ↩
The text was updated successfully, but these errors were encountered: