-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving icu_harfbuzz
compiled data
#4574
Conversation
Before I review this, were you aware of #3257, servo/rust-harfbuzz#197, and servo/rust-harfbuzz#245 when writing this PR? |
I found it after. I do think the changes here are independent of that though, we'll need a compiled data and a data provider path if we want maximum performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not convinced this PR is an improvement.
- It creates two versions of the
icu4x_hb_unicode_*
functions, which seem to be one for static data and one for provider data. We try really hard to avoidcfg(not(...))
code because it's very difficult to ensure test coverage. - The PR seems to be based on the claim that the current code "uses static_to_owned, which might be blocking optimisations". I'm fairly concerned about this statement because
static_to_owned
is clearly documented as being "cheap", and it is aconst
function. If there is a problem with that function, should we go back to the drawing board? It is designed, I believe, specifically for cases like this where we want to coalesce static and dynamic data down to a single code path.
Coalescing static data into the dynamic codepath inherently blocks optimisations. As soon as a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't really like the cfg(not(...))
code. Can we instead export two different C functions with different names, one which reads from user_data
and one which reads from compiled data? And then export two versions of Rust pub fn new_hb_unicode_funcs
, one which uses compiled data and the other which uses dynamic data. The compiled data versions are behind the compiled_data
feature; it's just that the dynamic data versions are still available even if the compiled_data
feature is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code duplication is annoying but hopefully that will go away with #3257
However, note that servo/rust-harfbuzz#197 always uses user_data
. It takes the void*
and casts it to the generic type parameter and calls the trait function with that pointer as the self
parameter. So, migrating this code over to the new harfbuzz traits will be slightly different. I guess we would just ignore the self
parameter in the compiled_data code path.
I assume you did not touch any of the unsafe FFI logic.
hb_unicode_funcs_set_combining_class_func( | ||
ufuncs.raw, | ||
Some({ | ||
extern "C" fn cb( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works, with the Rust fns all named cb
? I guess the names get mangled? Nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
The current HB compiled data code is not ideal, because it uses
static_to_owned
, which might be blocking optimisations. This also let me realise that some compiled constructors that could beconst
aren't.