Provide a 3-byte type with GIGO convertibility to/from char in zerovec #1968
Comments
CC @Manishearth |
This feels rather specific to the use case. I don't know if this belongs in zerovec, but it can be designed pretty easily outside of zerovec: you just have to implement AsULE and ULE appropriately. Happy to provide guidance on that. In general, I intend zerovec to be a generally useful crate, not one specific to i18n, and I view APIs under that lens when considering them for inclusion. |
This seems like a reasonable addition. I would say that this should be on the general type This can be done in two phases:
We should do (1) in the short term, before 1.0, since it affects data file stability. This should be a fairly easy change to make. We can do (2) later, since it only affects runtime behavior. |
I'm very happy to have (1), but I'm not as sure we should have the U+FFFD validation behavior. It seems a bit strange to me and rather specific, given that zerovec tends to avoid GIGO overall. |
I split (1) into a separate issue in the 1.0 Polish milestone: #1970 We can keep this issue open for a future milestone to further discuss (2). |
sounds good! |
Thanks. 3 bytes per |
(I would also take a |
(as we've mentioned before this is entirely dependent on the caching strategy/etc used but we do plan to support the ability to do this) |
the u24 can be done without any |
I implemented |
I think it'd be confusing for Wouldn't it be cleaner to define another type like |
3 bytes is enough for any
It would make sense to have a |
Oh right it doesn't use UTF-8 internally. However, doesn't making |
This is not something that ZeroVec has |
Discussion with @robertbastian @Manishearth @sffc:
|
Please provide a type in zerovec that has these properties:
It is 3 bytes wide.
It is convertible from char.
It is convertible to char such that, if the value is outside the scalar value range, the conversion panics if debug assertions are enabled and returns U+FFFD if debug assertions are not enabled.
Use case: Storing non-BMP normalization expansions without wasting a byte per scalar value.
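To make the requested behavior concrete, here is a minimal standard-library-only sketch of a 3-byte type with this conversion. The name `Char24`, its methods, and the little-endian byte order are assumptions for illustration; this is not the zerovec API, and it does not implement the AsULE/ULE traits mentioned in the comments.

```rust
/// Hypothetical 3-byte storage type for a Unicode scalar value.
/// The name and the little-endian layout are assumptions for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Char24([u8; 3]);

impl Char24 {
    /// Infallible: every `char` is at most U+10FFFF, which fits in 21 bits,
    /// so the top byte of its `u32` value is always zero and can be dropped.
    pub fn from_char(c: char) -> Self {
        let [b0, b1, b2, _] = (c as u32).to_le_bytes();
        Self([b0, b1, b2])
    }

    /// Wraps raw bytes without validation (the "garbage in" side).
    pub const fn from_bytes_unchecked(bytes: [u8; 3]) -> Self {
        Self(bytes)
    }

    /// Reads the stored 24-bit value back as a `u32`.
    pub fn to_u32(self) -> u32 {
        let [b0, b1, b2] = self.0;
        u32::from_le_bytes([b0, b1, b2, 0])
    }

    /// Converts to `char`. If the stored value is not a scalar value
    /// (a surrogate, or above U+10FFFF), this panics when debug assertions
    /// are enabled and returns U+FFFD otherwise.
    pub fn to_char(self) -> char {
        match char::from_u32(self.to_u32()) {
            Some(c) => c,
            None => {
                debug_assert!(false, "value is not a Unicode scalar value");
                char::REPLACEMENT_CHARACTER
            }
        }
    }
}
```

Under these assumptions, a non-BMP character such as U+1D15E round-trips through 3 bytes instead of the 4 that a `char` or `u32` would occupy, and a value built from invalid bytes only yields U+FFFD in builds without debug assertions.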