Update property enums to support CodepointTrie #1089
ULE code looks correct,
Force-pushed from 18cd006 to 825b92d.
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot
The GeneralSubcategoryULE looks about right, although you need to take in the latest changes to the trait from recent PRs.
GeneralCategory will never be extended. See https://www.unicode.org/policies/stability_policy.html
Force-pushed from 825b92d to 3c46e87.
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot
Assuming that CI doesn't have any surprises for me, this version is ready for review. I think this resolves #1071; if there's anything that was supposed to be included that isn't here, let me know. I've gone ahead and converted Script to a newtype-wrapped u16; it's a bit ugly, but manageably so, and the difference between an enum pretending to be a u16 and a u16 pretending to be an enum is small.
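A minimal sketch of what a newtype-wrapped u16 with enum-like constants could look like. This is not the code from the PR: the constant values, the derive list, and the `from_le_bytes`/`to_le_bytes` helpers are assumptions made for illustration.

```rust
/// Hypothetical sketch of a script identifier stored as a raw `u16`.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
#[repr(transparent)]
pub struct Script(pub u16);

// Associated constants stand in for enum variants, so call sites can
// still write `Script::Arabic`. The numeric values are illustrative only.
#[allow(non_upper_case_globals)]
impl Script {
    pub const Common: Script = Script(0);
    pub const Inherited: Script = Script(1);
    pub const Arabic: Script = Script(2);

    /// Every `u16` is a valid `Script`, so decoding cannot fail.
    pub fn from_le_bytes(bytes: [u8; 2]) -> Self {
        Script(u16::from_le_bytes(bytes))
    }

    /// Encode as two little-endian bytes.
    pub fn to_le_bytes(self) -> [u8; 2] {
        self.0.to_le_bytes()
    }
}
```

Unlike a real enum, a value such as `Script(999)` is still representable, so data containing a script that this sketch does not name can be read without an error path.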
Mostly looks great! Just a few minor things
}
}

unsafe impl ULE for GeneralSubcategoryULE {
issue: Please add a comment explaining why this impl satisfies the invariants of ULE.
(Yes, I know that the ZeroVec impls don't consistently do this, but I'm hoping to move towards always documenting this.)
+1 and see #1121 for an updated safety checklist for ULE
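For illustration, a safety comment of the kind requested might be shaped like the sketch below. It uses a hypothetical stand-in trait (`ByteValidated`) rather than zerovec's actual `ULE` trait, and it assumes the 30 General_Category values use contiguous discriminants 0 through 29; the real checklist is the one in #1121.

```rust
/// Error carrying the first byte that failed validation (hypothetical type).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct InvalidByte(pub u8);

/// Hypothetical stand-in for an unsafe "plain bytes" trait.
/// This is NOT zerovec's `ULE`; it only mirrors the general idea.
///
/// # Safety
/// Implementors must have alignment 1, no padding, and must accept in
/// `validate_byte_slice` only byte sequences that are valid values.
pub unsafe trait ByteValidated: Sized + Copy {
    fn validate_byte_slice(bytes: &[u8]) -> Result<(), InvalidByte>;
}

/// One-byte wrapper for a subcategory discriminant (illustrative name).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct SubcategoryByte(u8);

/// Assumption: discriminants are contiguous in 0..=29.
const MAX_DISCRIMINANT: u8 = 29;

// Safety (sketch of the reasoning a reviewer would want written down):
// 1. `SubcategoryByte` is `repr(transparent)` over `u8`, so it has
//    size 1, alignment 1, and no padding bytes.
// 2. `validate_byte_slice` rejects every byte above `MAX_DISCRIMINANT`,
//    so any slice it accepts contains only valid discriminants.
// 3. Equality is byte equality, because the only field is a `u8`.
unsafe impl ByteValidated for SubcategoryByte {
    fn validate_byte_slice(bytes: &[u8]) -> Result<(), InvalidByte> {
        match bytes.iter().find(|&&b| b > MAX_DISCRIMINANT) {
            Some(&b) => Err(InvalidByte(b)),
            None => Ok(()),
        }
    }
}
```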
LGTM modulo Manish's comment
* Make GC exhaustive
  GeneralCategory will never be extended. See https://www.unicode.org/policies/stability_policy.html
* Add GeneralSubcategory to represent raw GC data
* Implement AsULE for GeneralSubcategory
* Cargo fmt
* refactor
* Make GC repr(u8)
* Cargo fmt
* Convert Script to an identifier
* Implement AsULE for Script
* Implement validate_byte_slice instead of parse_byte_slice
* Impl From<GeneralSubcategory> for GeneralCategory
* Remove default-implemented ULE methods
* Add safety comment on GeneralSubcategoryULE impl

Co-authored-by: Iain Ireland <iain.i.ireland@gmail.com>
Putting up an initial draft for feedback. This partially resolves #1071: it handles General_Category, but not Script yet.
Points of interest:
- GeneralCategory and GeneralSubcategory are separate enums. The former is [repr(u32)] and can represent grouped categories like Letter. It's useful for queries. The latter is [repr(u8)] and represents a single specific subcategory (e.g. TitlecaseLetter). It's primarily useful for output. GeneralSubcategory is what's stored in the codepoint trie.
- Adds num_enum as a dependency to automatically derive GeneralSubcategory::try_from::<u8>. It is no_std compatible, has 2.5M downloads all time, and has minimal additional dependencies. The alternative would be writing a couple of large match statements by hand, or writing our own proc_macro to do so.

@echeran @sffc @Manishearth
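To make the trade-off concrete, here is a rough sketch of the hand-written alternative to a derived `try_from`. It covers only four subcategories, and the discriminants, the bitmask layout of the grouped enum, the `CasedLetter` group, the `contains` helper, and the `UnknownSubcategory` error type are all assumptions for illustration, not the PR's actual definitions.

```rust
use core::convert::TryFrom;

/// A single specific subcategory, one byte wide (subset shown).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum GeneralSubcategory {
    Unassigned = 0,
    UppercaseLetter = 1,
    LowercaseLetter = 2,
    TitlecaseLetter = 3,
}

/// Query-side enum: one bit per subcategory, plus grouped masks.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum GeneralCategory {
    Unassigned = 1 << 0,
    UppercaseLetter = 1 << 1,
    LowercaseLetter = 1 << 2,
    TitlecaseLetter = 1 << 3,
    /// Illustrative group covering the three cased letter subcategories.
    CasedLetter = (1 << 1) | (1 << 2) | (1 << 3),
}

/// Error for a byte that names no subcategory (hypothetical type).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct UnknownSubcategory(pub u8);

// The "large match statement by hand" that a derive would generate.
impl TryFrom<u8> for GeneralSubcategory {
    type Error = UnknownSubcategory;
    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0 => Ok(GeneralSubcategory::Unassigned),
            1 => Ok(GeneralSubcategory::UppercaseLetter),
            2 => Ok(GeneralSubcategory::LowercaseLetter),
            3 => Ok(GeneralSubcategory::TitlecaseLetter),
            other => Err(UnknownSubcategory(other)),
        }
    }
}

impl From<GeneralSubcategory> for GeneralCategory {
    fn from(sub: GeneralSubcategory) -> Self {
        match sub {
            GeneralSubcategory::Unassigned => GeneralCategory::Unassigned,
            GeneralSubcategory::UppercaseLetter => GeneralCategory::UppercaseLetter,
            GeneralSubcategory::LowercaseLetter => GeneralCategory::LowercaseLetter,
            GeneralSubcategory::TitlecaseLetter => GeneralCategory::TitlecaseLetter,
        }
    }
}

impl GeneralCategory {
    /// True if `sub` falls within this (possibly grouped) category.
    pub fn contains(self, sub: GeneralSubcategory) -> bool {
        (self as u32) & (GeneralCategory::from(sub) as u32) != 0
    }
}
```

With the full set of subcategories, each `match` grows to one arm per variant, which is the maintenance cost the derive avoids.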