Restrict Script (and ScriptWithExt) to specific range #1404
Comments
FYI for others,
I agree that the proposal above could address the problem, but I wonder what the utility of this would be -- in other words, how many other "pseudo-property" types are we creating to help support our data structure optimizations by representing bit-fiddly versions of real data? Could such a new bound be applied in other scenarios? -- I can't think of any, because newtypes are defined to not have a known upper bound (beyond the max value of the wrapper type), so that [enumerated, numeric, etc.] values can still be defined for the property in the future that we can't yet know about. So if we only have
My understanding is that more script codes can be introduced in the future, right? Or are you considering introducing bounds at the 1024 level, which is (I guess?) the absolute max available? I'm kinda ambivalent: I like the idea of having stricter validation here, but I don't want to make things harder to use. I'm not really sure if this makes things harder to use, though.
@markusicu says there is an intrinsic maximum of around 500 script codes that could possibly be encoded, and that 1024 is more than we should ever need.
2 bits for the metadata plus 10 bits for the script code is only 12 out of 16 bits, so we have 4 bits of leeway here. Given that our current margin of safety is already 2x the intrinsic maximum, I don't think there's a huge return on investment for adding type-system restrictions here. At most, I would consider moving the 2 bits of metadata up to the highest bits and officially reserving the 4 bits in between, so that they can be used as either script code or metadata if the future surprises us.
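For anyone following the bit arithmetic, here is a minimal sketch of the 10 + 2 + 4 split described above. The constant and function names are hypothetical, and placing the metadata in bits 10-11 is an assumption for illustration, not necessarily what the actual `ScriptWithExt` code does.

```rust
// Hypothetical layout sketch; names and exact bit positions are assumptions.
const SCRIPT_BITS: u32 = 10;
const SCRIPT_MASK: u16 = (1 << SCRIPT_BITS) - 1; // 0x03FF: script codes 0..=1023
const META_MASK: u16 = 0b11 << SCRIPT_BITS; // 0x0C00: 2 metadata bits
const SPARE_MASK: u16 = !(SCRIPT_MASK | META_MASK); // 0xF000: 4 unused bits

/// Packs a 2-bit metadata value and a 10-bit script code into a u16.
/// Returns None if either input does not fit in its field.
fn pack(meta: u8, script: u16) -> Option<u16> {
    if meta > 0b11 || script > SCRIPT_MASK {
        return None;
    }
    Some(((meta as u16) << SCRIPT_BITS) | script)
}

/// Splits a packed value back into (metadata, script code).
/// The 4 spare high bits are ignored.
fn unpack(value: u16) -> (u8, u16) {
    (((value & META_MASK) >> SCRIPT_BITS) as u8, value & SCRIPT_MASK)
}
```

With this layout, any script code of 1024 or above would spill into the metadata bits, which is why the `< 1024` assumption matters.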
@echeran deliberately used only the lower 12 bits of the u16 value, so that if in the future we add a 12-bit value width to the code point trie, the ScriptWithExt values need not be re-encoded. So unless there is a specific benefit, I wouldn't move the special bits up. A CPT with 12-bit value width would be relatively easy, but it would trade size for performance, because reading a data value would require reading and combining two bytes and applying some bit masking/shifting, rather than just reading a u8/u16/u32.
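To make the size vs. performance trade-off concrete, here is a rough sketch of what reading a 12-bit value could look like next to a plain u16 read. This is not the code point trie implementation; the function names and the packing order (two values per three bytes, low bits first) are assumptions chosen only for illustration.

```rust
/// Reads the `index`-th 12-bit value from a tightly packed byte slice.
/// Assumed packing: two values per three bytes, lowest bits first.
fn read_u12(data: &[u8], index: usize) -> Option<u16> {
    let bit_offset = index.checked_mul(12)?;
    let byte = bit_offset / 8;
    // Two byte reads are always needed, since a value never fits in one byte.
    let lo = *data.get(byte)? as u16;
    let hi = *data.get(byte + 1)? as u16;
    let pair = lo | (hi << 8);
    // Even indices start on a byte boundary (shift 0), odd ones mid-byte (shift 4).
    let shift = (bit_offset % 8) as u16;
    Some((pair >> shift) & 0x0FFF)
}

/// For comparison: a plain little-endian u16 read, with no masking or shifting.
fn read_u16(data: &[u8], index: usize) -> Option<u16> {
    let start = index.checked_mul(2)?;
    let bytes = data.get(start..start.checked_add(2)?)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}
```

The 12-bit reader saves a quarter of the storage but adds an offset computation, a shift, and a mask on every lookup.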
Currently both `Script` and `ScriptWithExt` use `type ULE = PlainOldULE<2>`. However, `ScriptWithExt` assumes that `Script` is < 1024. We should consider strengthening the requirements on `Script` and `ScriptWithExt`.

One way to do this would be to:
- Make the `u16` field private, and ensure it is always in range when constructing
- Add a `BoundedBytesULE<const N: usize, const L: usize, const U: usize>` that bounds a ULE integer to a specific inclusive range between L and U
- Make `Script` and `ScriptWithExt` use that new common ULE type

Advantages of making this change:
- `Script` and `ScriptWithExt`

Disadvantages:
Needs feedback from:
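
As a rough illustration of the `BoundedBytesULE` idea from the proposal, here is a standalone sketch using const generics. It does not implement the real zerovec `ULE` trait; the method names, the little-endian interpretation, and the `ScriptULE` alias are all hypothetical.

```rust
/// Hypothetical sketch: N bytes whose little-endian integer value must lie
/// in the inclusive range L..=U. The field is private so that values can
/// only be created through the checked constructor.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct BoundedBytesULE<const N: usize, const L: usize, const U: usize>([u8; N]);

// Only the 2-byte case is sketched here, matching the u16-backed types.
impl<const L: usize, const U: usize> BoundedBytesULE<2, L, U> {
    /// Validates the range at construction time.
    pub fn try_from_bytes(bytes: [u8; 2]) -> Option<Self> {
        let value = u16::from_le_bytes(bytes) as usize;
        if (L..=U).contains(&value) {
            Some(Self(bytes))
        } else {
            None
        }
    }

    /// Returns the stored value, which is known to be within L..=U.
    pub fn get(self) -> u16 {
        u16::from_le_bytes(self.0)
    }
}

/// Illustrative alias: a 2-byte value restricted to 10 bits (0..=1023).
pub type ScriptULE = BoundedBytesULE<2, 0, 1023>;
```

A real version would put this check wherever the ULE byte validation happens, so that out-of-range data is rejected at deserialization time rather than at each use.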