-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quick fix cleanup #5
Quick fix cleanup #5
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I've left a couple of comments for you. I don't see any major issues here. Thank you for cleaning this up 😃
if u32 <= 0x10FFFF then | ||
Ok (fromU32Unchecked u32) | ||
else | ||
Err InvalidCodePoint | ||
|
||
## Returns false if this is either a [high-surrogate code point](http://www.unicode.org/glossary/#high_surrogate_code_point) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to remove these links? I think we should keep these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking external links should remain unique for code maintainability. Internal links can be repeated for internal navigation, and they could ultimately lead back to the original external link.
I'm open to putting them back if you find it necessary.
u32 >= 0xDC00 && u32 <= 0xDFFF | ||
|
||
## Zig docs: bytes the UTF-8 representation would require | ||
## for the given codepoint. | ||
utf8Len : CodePoint -> Result U64 [InvalidCodePoint] | ||
utf8Len : CodePoint -> Result U8 [InvalidCodePoint] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used U64
here as it was recently changed from Nat
. I don't really know myself, but just wondering why you changed it to U8
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was one of my doubts and should've asked before. But utf8Len
and countUtf8Bytes
return values in the range of 1 to 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets give it a try. It's always easy to cast if needed
@@ -18,36 +18,34 @@ interface Scalar | |||
] | |||
|
|||
## A [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value) - that is, | |||
## any [code point](./CodePoint#CodePoint) except for [high-surrogate](http://www.unicode.org/glossary/#high_surrogate_code_point) | |||
## and [low-surrogate](http://www.unicode.org/glossary/#low_surrogate_code_point) code points. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think it is better to link to the source www.unicode.org
here so that future contributors can easily find the relevant details in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok 👍
Thanks @lukewilliamboswell now I have verified commits
I'm still learning about unicode so I will investigate more to contribute
For now this are some quick fixes