Add nicer API based on a CodePoint class #18
Honestly I would ship this as the regular and only library. It's the superior API!
kotlin-codepoints-deluxe/src/commonMain/kotlin/CodePointSequence.kt
/**
 * `true` if this code point is a surrogate code unit.
 */
val isSurrogate: Boolean
    get() = !isSupplementary && value.toChar().isSurrogate()

/**
 * `true` if this code point is a high surrogate code unit.
 */
val isHighSurrogate: Boolean
    get() = !isSupplementary && value.toChar().isHighSurrogate()

/**
 * `true` if this code point is a low surrogate code unit.
 */
val isLowSurrogate: Boolean
    get() = !isSupplementary && value.toChar().isLowSurrogate()
What would these be used for? They seem to go against my intuition of what a code point is since I thought they were exclusively properties of characters.
I think of the value of a code point as the index into the Unicode data table. Surrogate characters are part of that table, too.
Most of the time we don't care about surrogate pairs and want to get the code point encoded by the two surrogate code points. But occasionally you might be interested in dealing with surrogate characters on their own. Maybe we're missing a `combineWith()` method that returns the value of `CodePoints.toCodePoint(high, low)` when called like `high.combineWith(low)`.
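To make the idea concrete, here is a rough sketch of what such a `combineWith()` could look like. The `CodePoint` class below is a simplified, hypothetical stand-in rather than the class from this PR, and the arithmetic is the standard UTF-16 surrogate pair decoding written out by hand instead of delegating to `CodePoints.toCodePoint(high, low)`.

```kotlin
// Hypothetical, simplified stand-in for the PR's CodePoint class.
class CodePoint(val value: Int) {
    init {
        require(value in 0..0x10FFFF) { "Not a Unicode code point: $value" }
    }

    // Surrogates occupy U+D800..U+DFFF, so range checks are enough here.
    val isHighSurrogate: Boolean get() = value in 0xD800..0xDBFF
    val isLowSurrogate: Boolean get() = value in 0xDC00..0xDFFF

    // Hypothetical method: decodes a high/low surrogate pair into the
    // supplementary code point it encodes (standard UTF-16 decoding).
    fun combineWith(low: CodePoint): CodePoint {
        require(isHighSurrogate) { "Receiver must be a high surrogate" }
        require(low.isLowSurrogate) { "Argument must be a low surrogate" }
        val combined = 0x10000 + ((value - 0xD800) shl 10) + (low.value - 0xDC00)
        return CodePoint(combined)
    }
}

fun main() {
    val high = CodePoint(0xD83D)
    val low = CodePoint(0xDE00)
    // U+D83D and U+DE00 together encode U+1F600.
    println(high.combineWith(low).value.toString(16)) // prints "1f600"
}
```

Rejecting invalid receivers and arguments with `require` is just one option; returning `null` for a non-matching pair would also be reasonable.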
The Unicode standard contains this:
Not all assigned code points represent abstract characters; only Graphic, Format, Control and Private-use do. Surrogates and Noncharacters are assigned code points but are not assigned to abstract characters.
Good to know!
Uses the basic functionality in `kotlin-codepoints` to provide a nicer API to work with Unicode code points.
This PR adds `kotlin-codepoints-deluxe` (I'm open to suggestions for a better name), a separate library built on top of `kotlin-codepoints`.