Add nicer API based on a `CodePoint` class #18

cketti · 2023-01-27T22:32:58Z

This PR adds kotlin-codepoints-deluxe (I'm open to suggestions for a better name), a separate library built on top of kotlin-codepoints.

JakeWharton

Honestly I would ship this as the regular and only library. It's the superior API!

kotlin-codepoints-deluxe/src/commonMain/kotlin/CodePoint.kt

kotlin-codepoints-deluxe/src/commonMain/kotlin/StringExtensions.kt

kotlin-codepoints-deluxe/src/commonMain/kotlin/CodePointSequence.kt

JakeWharton · 2023-01-30T14:14:39Z

kotlin-codepoints-deluxe/src/commonMain/kotlin/CodePoint.kt

+    /**
+     * `true` if this code point is a surrogate code unit.
+     */
+    val isSurrogate: Boolean
+        get() = !isSupplementary && value.toChar().isSurrogate()
+
+    /**
+     * `true` if this code point is a high surrogate code unit.
+     */
+    val isHighSurrogate: Boolean
+        get() = !isSupplementary && value.toChar().isHighSurrogate()
+
+    /**
+     * `true` if this code point is a low surrogate code unit.
+     */
+    val isLowSurrogate: Boolean
+        get() = !isSupplementary && value.toChar().isLowSurrogate()


What would these be used for? They seem to go against my intuition of what a code point is since I thought they were exclusively properties of characters.

cketti · 2023-01-30

I think of the value of a code point as the index into the Unicode data table. Surrogates are part of that table, too.

Most of the time we don't care about individual surrogates and just want the code point encoded by a surrogate pair. But occasionally you might be interested in dealing with surrogates on their own. Maybe we're missing a `combineWith()` method that returns the value of `CodePoints.toCodePoint(high, low)` when called like `high.combineWith(low)`.
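A minimal sketch of what such a `combineWith()` might look like. The type `SketchCodePoint` is a hypothetical stand-in, not the `CodePoint` class from this PR, and the arithmetic is the standard UTF-16 surrogate pair decoding rather than a call into `kotlin-codepoints`:

```kotlin
// Hypothetical stand-in for the library's CodePoint type (assumption, not the PR's class).
data class SketchCodePoint(val value: Int) {
    // Range checks on the raw value; U+D800..U+DBFF are high, U+DC00..U+DFFF are low surrogates.
    val isHighSurrogate: Boolean get() = value in 0xD800..0xDBFF
    val isLowSurrogate: Boolean get() = value in 0xDC00..0xDFFF

    // Hypothetical combineWith(): decodes a high/low surrogate pair
    // into the supplementary code point it encodes in UTF-16.
    fun combineWith(low: SketchCodePoint): SketchCodePoint {
        require(isHighSurrogate) { "receiver must be a high surrogate" }
        require(low.isLowSurrogate) { "argument must be a low surrogate" }
        val combined = 0x10000 + ((value - 0xD800) shl 10) + (low.value - 0xDC00)
        return SketchCodePoint(combined)
    }
}
```

Under these assumptions, `SketchCodePoint(0xD83D).combineWith(SketchCodePoint(0xDE00))` yields a value of `0x1F600`.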

cketti · 2023-01-30

The Unicode standard contains this:

> Not all assigned code points represent abstract characters; only Graphic, Format, Control and Private-use do. Surrogates and Noncharacters are assigned code points but are not assigned to abstract characters.

JakeWharton · 2023-01-30

Good to know!
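The exchange above treats surrogates as ordinary code points in the range U+D800..U+DFFF. The sketch below shows one likely reason the getters in the diff check `!isSupplementary` first: `Int.toChar()` keeps only the low 16 bits, so a supplementary value such as U+1D800 would otherwise be misread as U+D800. The function names are hypothetical and operate on raw `Int` values; the definition of `isSupplementaryValue` is an assumption about what the library's `isSupplementary` means.

```kotlin
// Hypothetical helpers on raw Int code point values; names are illustrative only.

// Assumed meaning of "supplementary": any code point above the Basic Multilingual Plane.
fun isSupplementaryValue(codePoint: Int): Boolean = codePoint in 0x10000..0x10FFFF

// Without the guard: Int.toChar() truncates to 16 bits, so U+1D800 looks like U+D800.
fun isSurrogateUnguarded(codePoint: Int): Boolean = codePoint.toChar().isSurrogate()

// With the guard, mirroring the shape of the getters in the diff.
fun isSurrogateGuarded(codePoint: Int): Boolean =
    !isSupplementaryValue(codePoint) && codePoint.toChar().isSurrogate()
```

With these definitions, `isSurrogateUnguarded(0x1D800)` is `true` while `isSurrogateGuarded(0x1D800)` is `false`; both return `true` for `0xD800`.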

Uses the basic functionality in `kotlin-codepoints` to provide a nicer API to work with Unicode code points.

Create kotlin-codepoints subproject

39abf06

JakeWharton reviewed Jan 28, 2023

kotlin-codepoints-deluxe/src/commonMain/kotlin/CodePoint.kt (outdated)

kotlin-codepoints-deluxe/src/commonMain/kotlin/StringExtensions.kt

cketti force-pushed the value_class branch from 51c256b to 12b984d Compare January 29, 2023 14:01

JakeWharton reviewed Jan 30, 2023


Add kotlin-codepoints-deluxe

1aaf04a


cketti force-pushed the value_class branch from 12b984d to 1aaf04a Compare January 30, 2023 15:23

cketti merged commit 1b30575 into main Jan 31, 2023

cketti deleted the value_class branch January 31, 2023 19:00
