ByteString implementation #148

fzhinkin · 2023-06-23T13:20:46Z

Implemented ByteString, an immutable wrapper around a byte sequence, and added basic support for it to the core IO library.

Missing stuff:

Module documentation

Closes #133

bytestring/src/commonMain/kotlin/UnsafeByteStringOperations.kt

whyoleg

Some minor questions, mainly regarding API parity with stdlib String/ByteArray

bytestring/build.gradle.kts

bytestring/src/commonMain/kotlin/Annotations.kt

bytestring/src/commonMain/kotlin/ByteString.kt

bytestring/src/commonMain/kotlin/ByteStringBuilder.kt

bytestring/src/jvmMain/kotlin/ByteStringJvmExt.kt

core/common/src/ByteStringExt.kt

bytestring/common/src/ByteString.kt

bytestring/common/src/ByteStringBuilder.kt

bytestring/common/src/unsafe/UnsafeByteStringOperations.kt

bytestring/jvm/src/ByteStringJvmExt.kt

fzhinkin · 2023-06-27T14:13:22Z

bytestring/common/src/ByteString.kt

+ * @param string the string to be encoded.
+ */
+public fun ByteString.Companion.fromUtf8String(string: String): ByteString {
+    return wrap(string.encodeToByteArray())


encodeToByteArray behaves differently on JVM and Native when encoding invalid code points.

It may be fine, but primitives from the core module behave similarly on all targets (and the behavior is the same as encodeToByteArray's on JVM).

So we have to either hoist UTF-8 support from the core module into a separate module and reuse it here, or change the behavior in the core module to match stdlib's behavior.

Either way, I'd rather postpone a fix and discuss it separately.

BTW, does it make sense to create a YouTrack issue on the Kotlin side mentioning this inconsistency? Specifically in the scope of the ongoing efforts to stabilise K/N runtime/stdlib APIs and behaviour?

Note: the K/JS behaviour of encodeToByteArray is the same as in K/N (not sure about K/WASM). In that case, only the K/JVM behaviour differs from the other platforms.
And now in kotlinx-io we are adopting the K/JVM behaviour.
While it could be fine, it's a little inconsistent.
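
A minimal sketch for reproducing the difference discussed above, assuming only the stdlib's String.encodeToByteArray. The helpers encodeLoneSurrogate and toHex are hypothetical names introduced for illustration and are not part of kotlinx-io. A lone surrogate has no valid UTF-8 encoding, so each platform has to substitute something; as far as I know the JVM emits a single `?` byte (3f) while K/N and K/JS emit the encoded replacement character (ef bf bd), but that is worth verifying on the targets in question.

```kotlin
// Hypothetical helper: encodes a string consisting of one unpaired high surrogate.
// The input is not a valid Unicode scalar value, so the result is platform-dependent.
fun encodeLoneSurrogate(): ByteArray = Char(0xD800).toString().encodeToByteArray()

// Hypothetical helper: renders bytes as space-separated, two-digit lowercase hex.
fun toHex(bytes: ByteArray): String =
    bytes.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') }

fun main() {
    // Run this on each target and compare the printed bytes.
    println(toHex(encodeLoneSurrogate()))
}
```

Running the same snippet on JVM, Native and JS targets should be enough to show whether the outputs diverge, which is the kind of evidence a YouTrack issue would need.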

bytestring/common/src/ByteStringBuilder.kt

bytestring/common/test/ByteStringBuilderTest.kt

core/common/src/ByteStringExt.kt

core/common/test/AbstractSourceTest.kt

bytestring/jvm/test/ByteStringJvmTest.kt

shanshin · 2023-06-28T07:43:06Z

bytestring/common/src/ByteString.kt

+    }
+
+    /**
+     * Compares a byte sequence wrapped by this byte string to a byte sequence wrapped by [other]


It is worth adding either a more detailed description or examples of exactly how the comparison works.

If I am not mistaken, the lexicographic order does not strictly define the relation of values such as (1, 2, 3) ? (1, 3)

I'll mention that the behavior is similar to String::compareTo and then add a sample comparing different byte strings in a separate PR (for #134)
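
A short sketch of how such a lexicographic comparison could work, written against plain ByteArray rather than ByteString. The function compareLexicographically is a hypothetical helper, not the PR's implementation, and comparing bytes as unsigned values is an assumption made here. The first differing position decides the order; if one sequence is a prefix of the other, the shorter one is smaller.

```kotlin
// Hypothetical helper, not part of kotlinx-io: lexicographic order on byte sequences.
fun compareLexicographically(a: ByteArray, b: ByteArray): Int {
    val common = minOf(a.size, b.size)
    for (i in 0 until common) {
        // Assumption: bytes are compared as unsigned values (0..255).
        val cmp = (a[i].toInt() and 0xFF).compareTo(b[i].toInt() and 0xFF)
        if (cmp != 0) return cmp // the first mismatch decides the result
    }
    // All shared positions are equal: the shorter sequence sorts first.
    return a.size.compareTo(b.size)
}

fun main() {
    // (1, 2, 3) vs (1, 3): position 1 differs and 2 < 3, so the first is smaller.
    println(compareLexicographically(byteArrayOf(1, 2, 3), byteArrayOf(1, 3)) < 0) // true
    // (1, 2) is a prefix of (1, 2, 3), so it is smaller.
    println(compareLexicographically(byteArrayOf(1, 2), byteArrayOf(1, 2, 3)) < 0) // true
    // String::compareTo follows the same rule.
    println("abc" < "ac") // true
}
```

Under this rule (1, 2, 3) < (1, 3), which matches how "abc" compares to "ac" for strings.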

bytestring/common/src/ByteString.kt

qwwdfsad · 2023-06-28T16:40:30Z

core/common/src/BufferExt.kt

+
+import kotlinx.io.bytestring.ByteString
+import kotlinx.io.bytestring.buildByteString
+


Missed it during the original review.
We typically prefer extensions for class Foo to be placed either in the same file (-> classFooKt) for small and compact additions, or in a file named after the plural form of the entity, Foos (Channels.kt, Strings.kt, Serializers.kt). This gives slightly nicer stack traces and is a bit more consistent.

We also use @file:JvmMultifileClass for files spread across platforms, but maybe it's a bit too redundant

Thanks for fixing it!

qwwdfsad · 2023-06-28T16:41:14Z

bytestring/build.gradle.kts

+
+        perPackageOption {
+            suppress.set(true)
+            matchingRegex.set(".*unsafe.*")


Do you think it's worth excluding it from the official documentation?

Otherwise, it looks like a Forbidden fruit - we're asking not to use it by any means and then making it visible to everyone.

bytestring/common/src/ByteString.kt

bytestring/common/src/unsafe/Annotations.kt

bytestring/common/src/unsafe/UnsafeByteStringOperations.kt

Currently, we use ByteString.decodeToString and String.encodeToByteString as names for conversion methods between String and ByteString, where non-parameterized functions convert to/from UTF-8, and JVM-specific extensions accept Charset.

At the same time, the core module uses different naming for methods reading/writing UTF-8 string and methods reading/writing strings using specific Charset:

- Source.readUtf8/Sink.writeUtf8 to work with UTF-8 strings
- Source.readString/Sink.writeString to work with strings using the given Charset on JVM.

The naming is inconsistent and it seems reasonable to unify read/write methods naming with what we have for ByteString. We can use readString/writeString w/o charset for UTF-8 strings (as these are treated as default in the library) and same-titled JVM-specific extensions accepting a Charset.

fzhinkin added this to the 0.2.0 milestone Jun 23, 2023

fzhinkin force-pushed the prototype-preview-byte-strings branch 2 times, most recently from eafb2dc to 6a535f8 Compare June 23, 2023 13:57

qwwdfsad reviewed Jun 23, 2023

bytestring/src/commonMain/kotlin/UnsafeByteStringOperations.kt (outdated)

fzhinkin requested a review from shanshin June 23, 2023 14:20

fzhinkin self-assigned this Jun 23, 2023

fzhinkin linked an issue Jun 23, 2023 that may be closed by this pull request

Implement ByteStrings #133

Closed

fzhinkin force-pushed the prototype-preview-byte-strings branch 2 times, most recently from 08c8f2f to 77f27a8 Compare June 23, 2023 14:34

fzhinkin marked this pull request as ready for review June 26, 2023 12:44

whyoleg reviewed Jun 26, 2023

fzhinkin force-pushed the prototype-preview-byte-strings branch from 4684c07 to 43a9e76 Compare June 26, 2023 15:27

qwwdfsad requested changes Jun 27, 2023

Base automatically changed from prototype-preview-trimmed-down-api to prototype-preview June 27, 2023 10:39

fzhinkin force-pushed the prototype-preview-byte-strings branch from 9b44bd9 to 30e76b0 Compare June 27, 2023 11:52

fzhinkin commented Jun 27, 2023

fzhinkin force-pushed the prototype-preview-byte-strings branch from 30e76b0 to 7702cbd Compare June 27, 2023 15:13

shanshin reviewed Jun 28, 2023

fzhinkin added 12 commits June 28, 2023 10:21

Implement basic ByteString API

d6c533f

Support ByteString in kotlinx-io

41b1962

Cleanup

40f59ed

Update API dump

2d81743

Added tests on ByteStringBuilder

d99ff1c

Improve test coverage

6d3bd05

Removed debugging code

e09f188

Change ByteString::toString to always return full string representation

09e8676

Restructure unsafe API, move utf-8 conversion to base module

9c8d009

Hide ByteString.EMPTY

4f31909

Cleanup

5a1b06f

Added module description

9eff7c8

fzhinkin added 14 commits June 28, 2023 10:22

Move Unsafe Api annotation to unsafe package

7d16db1

Implement contentEquals

9519b83

Cleanup

c05ac7a

Ended the sentence

18620a3

Bump up dependencies version

045cbe1

Add isNotEmpty extension

f2d820a

Add buildByteString functions

09af0ac

Add append(vararg Byte) extension to the builder

fc853c7

Update API dump

74fcebe

Restructure project layout

e2f6d76

Minor API changes

d83827c

Enable JS target in bytestring module

1ffce7a

Fixed formatting for byte-string related tests

a21ec1f

Updated tests, docs and made the code BCE-friendly

5e0fb14

fzhinkin force-pushed the prototype-preview-byte-strings branch from 7702cbd to 5e0fb14 Compare June 28, 2023 09:08

Rename string conversion routines

7207c08

fzhinkin requested a review from qwwdfsad June 28, 2023 09:39

Increase timeout of JS tests

b3612bd

fzhinkin changed the base branch from prototype-preview to develop June 28, 2023 15:29

qwwdfsad requested a review from shanshin June 28, 2023 16:30

qwwdfsad requested changes Jun 28, 2023

Cleaned up the code, updated docs

9006255

fzhinkin requested a review from qwwdfsad June 28, 2023 18:14

fzhinkin added 3 commits June 28, 2023 21:39

Updated API dump

2c1a785

Updated exception messages

61332f6

shanshin approved these changes Jun 29, 2023

fzhinkin merged commit 19db7f2 into develop Jun 30, 2023

fzhinkin deleted the prototype-preview-byte-strings branch June 30, 2023 08:06

