Move the non-IO parts of ceylon.io into a cross-platform module #449

ASzc · 2015-11-02T10:27:57Z

The data representation (Buffers) and text/data conversion (Charsets, Base64) features of ceylon.io could be adapted without much effort (most of the work is adding a pure Ceylon ByteBuffer) to use on the JS backend.

However, one of the limitations of the native annotation in a cross-platform module is that all shared things must implement both backends. As the actual Input/Output parts (FileDescriptors, etc.) of ceylon.io can't be made to work in browsers (though maybe they could work on node.js?), I propose splitting the module.

The new cross-platform module might be called ceylon.binary (that being the common theme), but maybe there is a better name. I don't think many modifications would be required to the remaining ceylon.io code to allow it to import/use ceylon.binary. One thing I did spot was some things relying on underlyingBuffer instead of implementation from the ByteBuffer interface, but that's a quick fix.

I'm willing to work on this myself, if people agree it should be done.

The text was updated successfully, but these errors were encountered:

tombentley · 2015-11-02T10:32:27Z

How about ceylon.buffer? I know that doesn't quite cover the encoding bits, but it seems a little more obvious to me than ceylon.binary.

quintesse · 2015-11-02T10:41:08Z

Btw, there are plans to allow shared native declarations that aren't cross-platform (see ceylon/ceylon-spec#1459) , although personally I'd only use it if the amount of non-crossplatform code would be small in comparison to the rest. Otherwise splitting it off into a separate module might not be a bad idea.

ASzc · 2015-11-02T19:09:48Z

@tombentley yes ceylon.buffer might be better, as "binary" doesn't have an obvious context to it. Buffers are (at least) one side to all operations done in the module (the other side being text/String within the module, and FileDescriptor in ceylon.io).

@quintesse Good, I see that making native more useful. Although my initial approach tried to keep ceylon.io together, I think a new module is clearer, as you say.

I will wait a few more days for further comment, then start working on the PR if there are no objections.

ASzc · 2015-11-11T23:33:29Z

Started working on this here. Not quite ready for a PR yet, one of the tests fails on JavaScript, not sure why.

ASzc · 2015-11-12T20:37:26Z

This may be a bug in the utf16 Charset, rather than the new ByteBuffer implementation. testUTF16Encoder produces:

ceylon.test::AssertionComparisonError "Encoding of z水𐀀𝄞􏿽
expected <[0, 122, 108, 52, 216, 0, 220, 0, 216, 52, 221, 30, 219, 255, 223, 253]>
 but was <[0, 122, 108, 52, 216, 0, 220, 0, 216, 52, 221, 30, 221, 30]>"

This doesn't look like something a ByteBuffer bug would case (and all its own tests pass). I see most of the Charset implementations do some bit manipulation. I know from Svalinn that this doesn't work reliably cross-backend, so that might be the source of the problem.

ASzc · 2015-11-12T21:31:30Z

Similar problem with utf8 and that test string:

ceylon.test::AssertionComparisonError "Encoding of z水𐀀𝄞􏿽
expected <[122, 230, 176, 180, 240, 144, 128, 128, 240, 157, 132, 158, 244, 143, 191, 189]>
 but was <[122, 230, 176, 180, 240, 144, 128, 128, 240, 157, 132, 158, 237, 180, 158]>"

Expected is Python's encoding, generated with:

", ".join("#"+hex(x)[2:] for x in "z水𐀀𝄞􏿽".encode("utf-8"))

The bit manipulations I see in both seem fairly safe; they don't appear to rely on a fixed integerAddressableSize. Any ideas @FroMage?

FroMage · 2015-11-13T09:54:36Z

Well it has to be a bit issue. You should be able to track it down using a small test (that one char that causes issues) and print statements. (or run under the browser's JS debugger).

ASzc · 2015-11-13T20:55:44Z

I added print statements into CharacterBuffer.get and on encode's codepoint value. They're different between JVM and JS, so it's different before it gets to the bit manipulation. Will look deeper at CharacterBuffer...

Printing the String literal and printing CharacterBuffer.array after copyTo:

JVM:

𐀀􏿽􏿽
{ 𐀀, 􏿽, 􏿽 }

JS:

𐀀􏿽􏿽
{ 𐀀, 􏿽, � }

If I swap out string.copyTo(array); for:

for (i->c in string.indexed) {
    array.set(i, c);
}

Then JS returns (the same as JVM):

𐀀􏿽􏿽
{ 𐀀, 􏿽, 􏿽 }

With that change in place, all ceylon.buffer tests pass!

So, it looks like we've discovered a bug in whatever the native JS implementation is for copyTo?

ASzc · 2015-11-14T10:09:19Z

Pull request filed (#457) with the workaround in place.

FroMage added FEATURE m-io labels Nov 2, 2015

FroMage added this to the 1.3 milestone Nov 2, 2015

ASzc mentioned this issue Nov 13, 2015

JS implementation for String.copyTo doesn't copy some esoteric Characters correctly ceylon/ceylon.language#794

Open

CeylonMigrationBot mentioned this issue Nov 14, 2015

JS implementation for String.copyTo doesn't copy some esoteric Characters correctly eclipse-archived/ceylon#5672

Closed

ASzc mentioned this issue Nov 14, 2015

Move the non-IO parts of ceylon.io into ceylon.buffer #457

Merged

ASzc closed this as completed in #457 Feb 13, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move the non-IO parts of ceylon.io into a cross-platform module #449

Move the non-IO parts of ceylon.io into a cross-platform module #449

ASzc commented Nov 2, 2015

tombentley commented Nov 2, 2015

quintesse commented Nov 2, 2015

ASzc commented Nov 2, 2015

ASzc commented Nov 11, 2015

ASzc commented Nov 12, 2015

ASzc commented Nov 12, 2015

FroMage commented Nov 13, 2015

ASzc commented Nov 13, 2015

ASzc commented Nov 14, 2015

Move the non-IO parts of ceylon.io into a cross-platform module #449

Move the non-IO parts of ceylon.io into a cross-platform module #449

Comments

ASzc commented Nov 2, 2015

tombentley commented Nov 2, 2015

quintesse commented Nov 2, 2015

ASzc commented Nov 2, 2015

ASzc commented Nov 11, 2015

ASzc commented Nov 12, 2015

ASzc commented Nov 12, 2015

FroMage commented Nov 13, 2015

ASzc commented Nov 13, 2015

ASzc commented Nov 14, 2015