Unaligned bit arrays on the JavaScript target #3946

Draft
wants to merge 1 commit into
base: main

richard-viney
Contributor

@richard-viney richard-viney commented Dec 3, 2024

Summary 📘

This PR adds support for unaligned bit arrays on the JavaScript target.

In expressions:

  • Arbitrary sized integer segments:
    <<1:4>>
    <<12:9-little>>
    <<12:29-big>>
    <<1234:100-little>>
  • Arbitrary sized bits segments:
    <<<<0xABCD:15>>:bits-10>>
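As a rough illustration of what an unaligned integer segment means at the byte level, the sketch below packs unsigned big-endian segments most significant bit first. `packBigEndianSegments` is a hypothetical helper, not the code the compiler generates, and it zeroes the trailing unused bits purely for readability. It does not attempt little-endian or signed segments.

```javascript
// Hypothetical helper, not part of the Gleam prelude: packs unsigned
// big-endian integer segments into bytes, most significant bit first.
function packBigEndianSegments(segments) {
  const bits = [];
  for (const { value, size } of segments) {
    const v = BigInt(value);
    // Collect exactly `size` bits of the value, highest bit first.
    for (let i = size - 1; i >= 0; i--) {
      bits.push(Number((v >> BigInt(i)) & 1n));
    }
  }

  // Bit i of the result lives in byte i >> 3, counted from the high bit.
  const bytes = new Uint8Array(Math.ceil(bits.length / 8));
  bits.forEach((bit, i) => {
    bytes[i >> 3] |= bit << (7 - (i & 7));
  });

  return { bytes, bitSize: bits.length };
}

// Roughly what <<1:4>> holds: 4 bits in the high half of a single byte.
const nibble = packBigEndianSegments([{ value: 1, size: 4 }]);
console.log(nibble.bitSize, Array.from(nibble.bytes));
```

The result carries an explicit `bitSize` alongside the bytes because the byte count alone can no longer describe the length once segments stop being multiples of 8 bits.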

In patterns:

  • Arbitrary sized int segments:
    let assert <<_:7, i:19-little-signed>> = <<0xABCDEF12:26>>
  • Sized and unsized bits segments:
    let assert <<_:7, a:bits-3, b:size(14)-bits, c:bits>> = <<0xABCDEF:24, 0x1234:16>>

There is a warning if the above features are used when gleam.toml specifies a version < v1.8.0.


Implementation Details 🛠️

  • The BitArray class in the prelude now has bitSize, byteSize, and bitOffset fields.
    • The value of any unused high bits in the first byte, and any unused low bits in the final byte, are undefined.
    • Public API for use in FFI code is now: bitSize, byteSize, bitOffset, rawBuffer, byteAt(index).
    • Deprecated APIs that are used by existing FFI code: get buffer(), get length(). Using these emits a deprecation warning at runtime, and throws an exception if the bit array isn't fully aligned (i.e. bitOffset == 0 and bitSize is a multiple of 8).
  • JSDoc annotations have been added to allow type-checking by adding // @ts-check to the top of the file.
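To make the role of the three fields concrete, here is a minimal sketch of a class that tracks `bitSize`, `byteSize`, and `bitOffset` over a raw buffer. `UnalignedBitsSketch` is a hypothetical name and this is not the prelude's `BitArray`; in particular, the way `byteAt` stitches two raw bytes together is an assumption about how an offset read could work, not a description of the actual implementation.

```javascript
// Hypothetical sketch, not the real prelude class.
class UnalignedBitsSketch {
  /**
   * @param {Uint8Array} rawBuffer underlying bytes
   * @param {number} bitSize number of meaningful bits
   * @param {number} bitOffset bits to skip in the first byte (0 to 7)
   */
  constructor(rawBuffer, bitSize, bitOffset = 0) {
    this.rawBuffer = rawBuffer;
    this.bitSize = bitSize;
    this.bitOffset = bitOffset;
  }

  /** Number of bytes needed to hold bitSize bits. */
  get byteSize() {
    return Math.ceil(this.bitSize / 8);
  }

  /** Logical byte at `index`, or undefined when out of range. */
  byteAt(index) {
    if (index < 0 || index >= this.byteSize) return undefined;
    const a = this.rawBuffer[index];
    if (this.bitOffset === 0) return a;
    // Combine the low bits of one raw byte with the high bits of the next.
    const b = this.rawBuffer[index + 1] ?? 0;
    return ((a << this.bitOffset) | (b >> (8 - this.bitOffset))) & 0xff;
  }
}

const bits = new UnalignedBitsSketch(Uint8Array.of(0xab, 0xcd, 0xef), 16, 4);
console.log(bits.byteSize, bits.byteAt(0).toString(16), bits.byteAt(1).toString(16));
```

In this sketch the low bits of a final partial byte are whatever the raw buffer happens to contain, which mirrors the note above that unused bits are undefined.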

Implications for @external JavaScript code 🌍

  • Existing JavaScript FFI code that operates on bit arrays needs to be updated.
  • Until this is done, such code will:
    • Emit deprecation warnings at runtime due to use of the deprecated BitArray.length and BitArray.buffer APIs.
    • Throw a runtime exception if using the deprecated APIs on a bit array that isn't fully byte-aligned (which would be pretty much certain to give the wrong result).
  • No existing code breaks because unaligned bit arrays on JavaScript weren't previously possible.
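One way FFI code could adapt is to check alignment explicitly before touching the bytes, instead of relying on the deprecated accessors. The sketch below assumes a bit array object exposing the `bitSize`, `bitOffset`, and `rawBuffer` fields described earlier; `requireWholeBytes` is a hypothetical helper name and the exact object shape in the released prelude may differ.

```javascript
// Hypothetical FFI guard: returns the bytes of a bit array only when it is
// fully byte-aligned, and fails loudly otherwise.
function requireWholeBytes(bitArray) {
  const aligned = bitArray.bitOffset === 0 && bitArray.bitSize % 8 === 0;
  if (!aligned) {
    throw new Error(
      `expected a byte-aligned bit array, got ${bitArray.bitSize} bits ` +
        `at bit offset ${bitArray.bitOffset}`,
    );
  }
  // A view over the meaningful bytes; no copy is made.
  return bitArray.rawBuffer.subarray(0, bitArray.bitSize / 8);
}

// Plain objects stand in for real bit arrays in this sketch.
const aligned = { rawBuffer: Uint8Array.of(1, 2, 3), bitSize: 24, bitOffset: 0 };
console.log(requireWholeBytes(aligned).length);
```

Failing early like this keeps byte-oriented FFI functions from silently producing wrong results when handed an unaligned value.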

Implications for gleam/stdlib 🤝

  • I have the updates for gleam/stdlib ready, mostly affecting gleam/bit_array. It can only be merged once this PR goes in as its tests won't run on stable Gleam. It may be necessary to run the new stdlib tests on nightly for a short period, with them segregated into their own file so they can be included/excluded depending on the active Gleam version. I'll sort that out once this PR makes it through review.
  • Future stdlib versions that support unaligned bit arrays on JavaScript will work fine on Gleam versions older than 1.8.0; there are no compatibility concerns there.

Testing 🧪

There's certainly some complexity and tricky bitwise operations here, mostly in the JavaScript prelude. The following has been done to ensure correctness:

  • Many new tests added to language_tests.gleam, and test/javascript_prelude.
  • Every path and branch through the code that performs slicing, concatenation, and conversion to ints/floats is covered by at least one test.
  • Extensive fuzzing of these operations validated millions of combinations of bit array contents, segment sizes, offsets, endianness, signedness, etc. on JavaScript against the results on the Erlang target.
  • Issues found by the fuzz testing were fixed and added to the language tests and prelude tests.
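The general shape of this kind of differential fuzzing can be sketched as follows. This is not the harness used for the PR, which compared JavaScript against the Erlang target; here a byte-shifting slice (`sliceFast`) is checked against an obviously correct bit-by-bit reference (`sliceReference`). All names, the size limits, and the simple seeded generator are assumptions made for illustration.

```javascript
// Reference: copy one bit at a time, most significant bit first.
function sliceReference(bytes, start, length) {
  const out = new Uint8Array(Math.ceil(length / 8));
  for (let i = 0; i < length; i++) {
    const src = start + i;
    const bit = (bytes[src >> 3] >> (7 - (src & 7))) & 1;
    out[i >> 3] |= bit << (7 - (i & 7));
  }
  return out;
}

// Faster variant: shift whole bytes, then clear the trailing unused bits.
function sliceFast(bytes, start, length) {
  const out = new Uint8Array(Math.ceil(length / 8));
  const byteStart = start >> 3;
  const shift = start & 7;
  for (let i = 0; i < out.length; i++) {
    const a = bytes[byteStart + i];
    const b = bytes[byteStart + i + 1] ?? 0;
    out[i] = ((a << shift) | (b >> (8 - shift))) & 0xff;
  }
  const trailing = length & 7;
  if (trailing !== 0) out[out.length - 1] &= 0xff << (8 - trailing);
  return out;
}

// Small seeded generator so that failures are reproducible.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Returns the first failing case, or null if every case agrees.
function fuzzSlices(iterations, seed = 1) {
  const rand = makeRng(seed);
  const randInt = (n) => Math.floor(rand() * n); // 0 to n - 1
  for (let n = 0; n < iterations; n++) {
    const bytes = Uint8Array.from({ length: 1 + randInt(8) }, () => randInt(256));
    const total = bytes.length * 8;
    const start = randInt(total + 1);
    const length = randInt(total - start + 1);
    const expected = sliceReference(bytes, start, length);
    const actual = sliceFast(bytes, start, length);
    if (expected.join() !== actual.join()) return { bytes, start, length };
  }
  return null;
}

console.log(fuzzSlices(10000));
```

Returning the failing input rather than just a boolean makes it straightforward to turn each discovered bug into a fixed regression test, as described in the last bullet above.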

✨✨✨

@richard-viney richard-viney marked this pull request as ready for review December 3, 2024 12:22
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch 3 times, most recently from 2496b81 to bb69dd9 Compare December 5, 2024 01:45
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch 3 times, most recently from b347126 to 9aa5446 Compare December 13, 2024 09:32
Member

@lpil lpil left a comment


Wow, what a fantastic bit of work! Thank you!

I've not digested these changes properly yet, but I have some initial questions from my first review.

There's a lot of public functions in the API of the bit array class now, but generally the aim is to have none. What is the motivation for having these?

Similar for the deprecated functions, we can remove them.

There's quite a lot of code in the class. Could we move them to free-standing functions and have the generated code only use them if absolutely necessary? That would help with JavaScript bundlers performing dead code elimination.

Some existing tests have been removed or changed, why is this? New features shouldn't alter existing tests, it makes it harder to review and to prevent regressions.

Thanks again!

Review comment on compiler-core/src/javascript/expression.rs (outdated, resolved)
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch 6 times, most recently from 544c327 to 0f6fbdb Compare December 23, 2024 00:38
@richard-viney
Contributor Author

richard-viney commented Dec 23, 2024

Thanks for taking a look!

> There's a lot of public functions in the API of the bit array class now, but generally the aim is to have none. What is the motivation for having these?

These have been reduced down to the following: get rawBuffer(), get bitSize(), get byteSize(), get isWholeBytes(), and byteAt().

byteAt() could potentially be moved to a free function too, but was pre-existing and I've seen it used in JS FFI code, so it's still there for now and I haven't deprecated it.

> Similar for the deprecated functions, we can remove them.

I've removed all except the BitArray.length and BitArray.buffer accessors. Removing these would break most/all JS FFI code that operates on BitArrays, so they currently emit a deprecation warning at runtime if they're used.

> There's quite a lot of code in the class. Could we move them to free-standing functions and have the generated code only use them if absolutely necessary? That would help with JavaScript bundlers performing dead code elimination.

Done.

> Some existing tests have been removed or changed, why is this? New features shouldn't alter existing tests, it makes it harder to review and to prevent regressions.

The diff on the tests looked a bit more complex in places than it actually was so I've moved some things around to make it easier to digest. Some existing tests did need to be tweaked or removed, e.g. those that were testing for compilation errors if unaligned bit arrays were used on the JS target.

@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch from 0f6fbdb to 25d0be4 Compare December 23, 2024 00:44
@richard-viney
Contributor Author

richard-viney commented Dec 23, 2024

Also, if you could weigh in on the question at the end of the initial writeup, about whether we should make bit array slices O(1) in all cases, that would be helpful. If the answer is yes, more work is needed before this is ready.

@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch 5 times, most recently from 3c7c944 to da179df Compare December 30, 2024 11:26
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch from da179df to 0847df6 Compare December 31, 2024 23:06
@richard-viney richard-viney marked this pull request as draft January 1, 2025 09:27
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch 4 times, most recently from 73b12f9 to 96ef26b Compare January 3, 2025 12:05
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch from 96ef26b to 40047c3 Compare January 3, 2025 12:12
2 participants