Skip to content

Implement StaticArray<T> as a way to have static data #1122

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Feb 26, 2020
Merged

Conversation

dcodeIO
Copy link
Member

@dcodeIO dcodeIO commented Feb 21, 2020

This PR adds a more efficient variant of arrays, named StaticArray<T> to the standard library that, unlike normal arrays, does not have a level of indirection, hence

  • Has more efficient indexed operator overloads (half of the memory accesses)
  • Yields less noise in text and binaries since one doesn't have to cache .dataStart
  • Has 32 bytes less overhead due to skipping the Array header incl. its GC header
  • Cannot resize automatically, with an interface similar to ReadonlyArray<T> in TS
  • Yet is AOT incompatible to normal Array<T>, ReadonlyArray<T> or ConcatArray<T> in TS due to a different layout in memory, hence uses a new name

From a technical point of view, one can think of it as a raw ArrayBuffer with an attached element type. It's especially useful in low-level code like our standard math library where it eliminates most of the overhead introduced by having no other option than using Array<T> for precomputed lookup tables. There are some other use cases benefiting from it, like the n-body example where getting rid of the level of indirection yields overall better performance, that otherwise can only be achieved by resorting to low-level loads and stores and caching .dataStart.

Also somewhat related to #1121 in that we have more options now how to go forward with .dataStart, potentially replacing it with .buffer + .byteOffset to match the spec more closely.

@dcodeIO dcodeIO changed the title Add a builtin yielding FixedArray<T> Implement const array assertions as a way to have static data Feb 22, 2020
@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2020

Instead of FixedArray<T> this now implements ReadonlyArray<T>, which is similar but immutable, and binds it to const assertions of the form [1, 2, 3] as const. Of course this has its implications, like we don't yet support anything else than const array literals and we lose the mutability feature of the initially proposed fixed<T> builtin, but it still seems good enough to solve the underlying requirement of being able to add truly static data segments where we need them.

In the standard library, we typically use these together with manual unsafe load/store so the immutability part isn't relevant there anyway, and since ReadonlyArray<T> and FixedArray<T> are layout-compatible, a user can decide to changetype<FixedArray<i32>>(someReadonlyArrayI32).

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2020

From looking at the code, while as const is fine to have, I'd favor a more concise variant like

const readonlyArray = [1,2,3] as ReadonlyArray<T>;
const fixedArray = [1,2,3] as FixedArray<T>;

respectively

const readonlyArray: ReadonlyArray<T> = [1,2,3];
const fixedArray: FixedArray<T> = [1,2,3];

which, to my surprise, seems to type-check just fine since these classes are structure compatible to Array<T> or perhaps TS has some magic to allow anything extending ReadonlyArray there. And it appears that [1,2,3] as readonly is also a thing in TS.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 22, 2020

Yeah, as const T[] / as const is just more general alternative of as ReadonlyArray<T>. But as const could also apply to records (inline objects)

@MaxGraey
Copy link
Member

btw readonly T[] and ReadonlyArray<T> should be synonyms same as T[] and Array<T>

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2020

For some reason as const i32[] doesn't type-check for me. Is that expected?

@MaxGraey
Copy link
Member

You're right it should be as readonly i32[] in TS. And as const works only with inferred arrays

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2020

So it seems that

  • readonly is a type modifier only valid for arrays and tuples (nothing to do with assertion expressions specifically)
  • as const is a special case of an assertion expression, deriving an exact immutable type

From a purist perspective, both of these do not map well to AssemblyScript, which makes me even more favor something explicit like as ReadonlyArray<i32> or as FixedArray<i32> over them.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 22, 2020

Yes, "readonly" and Readonly<T> is just modifiers and simple compile time artefact same as public, protected and etc, but ReadonlyArray<T> is different (special) interface / class.

@MaxGraey
Copy link
Member

In TS:

const arr1 = [1, 2, 3] as ReadonlyArray<number>; // infers as "readonly number[]"
const arr2 = [1, 2, 3] as const; // infers as "readonly [1, 2, 3]" tuple

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2020

Yeah, that readonly i32[] -> ReadonlyArray<i32> equality worries me the most, since a ReadonlyArray<i32> has a completely different layout than i32[], and TypeScript might expect these to be somewhat compatible due to structural compatibility. Again, a distinct FixedArray would be more correct there, with ReadonlyArray<i32> using actual Array<T> layout. No matter what we decide, it won't be perfect for one reason or another.

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2020

Alright, most likely unpopular conclusion: I have a feeling that if we force ReadonlyArray<T> to a different layout than Array<T> now we might eventually wake up to a situation similar in nature to the === inconsistency, for instance because arrays can't be assigned back and forth or are not as interoperable as one would expect in TS once we have a readonly type modifier for example.

Hence I removed ReadonlyArray<T> again, made as const a Not implemented stub (at least we can parse it now) and settled for FixedArray<T> with an as fixed<T> shorthand for convenience. This gives us the functionality we need without taking any risks.

@dcodeIO dcodeIO changed the title Implement const array assertions as a way to have static data Implement FixedArray<T> as a way to have truly static data Feb 22, 2020
@dcodeIO dcodeIO requested a review from MaxGraey February 22, 2020 23:00
Copy link
Member

@MaxGraey MaxGraey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM.
I'm wondering when we will have own lang server could we use

[1,2,3] as fixed;

instead

[1,2,3] as fixed<i32>;

wdyt?

@jtenner
Copy link
Contributor

jtenner commented Feb 23, 2020

So the final decision is with FixedArray<T>? Count me excited!

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 23, 2020

We can use as fixed, but at this point the corresponding TS type (for compat) would have to be type fixed = any[] or something, losing type information. With our own language server, probably, at the expense of being incompatible with TS semantics.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 25, 2020

The more I look at FixedArray, the more I'm afraid it's lead to unnecessary complexity. I think most users will not use it due to restrictions or just ignorance of its existence because it non-standard type. Saving 16 bytes doesn't seem like such a big deal to me either compare to complexity. The only place it can be useful is for our internal static lookup tables. But in this case just ability declaring raw data segments without any extra types and interfaces will be enough.

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 25, 2020

There is more than just size this tries to solve. To me it's even more important that it gets rid of the additional layer of indirection that comes from normal arrays being resizable, and by doing so also gets rid of unnecessary globals and locals dealing with .dataStart, reducing quite a bit of noise in generated text and binaries. Also, size saving is more along the lines of 32 bytes (Array GC header + Array data, as 16 bytes is TLSFs granularity). Ofc we can technically save another 16 bytes by making it unmanaged and static only, but by giving it a GC header we gain the ability to use it in safe code, pass it around, and construct it at runtime from non-static data like objects, like for example in n-body where it gets AS perf on par with Rust.

Perhaps another name than FixedArray would make it more clear?

  • StaticArray<T>
  • FastArray<T>
  • FlatArray<T>
  • CArray<T>
  • Memory.Data<T>
  • Data<T> (probably the most Wasm-y)
  • Buffer<T> (unfortunate this would conflict with node)
  • TypedArrayBuffer<T>
  • other ideas?

@MaxGraey
Copy link
Member

Yeah, FixedArray is similar to "slice" type in other languages (including Rust) and it totally makes sense but not sure as explicit data type but rather, as an implicit internally optimizing view for ordinal arrays, hmm... But this require sophisticated optimizations and decisions which that we do not yet have. So maybe we can stop at FixedArray as a compromise solution.

@jtenner
Copy link
Contributor

jtenner commented Feb 25, 2020

My vote is in for FastArray<T> because it looks like it's possible to put managed references in it. I really like that the data type is managed, too.

Is it possible to make a class both Unmanaged and managed at the same time?

Maybe we could compile the static data with @unmanaged and the managed version with another type altogether?

StaticArray<T> and FastArray<T> both together solve slightly different problems.

@jtenner
Copy link
Contributor

jtenner commented Feb 25, 2020

And... Maybe forgive me for suggesting crazy, but now that we have package management, maybe someone could just implement FastArray in their own package? 😊

@MaxGraey
Copy link
Member

MaxGraey commented Feb 25, 2020

Also possible names:

  • ArrayView<T>
  • SlicedArray<T>
  • Slice<T>
  • View<T>
  • Arr<T>
  • Sequence<T>
  • Seq<T>
  • Segment<T>

I also like StaticArray<T>

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 25, 2020

StaticArray<T> would be fine for me as well, just sad that we can't have as static<T> due to static being a reserved keyword in strict mode. We can still name it StaticArray and remove the convenience helper.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 25, 2020

StaticArray<T> perhaps the best due to it contain Array and consistent with TypedArray. But it petty long. Wdyt about Seq<T> (like just raw sequence elements in memory)? Btw this also abstract data type in Haskell

const arr = [[1, 2], [3, 4]] as StaticArray<StaticArray<i32>>;

vs

const arr = [[1, 2], [3, 4]] as Seq<Seq<i32>>;

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 25, 2020

Would prefer Sequence<T> over Seq<T> if we decide for that. The latter looks too abbreviated to me, compared to the names of other standard library elements. Currently leaning into the direction of StaticArray<T> without a convenience helper, though.

@MaxGraey
Copy link
Member

Okay, so let's call it StaticArray =)

@MaxGraey
Copy link
Member

MaxGraey commented Feb 25, 2020

Btw Typescript has ConcatArray<T> interface which looks like exactly what we need!

@MaxGraey
Copy link
Member

interface for ConcatArray

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 25, 2020

Has similar problems like ReadonlyArray, though, in that some Array methods in TS are typed to receive a ConcatArray, as a form of describing a minimal interface that works with a particular array method. But our FixedArray isn't compatible with Array, so reusing something that is meant to be compatible can and probably will backfire, like ===.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 25, 2020

On TS ConcatArray is really special array which used only for declare spread argument types like:

function foo(...args: ConcatArray<i32>) {
}

So it basically arguments in ES5 which also pretty limit kind of standard array and require .slice(0) to convert to normal Array so it could be work for our goal

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2020

It's for example used by TS to describe the argument to Array#concat and ReadonlyArray#concat, both of which would not be able to deal with both Array and FixedArray (or our version of ConcatArray for that matter) alike. What I'm trying to get at is that these interfaces are designed to be somewhat compatible to Array, in the sense of a subset, which isn't the case in AS. In AS, these may have the same methods, but a different layout.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 26, 2020

ConcatArray iuses as Variadic type for concat:
concat(...items: ConcatArray<T>[]): T[]
which is the same as:
concat(...items: ConcatArray<Array<T>>): T[]

TS don't mix Array & ConcatArray. ConcatArray has really special meaning

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2020

But TS would be fine with a user providing our variant of ConcatArray to Array#concat, yet that would be a compile error in AS because #concat needs something with actual Array layout. It's just not safe to hijack these. If we implement ConcatArray or ReadonlyArray, these should be layout-compatible to Array to avoid any potential issues.

@MaxGraey
Copy link
Member

MaxGraey commented Feb 26, 2020

I guess ConcatArray is just ArrayLike onbject with isConcatSpreadable property setted to true but not real Array and shouldn't be compatible with Array just emulate some properties. As well as arguments which is also just ArrayLike object

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2020

I'll go ahead now and rename this abomination of an array to StaticArray fwiw. I hope we can agree on this, and I also hope that the technical aspects mentioned here will somehow go away in the future, but unfortunately are a problem as of today. When that happens, I'll be more than happy to revisit this, but at this point I worry more that we'll rather go stale by means of arguing about details too much than getting anything done.

@MaxGraey
Copy link
Member

sure. It was just suggestions and small brainstorm for alternative point of view

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2020

Updated the opening post with what I hope is a good summary of the choices made :)

@dcodeIO dcodeIO changed the title Implement FixedArray<T> as a way to have truly static data Implement StaticArray<T> as a way to have static data Feb 26, 2020
@dcodeIO dcodeIO merged commit 209c875 into master Feb 26, 2020
@dcodeIO dcodeIO deleted the fixedarray branch March 15, 2020 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants