GC experimentation #89

Closed
kripken opened this issue Apr 30, 2018 · 18 comments
@kripken

kripken commented Apr 30, 2018

The wasm GC spec is making progress, initial experimental implementations are starting to be worked on in VMs, and there is experimentation in AssemblyScript too. It would be great to be able to connect all these things! That is, to know what the earliest testable GC things in VMs are going to be, so that we can emit them from AssemblyScript and then see that code running in a VM. Then we'd be testing the whole toolchain->VM pipeline.

A possible vague plan:

  • Figure out the right GC things to target, that will be testable soon.
  • Implement those things in Binaryen, defining new APIs as needed.
  • Use those new APIs in AssemblyScript.

(Personally I'd love to help out with this, on the Binaryen parts.)

@kripken
Author

kripken commented Apr 30, 2018

One question I have about this, for @dcodeIO: perhaps you can give an overview of what the GC plans are here? (Sorry if this is on the wiki, I couldn't see it.) Specifically, how do you intend to use GC - is it for external JS things, or also for AssemblyScript objects themselves, or something else?

And will it be an optional flag that any AssemblyScript code can use, or will source code be written somewhat differently for it?

@dcodeIO
Member

dcodeIO commented Apr 30, 2018

> perhaps you can give an overview of what the GC plans are here?

The plan so far was to implement a makeshift GC as a placeholder while the GC spec is still in the works. But given that there is something we can test with on the horizon, I am considering skipping a custom GC and instead focusing on integrating with WASM GC right away, eventually making it the primary implementation in AssemblyScript (for internal and external objects).

> And will it be an optional flag that any AssemblyScript code can use, or will source code be written somewhat differently for it?

There is already a mechanism for retaining the non-GC behavior with @unmanaged classes, and everything not annotated as unmanaged could use WASM GC in the future. Breaking anything not annotated this way is expected and fine with me. The necessary changes can simply live in their own branch until support lands in VMs.

Maybe a little shopping list for what would be necessary on the Binaryen side to be utilized by the AssemblyScript compiler:

Classes

  • Tuple types with typed elements ((type $XY (struct ... ))), represented by a typed runtime reference ((ref $XY))
  • Instructions to load ((load_field ...)) and store fields ((store_field ...)) using the runtime reference
  • Instruction to instantiate a new instance ((new_default_struct ...))

Arrays

  • Array types with (packed) elements ((type $XY (array ...))) with flexible (!) size, represented by a typed runtime reference ((ref $XY))
  • Instructions to load ((load_elem ...)) and store elements ((store_elem ...)), by index, using the runtime reference
  • Instruction to instantiate a new instance ((new_default_array ...))

With the above in place, it should be possible to implement managed classes and arrays by throwing away unnecessary parts of the standard library (which currently has code to do indexed access through operator overloads, for example) and re-implementing them on top of GC in the compiler, i.e. indexed array access, property accesses on class instances, and changing linear allocation to new_default_struct etc. We can start with just the basics here because more complex concepts like interfaces are not yet supported anyway (though extending a class by adding additional fields is).
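To make the class part of the shopping list concrete, here is a minimal host-side sketch in TypeScript. It only models the semantics of the operations named above; `StructType`, `StructRef`, `newDefaultStruct`, `loadField` and `storeField` are hypothetical names invented for illustration and are not a Binaryen, AssemblyScript or wasm API.

```typescript
// Hypothetical model of the proposed struct operations (not a real wasm or Binaryen API).
type FieldType = "i32" | "f64" | "ref";
type FieldValue = number | object | null;

interface StructType {
  name: string;
  fields: FieldType[];
}

interface StructRef {
  type: StructType;
  values: FieldValue[];
}

// Models (new_default_struct ...): numeric fields start at 0, reference fields at null.
function newDefaultStruct(type: StructType): StructRef {
  return { type, values: type.fields.map((f): FieldValue => (f === "ref" ? null : 0)) };
}

// Rejects indexes that do not name a field of the struct type.
function checkIndex(ref: StructRef, index: number): void {
  if (!Number.isInteger(index) || index < 0 || index >= ref.type.fields.length) {
    throw new RangeError(`${ref.type.name} has no field ${index}`);
  }
}

// Models (load_field ...): read a field through the typed reference.
function loadField(ref: StructRef, index: number): FieldValue {
  checkIndex(ref, index);
  return ref.values[index] as FieldValue;
}

// Models (store_field ...): write a field, rejecting a number/reference mismatch.
function storeField(ref: StructRef, index: number, value: FieldValue): void {
  checkIndex(ref, index);
  const expectsRef = ref.type.fields[index] === "ref";
  const isRef = typeof value !== "number";
  if (expectsRef !== isRef) {
    throw new TypeError(`field ${index} of ${ref.type.name} has a different type`);
  }
  ref.values[index] = value;
}

// Extending a class by adding fields: the subtype's layout starts with the parent's fields.
const Point: StructType = { name: "Point", fields: ["f64", "f64"] };
const Point3D: StructType = { name: "Point3D", fields: [...Point.fields, "f64"] };

const p = newDefaultStruct(Point3D);
storeField(p, 2, 1.5);
```

In this model, a property access on a class instance becomes `loadField` with a field index known at compile time, which is the kind of lowering the compiler would take over from the standard library.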

I hope I didn't miss something obvious here, but if I did or failed to clarify some parts, feel free to mention it :)

P.S.: The only prerequisite to get started here on my side is to finish this PR. Nothing specific is on the roadmap after that.

@kripken
Author

kripken commented May 1, 2018

Thanks @dcodeIO !

From talking with @lukewagner (Luke, correct me if I got something wrong), the first step here for VMs is to implement the "reference types proposal", which defines some new types and new instructions. This may already be ready for experimentation? However, it is not enough for the AssemblyScript stuff described above, like creating a new object and getting and setting fields. Instead, if I understand the spec, it just adds:

  • A few new fixed types, such as "anyfunction" (any function, as opposed to "anyref", which is anything).
  • Reference instructions which can compare refs.
  • Table operations, such as reading and writing refs to and from tables.

In other words this is a small step towards full GC support, but only provides table and related operations.

Would reference types be interesting to experiment with in AssemblyScript? Should be quick to add to Binaryen if so. Or would we need to wait for full GC support?

@dcodeIO
Member

dcodeIO commented May 1, 2018

I see, thanks. From my point of view, reference types seem like something that might be good to implement as a new basic type, ref. Having that, we could then import/export values of this type and implement its equality check (ref.eq), including == null (ref.isnull), with null in ref contexts being ref.null instead of zero. This should be relatively straightforward to implement, and it doesn't hurt to have it already, as it doesn't interfere with existing things.

Regarding the table instructions, I don't yet see an immediate use case in the language, but there's always the option to add temporary builtins emitting the operations we'd like to test (here get_table, set_table etc.). That should be rather straightforward as well.

@kripken
Author

kripken commented May 2, 2018

Yeah, I don't really see the table instruction use case either. Overall, it doesn't seem like that much is possible with just the reference-types proposal, and we need to wait for full GC. @lukewagner, am I missing something?

@MaxGraey
Member

MaxGraey commented May 2, 2018

Hmm, @dcodeIO, you want to use wasm GC for both external and internal purposes? Does that mean all internal refs should be stored in a table? Sorry if I misunderstood something, I'm not deeply familiar with the GC spec =)

@lukewagner

The use case for tables-of-refs is when you want to logically store a reference in linear memory. Then you stick it into a table and store its i32 index in linear memory. This requires having a lifetime for the table element so that you can free up the element and avoid leaking the object forever. E.g., see how Rust's wasm-bindgen uses it.
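A minimal sketch of that pattern, modelled on the host side in TypeScript. `HandleTable` and its methods are hypothetical names, the `Int32Array` merely stands in for linear memory, and this is not how wasm-bindgen or any VM actually implements it.

```typescript
// Hypothetical handle table: a stand-in for a wasm table of references.
class HandleTable<T extends object> {
  private slots: (T | undefined)[] = [];
  private free: number[] = [];

  // Store a reference and return the i32-style index to keep in linear memory.
  insert(value: T): number {
    const index = this.free.length > 0 ? (this.free.pop() as number) : this.slots.length;
    this.slots[index] = value;
    return index;
  }

  // Look the reference up again from its index.
  get(index: number): T {
    const value = this.slots[index];
    if (value === undefined) {
      throw new RangeError(`no live reference at index ${index}`);
    }
    return value;
  }

  // End the element's lifetime: drop the reference and recycle the slot.
  release(index: number): void {
    if (this.slots[index] === undefined) {
      throw new RangeError(`no live reference at index ${index}`);
    }
    this.slots[index] = undefined;
    this.free.push(index);
  }
}

const memory = new Int32Array(4); // stand-in for linear memory
const table = new HandleTable<object>();
const obj = { label: "external" };

memory[0] = table.insert(obj);             // only the index lives in "linear memory"
const roundTripped = table.get(memory[0]); // resolves the index back to the object
table.release(memory[0]);                  // without this, the object would leak
```

The explicit `release` call is the lifetime management mentioned above: the table keeps the object alive until the slot is cleared.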

For AssemblyScript, since the goal (I assume?) is for all objects to be allocated in the GC heap (not linear memory), there wouldn't be any need for this rooting so I agree there wouldn't be any need.

So yeah, I'd start with an anyref type that could point to any typed or untyped JS object. One interesting extension which I think would be useful in the long-term in any case would be to have a syntax for doing dynamic dictionary object access on anyrefs. Initially this would be implemented by:

  • creating an anyref holding the property name string (either once up-front for obj.prop or dynamically for obj[prop])
  • importing and calling Reflect.get passing obj and prop

This impl strategy will actually be somewhat slow (compared to native JS property access), but we can optimize it in various ways over time (both in engines, and possibly in the standard via a special get-prop Host Binding).
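The two steps above could look roughly like this from the host's point of view. This is a sketch under assumptions: the `imports` object layout and the helper names `getTitle` and `getDynamic` are hypothetical, and the functions stand in for what compiled code would do when calling the import. Only `Reflect.get` itself is a standard JavaScript API.

```typescript
// Hypothetical import object handed to the module; the namespace layout is an assumption.
const imports = {
  Reflect: {
    get: (target: object, prop: PropertyKey): unknown => Reflect.get(target, prop),
  },
};

// obj.prop: the property name is known statically, so it is created once up-front.
const PROP_TITLE = "title";
function getTitle(target: object): unknown {
  return imports.Reflect.get(target, PROP_TITLE);
}

// obj[prop]: the property name is only known at run time.
function getDynamic(target: object, prop: string): unknown {
  return imports.Reflect.get(target, prop);
}

const issue = { title: "GC experimentation", number: 89 };
const title = getTitle(issue);
const num = getDynamic(issue, "number");
```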

@dcodeIO
Member

dcodeIO commented May 2, 2018

@MaxGraey I think that spec'ed features should become the default (best interoperability between different languages, smallest binaries), but we can of course just rename the current native implementation of arrays to NativeArray, for example, and make it configurable at compile time, similar to how --use Math=NativeMath works (in the case of arrays, there'd be JSArray and NativeArray then). The challenge there will be to make both kinds of arrays largely interoperable (see the paragraph below), of course. Wdyt?

@lukewagner Thanks for the clarification, makes sense. So tables will actually be important to make structures that live in linear memory interoperable with references that don't. Here, for example, a NativeArray<anyref> (see the paragraph above, or later also NativeArray<JSArray>) then stores the table indexes in its backing ArrayBuffer as i32s, and indexed access on such an array would have to wrap the load in a get_table. That'll be interesting* to implement :).

(*) Now I wonder if get_table and set_table in this form will require some sort of reference counting (number of indexes around in linear memory that point to a particular table entry) on the native side to be able to clean up table elements eventually.
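One way that bookkeeping could look, sketched in TypeScript as a host-side model. `CountedHandleTable`, `retain` and `release` are hypothetical names; this illustrates the idea only and is not the scheme AssemblyScript or any VM uses.

```typescript
// Hypothetical table whose entries count how many copies of their index exist in linear memory.
class CountedHandleTable<T extends object> {
  private slots: (T | undefined)[] = [];
  private counts: number[] = [];
  private free: number[] = [];

  private assertLive(index: number): void {
    if (this.slots[index] === undefined) {
      throw new RangeError(`no live reference at index ${index}`);
    }
  }

  // Models set_table for a new reference: the entry starts with one owner.
  insert(value: T): number {
    const index = this.free.length > 0 ? (this.free.pop() as number) : this.slots.length;
    this.slots[index] = value;
    this.counts[index] = 1;
    return index;
  }

  // Models get_table: resolve an index loaded from linear memory.
  get(index: number): T {
    this.assertLive(index);
    return this.slots[index] as T;
  }

  // Call when another copy of the index is written to linear memory.
  retain(index: number): void {
    this.assertLive(index);
    this.counts[index] = (this.counts[index] as number) + 1;
  }

  // Call when a copy of the index is overwritten or freed; returns true if the entry was cleared.
  release(index: number): boolean {
    this.assertLive(index);
    const remaining = (this.counts[index] as number) - 1;
    this.counts[index] = remaining;
    if (remaining > 0) return false;
    this.slots[index] = undefined;
    this.free.push(index);
    return true;
  }
}

const counted = new CountedHandleTable<{ id: number }>();
const handle = counted.insert({ id: 1 });    // first copy of the index
counted.retain(handle);                      // a second array element now holds the same index
const freedEarly = counted.release(handle);  // one copy remains, entry stays alive
const freedLast = counted.release(handle);   // last copy gone, entry is cleared
```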

@dcodeIO
Member

dcodeIO commented May 8, 2018

FYI: There is now support for sign-extension-ops and mutable-global behind feature flags. With this mechanism present, the same can be done for reference-types without the need to keep it in its own branch :)

@vgrichina

What is the current status of this? Are there any VMs already implementing the GC spec? Is there any roadmap for when the spec will be finalized?

@dcodeIO
Member

dcodeIO commented Jun 6, 2019

Maybe a quick update on this for everyone stumbling upon this issue: From what I can tell, there is no sign of WASM GC on the horizon yet. However, recent commits to master implement a custom MM/GC combination based on reference counting for the time being, as described here.

@MaxGraey
Member

MaxGraey commented Nov 8, 2019

Just leaving this paper here, which describes how we could further improve reference counting:
Lazy_Cyclic_Reference_Counting.pdf

@MaxGraey
Member

MaxGraey commented Dec 19, 2019

Another paper, this one about RC Immix:
"High Performance Reference Counting and Conservative Garbage Collection" by Rifat Shahriyar
ThesisShahriyar2015.pdf

@dcodeIO
Member

dcodeIO commented Jun 30, 2020

With the GC subgroup now having regular meetings and initial prototype implementations being pursued in WABT and V8, I've created a follow-up for the necessary Binaryen support here: WebAssembly/binaryen#2935

It's likely that it's still a bit too early, so don't hold your breath.

@tlively

tlively commented Jul 14, 2020

If anyone from the AssemblyScript community would be interested in working to implement experimental GC support in Binaryen, I would be happy to help out with guidance and code reviews.

@MaxGraey
Member

MaxGraey commented Jan 29, 2021

I guess we could close this. Or keep open until Wasm GC?

@tlively

tlively commented Jan 29, 2021

Yes, this work is well under way in Binaryen and supported in V8. I think this general issue can probably be closed.

@dcodeIO dcodeIO closed this as completed Dec 16, 2021