GC unit test infra, start adding GC tests #2573
Conversation
This PR does not affect the produced WebAssembly code.
I must admit, this PR is kinda large to review now, but hopefully some of the comments are useful....
pub fn test() {
    println!("Testing garbage collection ...");

    // TODO: Add more tests
Indeed, I wonder if adding some monadic syntax for building heap graphs would be useful for writing tests. Does Rust let you do that easily?
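Rust has no built-in do-notation, but a builder-style API could get close. A hypothetical sketch (none of these names exist in the RTS; it only illustrates describing a heap graph declaratively in a test):

// Hypothetical sketch: a fluent builder for describing heap graphs in tests.
struct HeapBuilder {
    objects: Vec<(u32, Vec<u32>)>, // (object tag, indices of referenced objects)
}

impl HeapBuilder {
    fn new() -> Self {
        HeapBuilder { objects: Vec::new() }
    }

    // Add an object with the given tag and references; returns its index.
    fn obj(&mut self, tag: u32, refs: &[u32]) -> u32 {
        self.objects.push((tag, refs.to_vec()));
        (self.objects.len() - 1) as u32
    }
}

fn example() {
    let mut b = HeapBuilder::new();
    let leaf = b.obj(0, &[]);
    let _root = b.obj(1, &[leaf, leaf]); // a root pointing at the leaf twice
    // b.objects would then be materialized into an actual test heap.
}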
        unsafe {
            copying_gc_internal(
                &mut heap,
                heap_base,
                // get_hp
                || heap_1.heap_ptr_address(),
                // set_hp
                move |hp| heap_2.set_heap_ptr_address(hp as usize),
                static_roots,
                closure_table_address,
                // note_live_size
                |_live_size| {},
                // note_reclaimed
                |_reclaimed| {},
            );
        }
    }

    GC::MarkCompact => {
        unsafe {
            compacting_gc_internal(
                &mut heap,
                heap_base,
                // get_hp
                || heap_1.heap_ptr_address(),
                // set_hp
                move |hp| heap_2.set_heap_ptr_address(hp as usize),
                static_roots,
                closure_table_address,
                // note_live_size
                |_live_size| {},
                // note_reclaimed
                |_reclaimed| {},
            );
        }
    }
}
The arguments look identical here. Can you instead compute the gc function and apply the result to the arguments, sans repetition?
So the problem here is that the compacting_gc_internal and copying_gc_internal functions have generic parameters, so they're like C++ templates rather than OCaml/Haskell functions with type parameters. We can't return them from functions for two reasons: (1) the type system is not powerful enough for higher-rank types, so we can't return a value of type forall x y z . ...; (2) the runtime does not support values with type parameters (which would require monomorphising at runtime). I also mentioned this briefly in the GC type; copying it here:
/// Enum for the GC implementations. GC functions are generic so we can't put them into arrays or
/// other data types; we use this type instead.
#[derive(Debug, Clone, Copy)]
pub enum GC {
    Copying,
    MarkCompact,
}
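To illustrate the point, a minimal sketch (not the actual RTS test code; run_gc and TestMemory are made-up names): instead of returning the generic GC function, the test driver matches on the enum and instantiates the generic function at a concrete memory type in each arm.

// Sketch only: illustrative names, not RTS APIs.
fn run_gc(gc: GC /*, ... test heap arguments ... */) {
    match gc {
        GC::Copying => {
            // Each arm monomorphises the generic function at a concrete type,
            // so no higher-rank types or runtime polymorphism are needed:
            // unsafe { copying_gc_internal::<TestMemory, _, _, _, _>(/* ... */) }
        }
        GC::MarkCompact => {
            // unsafe { compacting_gc_internal::<TestMemory, _, _, _, _>(/* ... */) }
        }
    }
}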
-#[no_mangle]
-unsafe extern "C" fn float_fmt(a: f64, prec: u32, mode: u32) -> SkewedPtr {
+#[ic_mem_fn]
+unsafe fn float_fmt<M: Memory>(mem: &mut M, a: f64, prec: u32, mode: u32) -> SkewedPtr {
     // prec and mode are tagged (TODO (osa): what tag???)
Unresolved TODO? @ggreif might know the answers to your questions...
Right, this code was ported from the C implementation of the RTS and I added this TODO to ask about it later. It would be good to know what "tag" means here and document it.
    let reclaimed = (end_from_space - begin_from_space) - (end_to_space - begin_to_space);
    note_reclaimed(Bytes(reclaimed as u32));

    // Copy to-space to the beginning of from-space
I'm curious: I wonder why we do this rather than just switching the roles of the two spaces (to-space and from-space). Is that to accommodate a memory.grow on the next round of mutation and avoid interpreting pointers as offset relative to the current to_space base?
Is that to accommodate a memory.grow on the next round of mutation and avoid interpreting pointers as offset relative to the current to_space base?
It is to allow bump allocation until the end of the Wasm heap without loss of efficiency. If we keep using to-space we will have less room to bump the heap pointer before reaching the end of the Wasm heap. Before copying GC the heap looks like this:
| static heap | dynamic heap |........(unallocated)..........|
^             ^              ^                               ^
0             heap base      heap pointer                    end of Wasm heap
                             (allocation pointer)            (4GiB)
Allocation is done by bumping the heap pointer until it reaches the end of the Wasm heap, calling memory.grow along the way to allocate Wasm pages when needed.
After copying live data to to-space it looks like this:
| static heap | from-space | to-space |....(unallocated)....|
^             ^            ^          ^                     ^
0             heap base    to-space   heap pointer          end of Wasm heap
Now if we don't move to-space back to from-space we'll have much less space to allocate (the "unallocated" part).
We could implement a slow path for allocation that, when hp == end of Wasm heap, checks whether we have any space between the heap base and the beginning of the current space. Or, more generally, we could make the allocation area a list of ranges on the heap (though I think we would need at most two ranges in this list), to handle cases like:
| static heap | to-space | from-space | to-space |
                                                 ^
                                                 end of Wasm heap
Which becomes this after GC:
| static heap | alloc area | live data | alloc area |
                                                    ^
                                                    end of Wasm heap
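For reference, a minimal sketch of the bump-allocation scheme described above (illustrative only; BumpHeap and its fields are made-up names, not RTS code):

// Illustrative sketch of bump allocation up to the end of the Wasm heap.
const WASM_PAGE_SIZE: u32 = 65536;

struct BumpHeap {
    hp: u32,    // allocation (heap) pointer, bumped on every allocation
    pages: u32, // number of Wasm pages currently available (memory.size)
}

impl BumpHeap {
    fn alloc(&mut self, bytes: u32) -> u32 {
        let ptr = self.hp;
        self.hp += bytes;
        // Slow path: once the pointer passes the currently allocated pages we
        // would call memory.grow. Moving to-space back to the heap base after
        // GC keeps this bumpable range (hp .. end of Wasm heap) as large as
        // possible.
        while (self.hp as u64) > (self.pages as u64) * (WASM_PAGE_SIZE as u64) {
            self.pages += 1; // stands in for memory.grow(1)
        }
        ptr
    }
}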
Yeah, all makes sense, thanks for the explanation!
Originally discussed here: #2573 (comment)

Previously grow_memory would allocate one more page than necessary when the argument is a multiple of the Wasm page size. The new calculation fixes that. The code is basically

    (ptr + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE

i.e. divide the argument by WASM_PAGE_SIZE, rounding up. However, we do this in u64 to avoid overflows when `ptr` is larger than `u32::MAX - WASM_PAGE_SIZE + 1`.

Diff of mo-rts.wasm before and after:

--- mo-rts.wat 2021-06-30 08:50:10.319312072 +0300
+++ ../../motoko_2/rts/mo-rts.wat 2021-06-30 08:50:08.507322577 +0300
@@ -18760,12 +18760,14 @@
   (func $motoko_rts::alloc::alloc_impl::grow_memory::hd1a4f99ad46d13e9 (type 7) (param i32)
     block ;; label = @1
       local.get 0
-      i32.const 16
-      i32.shr_u
+      i64.extend_i32_u
+      i64.const 65535
+      i64.add
+      i64.const 16
+      i64.shr_u
+      i32.wrap_i64
       memory.size
       i32.sub
-      i32.const 1
-      i32.add
       local.tee 0
       i32.const 1
       i32.lt_s

So two extra instructions. CanCan backend benchmark reports a 0.004% increase in cycles (42,055,656 to 42,057,358, +1,702).
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Originally discussed here: #2573 (comment)

Previously grow_memory would allocate one more page than necessary when the argument is a multiple of the Wasm page size. The new calculation fixes that. The code is basically

    (ptr + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE

i.e. divide the argument by WASM_PAGE_SIZE, rounding up. However, we do this in u64 to avoid overflows when `ptr` is larger than `u32::MAX - WASM_PAGE_SIZE + 1`.

Also moves `total_pages - current_pages` to the slow path to improve performance slightly. CanCan backend benchmark reports a 0.002% increase in cycles.
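A hedged sketch of the rounding described in these commit messages (the function name and test harness are illustrative, not the actual RTS code):

// Illustrative only: compute how many Wasm pages are needed to cover `ptr`
// bytes, rounding up. The arithmetic is done in u64 so that values of `ptr`
// near u32::MAX do not overflow when WASM_PAGE_SIZE - 1 is added.
const WASM_PAGE_SIZE: u64 = 65536;

fn pages_needed(ptr: u32) -> u32 {
    ((ptr as u64 + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE) as u32
}

fn main() {
    // Exact multiple of the page size: previously one extra page was requested.
    assert_eq!(pages_needed(65536), 1);
    assert_eq!(pages_needed(65537), 2);
    assert_eq!(pages_needed(u32::MAX), 65536); // no overflow in u64
}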
Curious why you got rid of ic_fn, but LGTM
It wasn't as useful as […] into […]. I can do the same without using the macro by adding […] and use it in both tests and from generated code.
Accidental renaming was done in #2573 while renaming the "Heap" trait to "Memory" and "heap" identifiers to "mem".
This refactors the RTS to allow testing garbage collectors and adds a simple GC test (more tests to be implemented in a new PR).
Major changes:
- New trait Memory is introduced, with an alloc_words method for heap allocation. (A rough sketch of this trait is shown after this list.)
- All allocating functions now take a generic Memory argument. Part of this is necessary for testing the collectors, but it also enables more testing: we now test more code paths in the mark stack implementation.
- Copying and mark-compact GC entry points now take these arguments:
  - mem: &mut M: memory implementation
  - get_hp: callback to get the heap pointer
  - set_hp: SetHp: callback to set the heap pointer
  - mem_base: u32: beginning of the dynamic heap
  - mem_end: u32: end of the dynamic heap
  - static_roots: SkewedPtr: pointer to the static root array
  - closure_table_loc: *mut SkewedPtr: address of the closure table in the dynamic heap
- Monomorphised versions of the copying and mark-compact GC entry points are added, to be called from compiled code.
- Copying GC is refactored to avoid using grow_memory; it now uses alloc_words to allocate in to-space.
- New crate motoko-rts-macros is added. It defines a macro ic_mem_fn which is used to generate monomorphised versions of allocating functions, to be called from compiled code.
- Randomized testing in RTS tests now uses proptest instead of quickcheck. proptest is much more flexible than quickcheck and allows easier testing in some cases.
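As a rough illustration of the Memory trait and the ic_mem_fn pattern described above (a sketch under assumptions: alloc_array and IcMemory are made-up names, and the actual RTS types use Words/SkewedPtr wrappers rather than raw integers):

// Sketch only: the real RTS trait, macro output, and pointer types differ in
// the details.
trait Memory {
    // Allocate `n_words` words on the heap and return the address.
    unsafe fn alloc_words(&mut self, n_words: u32) -> usize;
}

// Allocating functions take a generic `Memory` argument, so unit tests can
// pass an in-process test heap while compiled code passes the IC heap.
unsafe fn alloc_array<M: Memory>(mem: &mut M, len: u32) -> usize {
    // 2 header words plus one word per element; the layout here is illustrative.
    unsafe { mem.alloc_words(2 + len) }
}

// The `ic_mem_fn` macro then generates a monomorphised wrapper roughly like
// the following, to be called from compiled code (hypothetical `IcMemory`):
//
// #[no_mangle]
// unsafe extern "C" fn alloc_array_ic(len: u32) -> usize {
//     alloc_array(&mut IcMemory, len)
// }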
Perf changes:

Using the CanCan backend benchmark as described here: #2033 (comment)

- With copying GC the benchmark uses 1.5% more instructions. I suspect this is caused by the change in copying GC to use alloc_words instead of grow_memory.
- With compacting GC the benchmark uses 0.12% fewer instructions.
TODOs:

- big-array-access. It's a failing test, and the backtrace changes depending on RTS build flags (debug or release).