Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC unit test infra, start adding GC tests #2573

Merged
merged 108 commits into from
Jun 30, 2021
Merged

GC unit test infra, start adding GC tests #2573

merged 108 commits into from
Jun 30, 2021

Conversation

osa1
Copy link
Contributor

@osa1 osa1 commented Jun 10, 2021

This refactors the RTS to allow testing garbage collectors and adds a simple GC
test (more tests to be implemented in a new PR).

Major changes:

  • New trait Memory is introduced, with a alloc_words method for heap
    allocation.

  • All allocating functions now take a generic Memory argument. Part of this
    is necessary for testing the collectors, but this also enables more testing:
    we now test more code paths in mark stack implementation.

  • Copying and mark-compact GC entry points now take these arguments:

    • mem: &mut M: memory implementation
    • get_hp: callback to get heap pointer
    • set_hp: SetHp: callback to set heap pointer
    • mem_base: u32: beginning of dynamic heap
    • mem_end: u32: end of dynamic heap
    • static_roots: SkewedPtr: pointer to the static root array
    • closure_table_loc: *mut SkewedPtr: address of the closure table in
      dynamic heap
  • Monomorphised versions of copying and mark-compact GC entry points added to
    be called from compiled code

  • Copying GC refactored to avoid using grow_memory, it now uses alloc_words
    to allocate in to-space.

  • New crate motoko-rts-macros added. This defines a macro ic_mem_fn which
    is used to generate monomorphised versions of allocationg functions, to be
    called from compiled code.

  • Randomized testing in RTS tests now use proptest instead of quickcheck.
    proptest is much more flexible than quickcheck, allows easier testing in some
    cases.

Perf changes:

Using CanCan backend benchmark as described here:
#2033 (comment)

  • With copying GC the the benchmark uses 1.5% more instructions. I suspect this
    is caused by the change in copying GC to use alloc_words instead of
    grow_memory.

  • With compacting GC the benchmark uses 0.12% less instructions

TODOs:

  • I had to disable the test big-array-access. It's a failing test, and the
    backtrace changes depending on RTS build flags (debug or release).

@dfinity-ci
Copy link

This PR does not affect the produced WebAssembly code.

Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit, this PR is kinda large to review now, but hopefully some of the comments are useful....

pub fn test() {
println!("Testing garbage collection ...");

// TODO: Add more tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I wonder if adding some monadic syntax to let you build heap graphs using monadic syntax would be useful for writing tests. Does Rust let you do that easily?

Comment on lines +236 to +272
unsafe {
copying_gc_internal(
&mut heap,
heap_base,
// get_hp
|| heap_1.heap_ptr_address(),
// set_hp
move |hp| heap_2.set_heap_ptr_address(hp as usize),
static_roots,
closure_table_address,
// note_live_size
|_live_size| {},
// note_reclaimed
|_reclaimed| {},
);
}
}

GC::MarkCompact => {
unsafe {
compacting_gc_internal(
&mut heap,
heap_base,
// get_hp
|| heap_1.heap_ptr_address(),
// set_hp
move |hp| heap_2.set_heap_ptr_address(hp as usize),
static_roots,
closure_table_address,
// note_live_size
|_live_size| {},
// note_reclaimed
|_reclaimed| {},
);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arguments look identical here. Can you instead compute the gc function and apply the result to the arguments, sans repetition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the problem here is compacting_gc_internal and copying_gc_internal functions have generic parameters, so they're like C++ templates rather than OCaml/Haskell functions with type parameters. We can't return them from functions for two reasons (1) type system not powerful enough for higher rank types, so we can't return a value of type forall x y z . ... (2) the runtime does not support values with type parameters (which would require monomorphising in runtime). I also mentioned this briefly in the GC type, copying here:

/// Enum for the GC implementations. GC functions are generic so we can't put them into arrays or
/// other data types, we use this type instead.
#[derive(Debug, Clone, Copy)]
pub enum GC {
    Copying,
    MarkCompact,
}

rts/motoko-rts-tests/src/gc/heap.rs Outdated Show resolved Hide resolved
rts/motoko-rts-tests/src/gc/heap.rs Outdated Show resolved Hide resolved
rts/motoko-rts-tests/src/gc/heap.rs Outdated Show resolved Hide resolved
rts/motoko-rts/src/bigint.rs Show resolved Hide resolved
#[no_mangle]
unsafe extern "C" fn float_fmt(a: f64, prec: u32, mode: u32) -> SkewedPtr {
#[ic_mem_fn]
unsafe fn float_fmt<M: Memory>(mem: &mut M, a: f64, prec: u32, mode: u32) -> SkewedPtr {
// prec and mode are tagged (TODO (osa): what tag???)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unresolved TODO? @ggreif might know the answers to your questions...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so this code was ported from the C implementation of RTS and I added this TODO to ask about it later and document it. Would be good to know what "tag" means here and document it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let reclaimed = (end_from_space - begin_from_space) - (end_to_space - begin_to_space);
note_reclaimed(Bytes(reclaimed as u32));

// Copy to-space to the beginning of from-space
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious: I wonder why we do this rather than just switching the role of the two space and from space. Is that to accommodate a memory.grow on the next round of mutation and avoid interpreting pointers as offset relative to the current to_space base?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that to accommodate a memory.grow on the next round of mutation and avoid interpreting pointers as offset relative to the current to_space base?

It is to allow bump allocation until the end of Wasm heap without loss of efficiency. If keep using to-space we will have less room to bump the heap pointer until end of Wasm heap. Before copying GC the heap is like this:

| static heap | dynamic heap |........(unallocated)...........|
^             ^              ^                                ^
0           heap base      heap pointer                  end of Wasm heap
                           (allocation pointer)              (4GiB)

Allocation is done by bumping the heap pointer until it reaches end of Wasm heap, and calling memory.grow on the way to allocate Wasm pages when needed.

After copying live data to to-space it looks like this:

| static heap | from-space   |to-space  |....(unallocated)....|
^             ^              ^          ^                     ^
0           heap base      to-space   heap pointer          end of Wasm heap

Now if we don't move to-space back to from-space we'll have much less space to allocate (the "unallocated" part).

We could implement a slow path for allocation that, when hp == end of Wasm heap, checks if we have any space between heap base and beginning of current space. Or more generally, we could make allocation area a list of ranges on the heap (though I think we will need at most two ranges in this list), so handle cases like:

| static heap | to-space | from-space | to-space |
                                                 ^
                                            end of Wasm heap

Which becomes this after GC:

| static heap | alloc area | live data | alloc area |
                                                    ^
                                               end of Wasm heap

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, all makes sense, thanks for the explanation!

rts/motoko-rts/src/gc/copying.rs Show resolved Hide resolved
rts/motoko-rts/src/memory.rs Show resolved Hide resolved
osa1 added a commit that referenced this pull request Jun 30, 2021
Originally discussed here: #2573 (comment)

Previously grow_memory would allocate one more page than necessary when
the argument is a multiple of Wasm page size. New calculation fixes
that. The code is basically

    (ptr + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE

I.e. divide argument by WASM_PAGE_SIZE, round up. However we do this in
u64 to avoid overflows in `ptr` is larger than `u32::MAX -
WASM_PAGE_SIZE + 1`

Diff of mo-rts.wasm before and after:

    --- mo-rts.wat  2021-06-30 08:50:10.319312072 +0300
    +++ ../../motoko_2/rts/mo-rts.wat       2021-06-30 08:50:08.507322577 +0300
    @@ -18760,12 +18760,14 @@
       (func $motoko_rts::alloc::alloc_impl::grow_memory::hd1a4f99ad46d13e9 (type 7) (param i32)
         block  ;; label = @1
           local.get 0
    -      i32.const 16
    -      i32.shr_u
    +      i64.extend_i32_u
    +      i64.const 65535
    +      i64.add
    +      i64.const 16
    +      i64.shr_u
    +      i32.wrap_i64
           memory.size
           i32.sub
    -      i32.const 1
    -      i32.add
           local.tee 0
           i32.const 1
           i32.lt_s

So two extra instructions.

CanCan backend benchmark reports 0.004% increase in cycles (42,055,656
to 42,057,358, +1,702).
mergify bot pushed a commit that referenced this pull request Jun 30, 2021
Originally discussed here: #2573 (comment)

Previously grow_memory would allocate one more page than necessary when
the argument is a multiple of Wasm page size. New calculation fixes
that. The code is basically

    (ptr + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE

I.e. divide argument by WASM_PAGE_SIZE, round up. However we do this in
u64 to avoid overflows in `ptr` is larger than `u32::MAX -
WASM_PAGE_SIZE + 1`

Also moves `total_pages - current_pages` to the slow path to improve
performance slightly.

CanCan backend benchmark reports 0.002% increase in cycles.
Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why you got rid of ic_fn, but LGTM

@osa1
Copy link
Contributor Author

osa1 commented Jun 30, 2021

Curious why you got rid of ic_fn, but LGTM

It wasn't as useful as ic_mem_fn. What it did was it converted this

#[ic_fn]
fn blah() { ... }

into

// original function, to be used in tests
fn blah(...) { ... }

// wrapper to be called from generated code
#[cfg(feature = "ic")]
#[no_mangle]
fn ic_blah(...) { blah(...) }

I can do the same without using the macro by adding cfg(...) and no_mangle manually:

#[cfg_feature = "ic"]
#[no_mangle]
fn blah() { ... }

and use it in both tests and from generated code.

@osa1 osa1 added the automerge-squash When ready, merge (using squash) label Jun 30, 2021
@mergify mergify bot merged commit 0f55c58 into master Jun 30, 2021
@mergify mergify bot deleted the osa1/rts_tests branch June 30, 2021 12:47
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Jun 30, 2021
osa1 added a commit that referenced this pull request Jul 1, 2021
Accidental renaming was done in #2573 while renaming the "Heap" trait to
"Memory" and "heap" identifiers to "mem".
mergify bot pushed a commit that referenced this pull request Jul 1, 2021
Accidental renaming was done in #2573 while renaming the "Heap" trait to
"Memory" and "heap" identifiers to "mem".
osa1 added a commit that referenced this pull request Jul 1, 2021
In #2573 (0f55c58) I omitted creating creating closure tables and
checking them after GC, implementing it now.
mergify bot pushed a commit that referenced this pull request Jul 1, 2021
In #2573 (0f55c58) I omitted creating creating closure tables and
checking them after GC, implementing it now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants