Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RTS: fix pages calculation in grow_memory #2633

Merged
merged 3 commits into from
Jun 30, 2021
Merged

Conversation

osa1
Copy link
Contributor

@osa1 osa1 commented Jun 30, 2021

Originally discussed here: #2573 (comment)

Previously grow_memory would allocate one more page than necessary when
the argument is a multiple of Wasm page size. New calculation fixes
that. The code is basically

(ptr + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE

I.e. divide argument by WASM_PAGE_SIZE, round up. However we do this in
u64 to avoid overflows in ptr is larger than u32::MAX - WASM_PAGE_SIZE + 1

Also moves total_pages - current_pages to the slow path to improve
performance slightly.

CanCan backend benchmark reports 0.002% increase in cycles.

Originally discussed here: #2573 (comment)

Previously grow_memory would allocate one more page than necessary when
the argument is a multiple of Wasm page size. New calculation fixes
that. The code is basically

    (ptr + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE

I.e. divide argument by WASM_PAGE_SIZE, round up. However we do this in
u64 to avoid overflows in `ptr` is larger than `u32::MAX -
WASM_PAGE_SIZE + 1`

Diff of mo-rts.wasm before and after:

    --- mo-rts.wat  2021-06-30 08:50:10.319312072 +0300
    +++ ../../motoko_2/rts/mo-rts.wat       2021-06-30 08:50:08.507322577 +0300
    @@ -18760,12 +18760,14 @@
       (func $motoko_rts::alloc::alloc_impl::grow_memory::hd1a4f99ad46d13e9 (type 7) (param i32)
         block  ;; label = @1
           local.get 0
    -      i32.const 16
    -      i32.shr_u
    +      i64.extend_i32_u
    +      i64.const 65535
    +      i64.add
    +      i64.const 16
    +      i64.shr_u
    +      i32.wrap_i64
           memory.size
           i32.sub
    -      i32.const 1
    -      i32.add
           local.tee 0
           i32.const 1
           i32.lt_s

So two extra instructions.

CanCan backend benchmark reports 0.004% increase in cycles (42,055,656
to 42,057,358, +1,702).
@dfinity-ci
Copy link

This PR does not affect the produced WebAssembly code.

@osa1
Copy link
Contributor Author

osa1 commented Jun 30, 2021

I managed to improve perf of new code a little bit, pushing the commit now.

Improves cycles in a benchmark from 42,057,358 to 42,056,507 (-0.002%),
moving `total_pages - current_pages` to the slow path.
Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that correct? What if current pages = 0 and ptr = 0, then this allocates nothing and dereferencing ptr traps. Or is my arithmetic wrong. On mobile.

@osa1
Copy link
Contributor Author

osa1 commented Jun 30, 2021

Good point, ptr = 0 is another case where the new function will return differently than the current one, but that case cannot happen. This function is called to make sure we have enough Wasm pages, so when we want to allocate N words we do grow_memory(HP + N). For HP + N to be 0 both HP and N need to be 0. HP can't be zero because we will always have some static data in heap and HP will start with a larger value. N can't be zero because I think we never try to allocate 0 words.

@crusso
Copy link
Contributor

crusso commented Jun 30, 2021

Doesn't the problem recur at every page boundary. Suppose HP = PAGESIZE-1 and N = 1. Then we need to allocate another page but the algorithm won't, or will it?

@crusso
Copy link
Contributor

crusso commented Jun 30, 2021

(I'm assuming ptr is meant to be useable (in bounds) after the call.)

@osa1
Copy link
Contributor Author

osa1 commented Jun 30, 2021

(I'm assuming ptr is meant to be useable (in bounds) after the call.)

ptr is not guaranteed to be allocated after this call, ptr - 1 is.

@crusso
Copy link
Contributor

crusso commented Jun 30, 2021

Ah, maybe adjust the comment to say ptr (exclusive)

@crusso
Copy link
Contributor

crusso commented Jun 30, 2021

That makes sense now, HP points at the next available byte (not last allocated), right?

@osa1
Copy link
Contributor Author

osa1 commented Jun 30, 2021

That makes sense now, HP points at the next available byte (not last allocated), right?

Exactly, yeah.

Co-authored-by: Claudio Russo <claudio@dfinity.org>
@osa1 osa1 added the automerge-squash When ready, merge (using squash) label Jun 30, 2021
@mergify mergify bot merged commit 4935cea into master Jun 30, 2021
@mergify mergify bot deleted the osa1/grow_memory_n_pages branch June 30, 2021 09:59
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Jun 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants