[exec] Bounds check bulk-memory before execution #123

eqrion · 2019-11-05T16:45:35Z

Spec issue: #111

This commit changes the semantics of bulk-memory instructions to perform
an upfront bounds check and trap if any access would be out-of-bounds without
writing.

This affects the following:

memory.init/copy/fill
table.init/copy (fill requires reftypes)
data segment init (lowers to memory.init)
elem segment init (lowers to table.init)

This commit corrects an edge case where a memory/table.init with zero
length wouldn't trap even if the data/elem segment had been dropped.
This was due to an extra pattern in eval that would match zero-length
instructions and not check if the segment had been dropped. One test
case in bulk.wast had to be changed for this.

eqrion · 2019-11-05T16:46:23Z

I would also be interested in updating the spec-text as well, but will need some guidance.

Right now the instructions are implemented by lowering to plain i32.load8_u/i32.store8's for memory and table.get/table.set for tables.

I think there are a couple options here.

Use the stack as a temporary buffer.
- Lower to a series of loads onto the stack, followed by a sequence of stores from high addresses to low.
Add new administrative functions.
- transfer_memory(src: vec(u8), memory: mem_address, dest: offset)
- transfer_elements(src: vec(elem), table: tab_address, dest: offset)
- Lower init,copy,fill to these by constructing the appropriate src buffer.
Add a bounds check at the beginning of the existing semantics, keep everything else.

I think all of these can be equivalent in the single-threaded case, but I'm having a hard time reasoning about the implications for threading.

rossberg · 2019-11-05T19:12:47Z

Hi @eqrion, thanks for moving forward with this. Turns out we were in a race. :) I had started on a similar patch last week but didn't get to finish the spec part yet. I uploaded it here so that we can compare. But mine is more WIP. In particular, I did not properly fix the test generators. And I only started on the spec itself. So we should probably go with yours instead.

However, one relevant difference is that I kept and adapted the small-step semantics for the interpreter instead of going back to monolithic implementations. This is important for two reasons: (1) the interpreter should match the spec rules, and (2) the rules need to use small-step in order to be forward-compatible with the threads proposal, where bulk ops are no longer atomic operations.

How would you feel about incorporating my interpreter implementation into your patch?

eqrion · 2019-11-05T21:47:43Z

Hi @eqrion, thanks for moving forward with this. Turns out we were in a race. :) I had started on a similar patch last week but didn't get to finish the spec part yet. I uploaded it here so that we can compare. But mine is more WIP. In particular, I did not properly fix the test generators. And I only started on the spec itself. So we should probably go with yours instead.

However, one relevant difference is that I kept and adapted the small-step semantics for the interpreter instead of going back to monolithic implementations. This is important for two reasons: (1) the interpreter should match the spec rules, and (2) the rules need to use small-step in order to be forward-compatible with the threads proposal, where bulk ops are no longer atomic operations.

How would you feel about incorporating my interpreter implementation into your patch?

Ah apologies about the duplicated effort, I should have posted that I was taking a look at this. :)

Good to know about needing the small-step semantics, I think that was what I was getting at with my question. I'll adapt your interpreter implementation into my patch.

rossberg · 2019-11-06T06:17:22Z

Cool, thanks!

Forgot one thing. You mentioned:

This commit corrects an edge case where a memory/table.init with zero
length wouldn't trap even if the data/elem segment had been dropped.

This actually was an intentional fix in #115. The rationale being that accessing a dropped segment is equivalent to an out-of-bounds access, because a dropped segment is essentially a segment shrunk to size 0. A zero-length access should therefore behave consistently with no effect.

eqrion · 2019-11-06T19:38:24Z

Forgot one thing. You mentioned:

This commit corrects an edge case where a memory/table.init with zero
length wouldn't trap even if the data/elem segment had been dropped.

This actually was an intentional fix in #115. The rationale being that accessing a dropped segment is equivalent to an out-of-bounds access, because a dropped segment is essentially a segment shrunk to size 0. A zero-length access should therefore behave consistently with no effect.

Ah, I believe the exec-text for memory.init needs an update then. Step 10 does a trap if the data segment has been dropped, before step 17 does the check for cnt = 0. That was my reasoning for calling this a bug.

Spidermonkey will also need an update for this, as we will trap in this case.

eqrion · 2019-11-06T22:21:46Z

Ah, I believe the exec-text for memory.init needs an update then. Step 10 does a trap if the data segment has been dropped, before step 17 does the check for cnt = 0. That was my reasoning for calling this a bug.

On further examination, it appears that the formal rules are correct and the prose is what may need an update.

rossberg · 2019-11-07T04:37:11Z

Ah, I see. If you want, please feel free to fix this as part of this PR.

eqrion · 2019-11-07T06:39:08Z

I've updated the interpreter commit to use your patch, keep the small step semantics, and extend early bounds checking to the table operations. That commit should be good to go.

I've also added a commit for the dropped segments issue, and started looking at the spec text.

rossberg

Thanks!

document/core/exec/instructions.rst

test/core/bulk.wast

interpreter/exec/eval.ml

Spec issue: WebAssembly#111 This commit changes the semantics of bulk-memory instructions to perform an upfront bounds check and trap if any access would be out-of-bounds without writing. This affects the following: * memory.init/copy/fill * table.init/copy (fill requires reftypes) * data segment init (lowers to memory.init) * elem segment init (lowers to table.init)

The formal rules (and interpreter) allow 'memory/table.init' when the count = 0. The prose specifies a trap before checking if count = 0.

eqrion · 2019-11-12T22:21:42Z

@rossberg Is this ready to go now?

rossberg

Yes, thanks!

eqrion · 2019-11-14T14:11:22Z

Thanks!

The two primary changes involved are: 1. Removal of `assert_return_canonical_nan`/`arithetic nan` in favor of special `nan:canonical`/`nan:arithmetic` constants that can only be used in test expectations. See: WebAssembly/spec#1104 2. New trapping behaviour for bulk memory operations. Range checks are now performed up front for opterations such as memory.fill and memory.copy. See: WebAssembly/bulk-memory-operations#111 And: WebAssembly/bulk-memory-operations#123 The old behaviour is still kept around to support table.fill which is defined in reference-types proposal and has yet to be updated.

The two primary changes involved are: 1. Removal of `assert_return_canonical_nan`/`arithetic nan` in favor of special `nan:canonical`/`nan:arithmetic` constants that can only be used in test expectations. See: WebAssembly/spec#1104 2. New trapping behaviour for bulk memory operations. Range checks are now performed up front for opterations such as memory.fill and memory.copy. See: WebAssembly/bulk-memory-operations#111 And: WebAssembly/bulk-memory-operations#123 The old behaviour is still kept around to support table.fill which is defined in reference-types proposal and has yet to be updated. 3. nullref is now permitted in the text and binary format.

The two primary changes involved are: 1. Removal of `assert_return_canonical_nan`/`arithetic nan` in favor of special `nan:canonical`/`nan:arithmetic` constants that can only be used in test expectations. See: WebAssembly/spec#1104 2. New trapping behaviour for bulk memory operations. Range checks are now performed up front for opterations such as memory.fill and memory.copy. See: WebAssembly/bulk-memory-operations#111 And: WebAssembly/bulk-memory-operations#123 The old behaviour is still kept around to support table.fill which is defined in reference-types proposal and has yet to be updated.

eqrion force-pushed the trap-semantics branch from b48f259 to 0a57f0f Compare November 7, 2019 06:34

rossberg reviewed Nov 7, 2019

View reviewed changes

eqrion force-pushed the trap-semantics branch from 0a57f0f to ca594d9 Compare November 7, 2019 16:02

eqrion added 2 commits November 7, 2019 10:13

[spec] Fix prose text for mem/tab.init with dropped segments

766a295

The formal rules (and interpreter) allow 'memory/table.init' when the count = 0. The prose specifies a trap before checking if count = 0.

eqrion force-pushed the trap-semantics branch from ca594d9 to 766a295 Compare November 7, 2019 16:14

tlively mentioned this pull request Nov 8, 2019

Optimize passive segments in memory-packing WebAssembly/binaryen#2426

Merged

rossberg approved these changes Nov 14, 2019

View reviewed changes

rossberg merged commit be7ce1b into WebAssembly:master Nov 14, 2019

eqrion deleted the trap-semantics branch November 14, 2019 14:11

This was referenced Nov 21, 2019

[interpreter] Simplify zero-len and drop semantics #126

Merged

[spec] Make OOB spec changes to follow #123 and #126 #127

Closed

sbc100 mentioned this pull request Dec 24, 2019

table.fill tests should match new bounds check semantics from bulk-memory proposal WebAssembly/reference-types#70

Closed

sbc100 mentioned this pull request Dec 26, 2019

Update testsuite WebAssembly/wabt#1275

Merged

lars-t-hansen mentioned this pull request Jul 3, 2020

table.fill needs to bounds check before executing table.set? WebAssembly/reference-types#105

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[exec] Bounds check bulk-memory before execution #123

[exec] Bounds check bulk-memory before execution #123

eqrion commented Nov 5, 2019

eqrion commented Nov 5, 2019

rossberg commented Nov 5, 2019

eqrion commented Nov 5, 2019

rossberg commented Nov 6, 2019

eqrion commented Nov 6, 2019

eqrion commented Nov 6, 2019

rossberg commented Nov 7, 2019

eqrion commented Nov 7, 2019

rossberg left a comment

eqrion commented Nov 12, 2019

rossberg left a comment

eqrion commented Nov 14, 2019

[exec] Bounds check bulk-memory before execution #123

[exec] Bounds check bulk-memory before execution #123

Conversation

eqrion commented Nov 5, 2019

eqrion commented Nov 5, 2019

rossberg commented Nov 5, 2019

eqrion commented Nov 5, 2019

rossberg commented Nov 6, 2019

eqrion commented Nov 6, 2019

eqrion commented Nov 6, 2019

rossberg commented Nov 7, 2019

eqrion commented Nov 7, 2019

rossberg left a comment

Choose a reason for hiding this comment

eqrion commented Nov 12, 2019

rossberg left a comment

Choose a reason for hiding this comment

eqrion commented Nov 14, 2019