Fix compilation determinism issue #7443

Ekleog · 2022-08-19T15:20:57Z

There was a compilation determinism issue in our instrumentation, due to iterating over a HashMap when generating the thunks. With this, I add:

A fuzzer that tries to automatically detect non-determinism instances
An output from this fuzzer that generates said non-determinism; auto-generated so that when running cargo test the regression would be caught
The fix for the compilation determinism issue per se

I've been running the fuzzer locally for a bit to check there's no other obvious instance of compilation non-determinism, and coverage has already doubled since the fuzzer found the crash I checked in so hopefully other instances of compilation non-determinism would be hard to find.

cc @akhi3030 @matklad

akhi3030 · 2022-08-19T16:37:06Z

runtime/near-vm-runner/src/instrument/stack_height/thunk.rs

@@ -3,7 +3,7 @@ use parity_wasm::{
    builder,
    elements::{self, FunctionType, Internal},
 };
-use std::collections::HashMap as Map;


I think we should create an issue to review all uses of HashMap, HashSet, etc. in our code to ensure that there aren't any non-determinism issues there.

FWIW, I'd be curious to hear some case studies of how to best enforce determinism consistently.

The simplest option is indeed to use tree-based collections, but my understanding is that it is pretty inefficient, as collection lookups are usually hot in profiles, and trees are way less CPU&memory efficient than hashmaps. No gut feeling for just how inefficient it is.

The second simplest option is to use hash-maps with deterministic iteration (index map, what modern python is doing). Again, this is trading perf for determinism, but I don't know how much perf. I think this would probably be my default pick for making deterministic software.

The third option is to make sure that HashMap doesn't expose non-deterministic API (doesn't implement iter). I think this is what rustc is doing, not sure how successful it is.

Interestingly, I think in rust-analyzer we've never ever hit this problem. This is surprising to me, as, given our incremental compilation setup, the non-determinism would lead to straightforward panic, so it'd be hard to miss. I think that is because in rust-analyzer we are using FxHashMap which doesn't do randomization (so, DOS-vulnerable).

@akhi3030 I’ve opened #7507 to track doing this :)

runtime/near-vm-runner/src/tests/fuzzers.rs

matklad

Nice, this fix the whole class of bugs for us!

matklad · 2022-08-22T09:21:33Z

runtime/near-vm-runner/src/instrument/stack_height/thunk.rs

@@ -3,7 +3,7 @@ use parity_wasm::{
    builder,
    elements::{self, FunctionType, Internal},
 };
-use std::collections::HashMap as Map;


FWIW, I'd be curious to hear some case studies of how to best enforce determinism consistently.

The simplest option is indeed to use tree-based collections, but my understanding is that it is pretty inefficient, as collection lookups are usually hot in profiles, and trees are way less CPU&memory efficient than hashmaps. No gut feeling for just how inefficient it is.

The second simplest option is to use hash-maps with deterministic iteration (index map, what modern python is doing). Again, this is trading perf for determinism, but I don't know how much perf. I think this would probably be my default pick for making deterministic software.

The third option is to make sure that HashMap doesn't expose non-deterministic API (doesn't implement iter). I think this is what rustc is doing, not sure how successful it is.

Interestingly, I think in rust-analyzer we've never ever hit this problem. This is surprising to me, as, given our incremental compilation setup, the non-determinism would lead to straightforward panic, so it'd be hard to miss. I think that is because in rust-analyzer we are using FxHashMap which doesn't do randomization (so, DOS-vulnerable).

matklad · 2022-08-22T09:31:55Z

runtime/near-vm-runner/src/tests/fuzzers.rs

@@ -167,3 +170,31 @@ fn wasmer2_and_wasmtime_agree() {
        assert_eq!(wasmer2, wasmtime);
    });
 }
+
+#[test]
+fn wasmer2_is_reproducible() {


I see we also attach a crash file here crash-7efd70f50949e0fdff8f4c8d6cc66709faf6d3cf. I am a tiny bit skeptical of the utility of them -- it seems it's relatively easy for the crash file to become obsolete with unrelated changes.

I think what is in the crash file is data: &[u8], the seed we pass to arbitrary. But that means that, eg, upgrading arbitrary can chance the .wasm that is generated, which might break the test.

It seems to me that, for generative fuzzing [u8] -> T we should save T, rather than [u8], as a crash. So, in this case, an actual .wasm.

Action items here:

this probably wants some design discussion on the bolero repository

I think I am fine with including the current crash file here, but I am curious whether the default 1000 iterations would be enough to get it? If it's easy to discover, I suggest not adding a file maybe.

While what you say makes sense, what would be a better option in your opinion? I don’t think we can do this straight from bolero, because we’d have to have some serialization method for storing the crash report, and Arbitrary is as good a serialization method as any other afaict.

The alternative I can think of that would make sense would be to just add a unit test instead of checking in a crash report. I didn’t go that way first, but could switch to this method if you think it’d be better?

Oh and, it took my computer approx. 20min/1hr to find the crash from scratch, so unfortunately I don’t think just not adding an explicit test is a solution. (Also, I let the computer search for a few more hours after that and didn’t find any other determinism issue, so hopefully the rest of our code is ok-ish)

Oh and, it took my computer approx. 20min/1hr to find the crash from scratch

Yeah, that's a good argument for keeping this as is.

I don’t think we can do this straight from bolero, because we’d have to have some serialization method for storing the crash report, and Arbitrary is as good a serialization method as any other afaict.

That's the crux of my thinking here -- I do think that Arbitrary is less good serialization method, because it is meaningfully less stable. Specifically:

if we store raw bytes, and do cargo upgrade -p wasm-smith, the test could break

if we store wasm generated by arbitrary by some particualar version of wasm-smith, than the test would be reproducible even after upgrades.

A slightly different way to get at this: it might be the case that some .wasms are impossible to generate via wasm-smith, while raw .wasm naturally can represent any .wasm. The generation routine is not necessary surjective, while proper serialization is.

OTOH, I also don't see small-incremental solutions here. A "large" solution would be something like the following

traint TestInput : arbitrary::Arbitrary // to generate from raw bytes + Display // to show on failures { // to save/load examples independent from arbitrary, // or for manual table-driven tests. fn dump(&self) -> Vec<u8> fn load(Vec<u8>) -> Result<Self> }

Hmm so I’m not entirely sure how that’d work out, because even Arbitrary is but one of the ways to generate inputs with bolero, there’s also bolero’s own TypeGenerator and ValueGenerator, and I think there are even other generators they’re considering adding.

That said what you’re saying does make sense to me: it’d be good to be able to provide a "value" instead of a fuzzer seed, if only so that it becomes possible to use a known crash input for the fuzzer. I had opened camshaft/bolero#91 to track this, though I’m not sure how doable-in-practice that’d be. Does it make sense to you?

matklad · 2022-08-22T09:32:31Z

runtime/near-vm-runner/src/tests/fuzzers.rs

 use near_primitives::runtime::fees::RuntimeFeesConfig;
 use near_primitives::version::PROTOCOL_VERSION;
 use near_vm_errors::{FunctionCallError, VMError};
 use near_vm_logic::mocks::mock_external::MockedExternal;
 use near_vm_logic::{VMConfig, VMContext};
+use wasmer_engine::Executable;



unrelated to the PR, but we don't leave blank lines between imports.

runtime/near-vm-runner/src/tests/fuzzers.rs

matklad · 2022-08-22T09:35:59Z

runtime/near-vm-runner/src/tests/fuzzers.rs

+                match first_hash {
+                    None => first_hash = Some(hash),
+                    Some(h) => assert_eq!(h, hash),
+                }


Maybe:

let first_hash = *first_hash.get_or_insert(hash); assert_eq!(first_hash, hash);

Hmm so this would definitely work, but I’m not sure the loss in readability is worth the gain in conciseness?

Yeah, I was actually a bit confused about the original control flow here, hence I tried to think about a different way to write this, but it's true that get_or_insert isn't necessary readable.

What about the following pattern?

let mut prev_hash = None; for 0..3 { ... if let Some(h) = prev_hash { assert_eq!(h, hash); } prev_hash = Some(hash); }

Or maybe a mix of the two, like this?

let mut first_hash = None; for _ in 0..3 { // ... if let None = first_hash { first_hash = Some(hash); } assert_eq!(first_hash, Some(hash)); }

Now I must say I don’t have any strong preferences, so unless someone comes forward with a strong preference for one of these styles I’ll just go forward and land it :)

I also don't have a strong opinion at all. I just saw a bike shed like discussion and felt like jumping in =).

matklad · 2022-09-01T12:33:15Z

@Ekleog-NEAR do you want to apply automerge label here? By default, PR author merges PRs: https://github.com/near/nearcore/blob/master/CONTRIBUTING.md#after-the-pr-is-submitted

Ekleog-NEAR · 2022-09-01T15:59:30Z

Just did it, thank you for the ping!

@akhi3030

There was a compilation determinism issue in our instrumentation, due to iterating over a HashMap when generating the thunks. With this, I add: - A fuzzer that tries to automatically detect non-determinism instances - An output from this fuzzer that generates said non-determinism; auto-generated so that when running `cargo test` the regression would be caught - The fix for the compilation determinism issue per se I've been running the fuzzer locally for a bit to check there's no other obvious instance of compilation non-determinism, and coverage has already doubled since the fuzzer found the crash I checked in so hopefully other instances of compilation non-determinism would be hard to find. cc @akhi3030 @matklad

Ekleog added 3 commits August 19, 2022 15:56

near-vm-runner: add fuzzer checking for compilation determinism

ea3f8ce

near-vm-runner: check-in crash reproducing compilation determinism issue

e26b556

near-vm-runner: fix compilation determinism issue

1e7e378

Ekleog requested a review from a team as a code owner August 19, 2022 15:20

Ekleog requested a review from nikurt August 19, 2022 15:20

near-vm-runner: fix formatting

f6e9300

akhi3030 reviewed Aug 19, 2022

View reviewed changes

nikurt approved these changes Aug 19, 2022

View reviewed changes

matklad approved these changes Aug 22, 2022

View reviewed changes

matklad approved these changes Aug 30, 2022

View reviewed changes

akhi3030 approved these changes Aug 31, 2022

View reviewed changes

Ekleog-NEAR added the S-automerge label Sep 1, 2022

Merge branch 'master' into fix-compilation-determinism-issue

ee75b93

near-bulldozer bot merged commit 93c6372 into near:master Sep 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix compilation determinism issue #7443

Fix compilation determinism issue #7443

Ekleog commented Aug 19, 2022

akhi3030 Aug 19, 2022

matklad Aug 22, 2022

Ekleog-NEAR Aug 30, 2022

matklad left a comment

matklad Aug 22, 2022

matklad Aug 22, 2022

Ekleog-NEAR Aug 30, 2022

Ekleog-NEAR Aug 30, 2022

matklad Aug 30, 2022

Ekleog-NEAR Sep 1, 2022

matklad Aug 22, 2022

matklad Aug 22, 2022

Ekleog-NEAR Aug 30, 2022

matklad Aug 30, 2022

akhi3030 Aug 30, 2022

Ekleog-NEAR Aug 30, 2022

akhi3030 Aug 30, 2022

matklad commented Sep 1, 2022

Ekleog-NEAR commented Sep 1, 2022

Fix compilation determinism issue #7443

Fix compilation determinism issue #7443

Conversation

Ekleog commented Aug 19, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matklad left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matklad commented Sep 1, 2022

Ekleog-NEAR commented Sep 1, 2022