
fuzz: differential V8 engine occasionally crashes #4786

Open
abrown opened this issue Aug 25, 2022 · 5 comments
Labels
bug (Incorrect behavior in the current implementation that needs fixing) · fuzzing (Issues related to our fuzzing infrastructure)

Comments

abrown (Contributor) commented Aug 25, 2022

Test Case

No test case produced.

Steps to Reproduce

$ ALLOWED_ENGINES=-spec,-wasmi cargo +nightly fuzz run differential -s none

Run long enough to hit the crash.

Expected Results

Not to crash.

Actual Results

...
=== Execution rate (1023 successes / 6000 attempted modules): 17.05% ===
        wasmi: 0.00%, spec: 0.00%, wasmtime: 87.70%, v8: 12.30%
        wasm-smith: 44.81%, single-inst: 55.19%
...
#6851   NEW    cov: 20559 ft: 63794 corp: 1188/146Kb lim: 170 exec/s: 360 rss: 106Mb L: 138/170 MS: 2 CopyPart-CopyPart-
#6862   NEW    cov: 20559 ft: 63795 corp: 1189/146Kb lim: 170 exec/s: 361 rss: 106Mb L: 132/170 MS: 1 CopyPart-
#6864   NEW    cov: 20560 ft: 63796 corp: 1190/146Kb lim: 170 exec/s: 361 rss: 106Mb L: 103/170 MS: 2 EraseBytes-ChangeBit-
#6869   NEW    cov: 20563 ft: 63809 corp: 1191/146Kb lim: 170 exec/s: 361 rss: 106Mb L: 169/170 MS: 5 ShuffleBytes-InsertRepeatedBytes-InsertRepeatedBytes-ShuffleBytes-PersAutoDict- DE: "\xff\xff\xff\x07"-
#6988   NEW    cov: 20563 ft: 63838 corp: 1192/146Kb lim: 170 exec/s: 349 rss: 106Mb L: 129/170 MS: 4 InsertRepeatedBytes-ChangeBit-EraseBytes-InsertRepeatedBytes-
────────────────────────────────────────────────────────────────────────────────

Error: Fuzz target exited with signal: 11 (core dumped)

Versions and Environment

Wasmtime version or commit: main

Operating system: Fedora 35

Architecture: x86_64

abrown added the bug label on Aug 25, 2022
abrown (Contributor, Author) commented Aug 25, 2022

cc: @alexcrichton, guess that crash we saw wasn't the spec interpreter after all.

abrown added the fuzzing label on Aug 25, 2022
alexcrichton (Member) commented

Unfortunately I haven't been able to reproduce this locally. When I let the fuzzer run long enough it hit #4812 before hitting this fault.

Can you try running the fuzzer in gdb to see if you can grab a stack trace of the fault (presuming it doesn't take too long to fault)? Using gdb here is tricky, though, because segfaults can happen during normal wasm execution and there's no great way to handle that. handle SIGILL nostop worked for me locally, but with handle SIGSEGV nostop I believe gdb lets the process exit when the actual crashing failure hits, which is no good. So you may have to babysit it a bit and double-check that each SIGSEGV doesn't look like a normal wasm SIGSEGV (rr might be super helpful in this regard).

bjorn3 (Contributor) commented Aug 30, 2022

Maybe you can do handle SIGSEGV nostop but also add a breakpoint where the signal handler propagates the SIGSEGV if it didn't originate from the wasm module?

alexcrichton (Member) commented

I've been unable to fuzz with v8 locally for the longest time, so I tried to dig into this today. It turns out to be an issue with memory protection keys (MPK), so it only affects new enough x64 hardware with new enough glibc. I think that explains why Andrew initially ran into this and I couldn't reproduce it, and it would also help explain why it hasn't shown up on oss-fuzz at all despite being trivial to reproduce locally.

My best read of the situation is that if you run with v8 and differential fuzzing long enough locally it will eventually crash. This crash, when I've inspected it, always looks like this:

Thread 1 "differential" received signal SIGSEGV, Segmentation fault.
0x00002061775c8170 in ?? ()
(gdb) disas $rip,$rip+20
Dump of assembler code from 0x2061775c8170 to 0x2061775c8184:
=> 0x00002061775c8170:  jmp    *0x2(%rip)        # 0x2061775c8178

The memory being accessed is just below %rip itself, and the address 0x00002061775c8178 is indeed readable in gdb. The jump target is in fact always Builtins_WasmLiftoffFrameSetup. The problem is that printing $_siginfo.si_code in gdb yields 4, which corresponds to SEGV_PKUERR.
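As a reference, here's a minimal standalone sketch (Linux-only, using the libc crate; mine, not part of the fuzzer) that distinguishes a pkey fault from an ordinary segfault by checking si_code in a SIGSEGV handler, i.e. the same check $_siginfo does in gdb:

use std::mem;

// SEGV_PKUERR is 4 on Linux: the fault was caused by a pkey/MPK violation.
const SEGV_PKUERR: libc::c_int = 4;

extern "C" fn on_segv(_sig: libc::c_int, info: *mut libc::siginfo_t, _ctx: *mut libc::c_void) {
    let code = unsafe { (*info).si_code };
    let msg: &[u8] = if code == SEGV_PKUERR {
        b"SIGSEGV: SEGV_PKUERR (pkey violation)\n"
    } else {
        b"SIGSEGV: not a pkey fault\n"
    };
    // Only async-signal-safe calls in the handler.
    unsafe {
        libc::write(2, msg.as_ptr().cast(), msg.len());
        libc::_exit(1);
    }
}

fn main() {
    unsafe {
        let mut sa: libc::sigaction = mem::zeroed();
        sa.sa_sigaction = on_segv as usize;
        sa.sa_flags = libc::SA_SIGINFO;
        libc::sigaction(libc::SIGSEGV, &sa, std::ptr::null_mut());
        // Trigger an ordinary (non-pkey) fault to exercise the handler.
        std::ptr::null_mut::<u8>().write_volatile(0);
    }
}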

So the problem here is that this code isn't allowed to be read due to MPK and the current state of the PKRU register. The situation seems to be:

  • V8 looks like it uses a single pkey at most.
  • This is entirely unrelated to Wasmtime's PKRU support, if that's disabled the issue still happens.
  • V8 assumes that JIT code executes with pkey 1.
  • V8 uses pkeys to disallow writes to JIT code with pkey 1.
  • V8 assumes a steady state in which pkey 1 is accessible (executable/readable) but not writable.
  • V8, when it modifies JIT code, frobs the PKRU to temporarily allow writes.

This is all fine and dandy and Wasmtime shouldn't interfere with it, except it does. The problem is apparently signal handlers (it's always signal handlers). V8 assumes a steady-state PKRU register of 0x5..58: no access to keys >= 2, read-only/executable access to pkey 1, and full access to pkey 0. When a signal is received, however, the kernel resets the PKRU register to 0x5..54, which disables access to all keys >= 1, disagreeing with V8's steady state.
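As a sanity check on those bit patterns (this is just my arithmetic, not V8 code): PKRU assigns each pkey two bits, access-disable at bit 2*i and write-disable at bit 2*i+1, so the two values decode as claimed:

// Decode a PKRU value into per-key permissions.
fn decode_pkru(pkru: u32) {
    println!("PKRU = {pkru:#010x}");
    for key in 0..16 {
        let ad = (pkru >> (2 * key)) & 1; // access-disable bit
        let wd = (pkru >> (2 * key + 1)) & 1; // write-disable bit
        println!("  pkey {key:2}: access-disable={ad} write-disable={wd}");
    }
}

fn main() {
    decode_pkru(0x5555_5558); // steady state: pkey 0 free, pkey 1 read/exec only
    decode_pkru(0x5555_5554); // post-signal: pkey 1 fully inaccessible
}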

The problem here is that when Wasmtime executes, hits a signal, recovers, and goes back into V8, V8 is now in a corrupt state: it thinks JIT code can access pkey 1, but it in fact cannot. A reproduction of this problem can be found in this gist, which uses the v8 crate to run some code, then generates and catches a segfault, then runs code again. This second execution crashes.
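The kernel-side reset is also easy to observe standalone. This sketch (again mine, not from the gist; it assumes x86-64 Linux with MPK available, otherwise rdpkru raises SIGILL) reads PKRU before a signal, inside the handler, and after returning. On affected kernels the handler sees the reset value, and recovering via longjmp instead of returning through sigreturn, as Wasmtime's trap handling does, would leave that reset value live:

use std::sync::atomic::{AtomicU32, Ordering};

static PKRU_IN_HANDLER: AtomicU32 = AtomicU32::new(0);

// Read the PKRU register; requires MPK support (CR4.PKE), otherwise #UD.
fn rdpkru() -> u32 {
    let pkru: u32;
    unsafe {
        std::arch::asm!(
            "rdpkru",
            in("ecx") 0u32, // rdpkru requires ecx = 0
            out("eax") pkru,
            out("edx") _, // edx is zeroed by the instruction
        );
    }
    pkru
}

extern "C" fn on_signal(_sig: libc::c_int) {
    PKRU_IN_HANDLER.store(rdpkru(), Ordering::SeqCst);
}

fn main() {
    unsafe {
        libc::signal(libc::SIGSEGV, on_signal as libc::sighandler_t);
        println!("PKRU before signal:   {:#010x}", rdpkru());
        libc::raise(libc::SIGSEGV); // deliver a signal without a real fault
        println!("PKRU inside handler:  {:#010x}", PKRU_IN_HANDLER.load(Ordering::SeqCst));
        println!("PKRU after sigreturn: {:#010x}", rdpkru());
    }
}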


Ok, so what to do about this? I found that v8::V8::set_flags_from_string("--no-memory_protection_keys") looks promising: it fixes the standalone test case here, but it causes V8 to segfault in a different location when passed during fuzzing. According to the pkeys(7) man page, signal handlers need to manually restore the pkey state after a signal is received, and it looks like sig{set,long}jmp don't solve this automatically.
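For reference, the flag has to be passed before V8 is initialized; with the v8 crate that looks roughly like this (standard rusty_v8 initialization, and per the follow-up below, the flag only partially disables MPK):

fn main() {
    // Flags must be set before initialize().
    v8::V8::set_flags_from_string("--no-memory_protection_keys");
    let platform = v8::new_default_platform(0, false).make_shared();
    v8::V8::initialize_platform(platform);
    v8::V8::initialize();
    // ... create an isolate and run scripts as usual ...
}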

I have run out of ideas for how to fix this. If OSS-Fuzz picks up hardware with MPK we'll probably have to disable differential execution with v8 if nothing else changes.

alexcrichton (Member) commented

> That fixes the standalone test case here but it causes V8 to segfault in a different location when passed during fuzzing.

Looks like this flag doesn't fully disable MPK: the crash is still in V8 with si_code = 4, which points to pkey misconfiguration again.
