[fuzzing] wasm2c integration #2772

kripken · 2020-04-16T20:22:07Z

This adds support for fuzzing with wabt's wasm2c that @binji wrote.
Basically we compile the wasm to C, then compile the C to a native
executable with a custom main() to wrap around it. The executable
should then print exactly the same as that wasm when run in either
the binaryen interpreter or in a JS VM with our wrapper JS for that
wasm. In other words, compiling the wasm to C is another way to
run that wasm.
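
As a rough illustration of the comparison idea described above (not the actual harness code), the check boils down to running the same wasm in several ways and flagging any runner whose printed output differs. The function name `find_mismatches`, the runner names, and the output strings below are all made up for this sketch.

```python
def find_mismatches(outputs):
    """Return the names of runners whose output differs from the first runner.

    `outputs` maps a runner name to the text it printed for one wasm file.
    The first entry is treated as the reference.
    """
    names = list(outputs)
    if not names:
        return []
    reference = outputs[names[0]]
    return [name for name in names[1:] if outputs[name] != reference]


# Illustrative data only: in a real run each string would be captured stdout.
agreeing = {
    "interpreter": "result: 42\n",
    "js_vm": "result: 42\n",
    "wasm2c_native": "result: 42\n",
}
disagreeing = dict(agreeing, wasm2c_native="result: 41\n")

print(find_mismatches(agreeing))
print(find_mismatches(disagreeing))
```

Any non-empty result would indicate a bug in one of the runners (or in the wrapper code), which is what makes each additional way of running the wasm useful to the fuzzer.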

I ran this for many hours and could not find any bugs in wasm2c.
Nice work there!

The main reasons I want this are to fuzz wasm2c itself, and to
have another option for fuzzing emcc. For the latter, we do fuzz
wasm-opt quite a lot, but that doesn't fuzz the non-wasm-opt
parts of emcc. And using wasm2c for that is nice since the
starting point is always a wasm file, which means we
can use tools like wasm-reduce and so forth, which can be
integrated with this fuzzer.

This also:

* Refactors the fuzzer harness a little to make it easier to
  add more "VMs" to run wasms in.
* Stops autoreducing when re-running a testcase, a problem I hit
  while developing this.

src/tools/wasm-opt.cpp

sbc100 · 2020-04-21T18:56:17Z

+    std::ofstream outfile;
+    outfile.open(emitWasm2CWrapper, std::ofstream::out);
+    outfile << generateWasm2CWrapper(wasm);
+    outfile.close();


I'm not sure why generating this wrapper makes sense as part of wasm-opt. It seems like a separate function to me.

It could be a new tool I suppose, but we have 3 wrapper generators now (spec, js, wasm2c), and it's convenient to run them from wasm-opt so you can emit the wrapper as you generate the fuzz code, in a single invocation.

idk, it seems similar to a lot of other auxiliary functionality in wasm-opt like the fuzzing stuff or the JS wrappers.

Oh I see, there is a precedent here. In that case that's fine for now.

But maybe as a followup this should be a separate tool? So you would run wasm-opt and then wasm-generate-fuzz-wrapper -type=wasm2c, or something less clunky than that.

Yeah, maybe that's better. Another option would be to make all 3 of those be passes.

scripts/fuzz_opt.py

tlively · 2020-04-17T00:58:51Z

+            # No legalization for JS means we can't compare JS to others, as any
+            # illegal export will fail immediately.
+            vm = self.vms[i]
+            if vm.can_compare_to_others() and results[i] is not None:


It seems simpler to me to keep track of the used vms by making each element of results a tuple of a vm and its fixed output, rather than inserting holes into results. I think a list comprehension for this would be nice.

I changed to use tuples, but am not quite sure where you wanted a list comprehension?
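
One possible reading of the suggestion, sketched under assumptions: build the list of `(vm, output)` tuples with a list comprehension that skips VMs that cannot be compared, so no `None` holes are needed. Only `can_compare_to_others()` comes from the diff above; `FakeVM`, its `run` method, and `comparable_results` are hypothetical stand-ins, not the real fuzz_opt.py code.

```python
class FakeVM:
    """Hypothetical stand-in for a VM wrapper in the fuzzer harness."""

    def __init__(self, name, comparable, output):
        self.name = name
        self.comparable = comparable
        self.output = output

    def can_compare_to_others(self):
        return self.comparable

    def run(self, wasm):
        # A real VM would execute `wasm`; this fake returns canned text.
        return self.output


def comparable_results(vms, wasm):
    """Pair each comparable VM with its output, with no holes in the list."""
    return [(vm, vm.run(wasm)) for vm in vms if vm.can_compare_to_others()]


vms = [
    FakeVM("interpreter", True, "result: 1\n"),
    FakeVM("js_without_legalization", False, "error\n"),
    FakeVM("wasm2c", True, "result: 1\n"),
]
results = comparable_results(vms, "a.wasm")
first_vm, first_output = results[0]
for vm, output in results[1:]:
    assert output == first_output, f"{vm.name} differs from {first_vm.name}"
print([vm.name for vm, _ in results])
```

Keeping the VM next to its output also means an error message can name the VMs that disagree without indexing back into a parallel list.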

tlively · 2020-04-21T19:09:24Z

src/tools/wasm-opt.cpp

+    std::ofstream outfile;
+    outfile.open(emitWasm2CWrapper, std::ofstream::out);
+    outfile << generateWasm2CWrapper(wasm);
+    outfile.close();


idk, it seems similar to a lot of other auxiliary functionality in wasm-opt like the fuzzing stuff or the JS wrappers.

src/tools/wasm2c-wrapper.hpp

tlively · 2020-04-21T19:15:33Z

+
+)";
+
+  for (auto& exp : wasm.exports) {


I'm also having trouble understanding what this loop is adding and why. More comments would be very welcome.

Sorry about that, I added a bunch more comments now.

kripken · 2020-04-21T19:22:23Z

Review comments addressed, but the clang-tidy stuff is an increasing rabbit hole... the one file I added is now leading it to find interesting problems in many other files that existed before, but I guess it never scanned them until this PR.

See for example the last commit for the stuff I need to do here, and there is probably a lot left: c32f148

What's the best thing to do here? Not sure if I'm doing the right thing in trying to fix these... cc @aheejin

tlively · 2020-04-21T19:27:55Z

Are we enforcing clang-tidy now? Before it was just clang-format.

sbc100 · 2020-04-21T19:36:58Z

> Are we enforcing clang-tidy now? Before it was just clang-format.

I think the intention was that we were enforcing it; at least, we were running that script. I did fix a couple of possible bugs in the script leading up to the GitHub Actions transition. @aheejin certainly believed it to be previously working.

Do we not want to enforce it? Does it generate false positives?

kripken · 2020-04-21T20:02:04Z

> For the latter case of wanting to fuzz emcc, are you basically just using wasm2c here as a way to generate C source? Kind of like csmith, but where the C code itself comes from a wasm file? Kind of like using the wasm input as a seed to generate a bunch of C that might trip emcc up?

Yeah, that's basically it. It's easy to do and gives another source of C programs for fuzzing. And we have good tools around this for reduction on the wasm, options like no OOB and no NaNs, etc., already working on wasm.

But also, this helps fuzz wasm2c itself. I think that has some cool use cases, and it's nice to know it's been fuzzed before recommending it broadly.

aheejin · 2020-04-22T02:23:12Z

> Review comments addressed, but the clang-tidy stuff is an increasing rabbit hole... the one file I added is now leading it to find interesting problems in many other files that existed before, but I guess it never scanned them until this PR.
>
> See for example the last commit for the stuff I need to do here, and there is probably a lot left: c32f148
>
> What's the best thing to do here? Not sure if I'm doing the right thing in trying to fix these... cc @aheejin

I can take a deeper look and run clang-tidy locally to see what's happening, but I think our current .clang-tidy effectively only enables readability-braces-around-statements, which enforces {} on one-line ifs and loops. What kind of clang-tidy errors are you seeing?

Also, if code is in a header file, it is checked only when it is included in a source file IIRC.

sbc100 · 2020-04-22T02:46:36Z

Sorry, I found that clang-tidy problem. I'd forgotten that it needs a compilation database in order to be useful.

tlively

👍

tlively · 2020-04-23T02:37:04Z

@kripken I have WABT installed locally, but when I run the fuzzer with this change I am getting

clang-9: error: no such file or directory: '/usr/local/google/home/tlively/local/wasm2c/wasm-rt-impl.c'

Do you know what that might be about? cc @binji

kripken · 2020-04-23T02:46:26Z

The code assumes you build wabt with a build dir in the wabt root. So if it finds wasm2c at /a/my_wabt_root/build_dir/wasm2c, it can find the wasm2c support files directory at /a/my_wabt_root/wasm2c/. Do you have just the binaries, maybe, or a different kind of build dir setup?
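
The lookup rule described above can be sketched as follows. This is an illustration of the stated assumption (binary lives one directory below the wabt root), not the fuzzer's actual code; the function name `wasm2c_support_dir` and the `/home/user/...` path are hypothetical.

```python
from pathlib import PurePosixPath


def wasm2c_support_dir(wasm2c_binary):
    """Guess the wasm2c support directory from the binary's location.

    Assumes the layout <wabt_root>/<build_dir>/wasm2c for the binary, so the
    support files are expected at <wabt_root>/wasm2c/.
    """
    binary = PurePosixPath(wasm2c_binary)
    wabt_root = binary.parent.parent
    return wabt_root / "wasm2c"


# The in-tree build layout from the comment above.
print(wasm2c_support_dir("/a/my_wabt_root/build_dir/wasm2c"))

# An installed binary (hypothetical path): the same rule now points at a
# directory next to bin/, which may not exist.
print(wasm2c_support_dir("/home/user/local/bin/wasm2c"))
```

Under this rule, a binary installed into a `bin` directory yields a support path beside `bin`, which is consistent with the shape of the missing-file path in the error reported above.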

tlively · 2020-04-23T08:20:41Z

My build dir is in the wabt root, but after I build I do ninja install to install the binaries to my ~/local/bin directory. Is the runtime implementation something that should be installed alongside the binaries?

kripken · 2020-04-23T13:50:18Z

Oh, yes, we need those runtime support files to build with. We can't copy them into binaryen here because they need to match what wabt emits. If there's an install step, perhaps it should install those then?

binji · 2020-04-23T15:30:09Z

Yeah, I guess it should... maybe to /usr/share?

kripken added 30 commits January 14, 2020 11:16

wip [ci skip]

225e383

moar [ci skip]

9bbef19

Merge remote-tracking branch 'origin/master' into refuzz

cb0637f

work [ci skip]

8d47a6c

fix

3d849ef

[ci skip]

9302a8c

[ci skip]

e430048

[ci skip]

eab9225

[ci skip]

ab7b31c

[ci skip]

acdf1b5

[ci skip]

7f0feab

[ci skip]

d5c6347

[ci skip]

5d0fe4c

[ci skip]

c863d12

trap on unaligned atomics

f68b197

[ci skip]

33f7c29

[ci skip]

b89815f

wrap [ci skip]

594fb6b

notifyOOB [ci skip]

09ccac9

[ci skip]

ddde0f0

fix

091213f

[ci skip]

225c852

fix

93eb265

[ci skip]

b1e0da7

moar [ci skip]

d006b8b

[ci skip]

d243cf6

apply [ci skip]

0659b54

apply [ci skip]

cd42dc7

test

30ab1f0

[ci skip]

91dd33c

sbc100 reviewed Apr 21, 2020

kripken added 3 commits April 21, 2020 12:06

more tidying

f417436

review feedback

f204459

more tidy

7e47ee9

tlively reviewed Apr 21, 2020

moar tidy

c32f148

kripken added 4 commits April 21, 2020 18:27

Merge remote-tracking branch 'origin/master' into wasm2c

4c076d4

Revert "moar tidy"

7909e6e

This reverts commit c32f148.

Revert "more tidy"

a7b643c

This reverts commit 7e47ee9.

Revert "more tidy attempts"

f7d1510

This reverts commit 69d93a0.

kripken added 3 commits April 22, 2020 10:41

review feedback

c8df1a4

Merge remote-tracking branch 'origin/master' into wasm2c

119e399

comments

32ede3a

tlively approved these changes Apr 22, 2020

kripken merged commit 35a36b1 into master Apr 22, 2020

kripken deleted the wasm2c branch April 22, 2020 19:11

This was referenced Jul 28, 2020

WasmBoxC: Simple Easy and Fast VM-less Sandboxing guevara/read-it-later#6937

Open

WasmBoxC: Simple Easy and Fast VM-less Sandboxing guevara/read-it-later#6938

Open

WasmBoxC: Simple Easy and Fast VM-less Sandboxing guevara/read-it-later#6939

Open
