[fuzzing] wasm2c integration #2772

Merged
merged 139 commits into from
Apr 22, 2020
Changes from 127 commits
225e383
wip [ci skip]
kripken Jan 14, 2020
9bbef19
moar [ci skip]
kripken Jan 14, 2020
cb0637f
Merge remote-tracking branch 'origin/master' into refuzz
kripken Apr 10, 2020
8d47a6c
work [ci skip]
kripken Apr 10, 2020
3d849ef
fix
kripken Apr 10, 2020
9302a8c
[ci skip]
kripken Apr 10, 2020
e430048
[ci skip]
kripken Apr 10, 2020
eab9225
[ci skip]
kripken Apr 10, 2020
ab7b31c
[ci skip]
kripken Apr 10, 2020
acdf1b5
[ci skip]
kripken Apr 10, 2020
7f0feab
[ci skip]
kripken Apr 10, 2020
d5c6347
[ci skip]
kripken Apr 11, 2020
5d0fe4c
[ci skip]
kripken Apr 11, 2020
c863d12
[ci skip]
kripken Apr 11, 2020
f68b197
trap on unaligned atomics
kripken Apr 11, 2020
33f7c29
[ci skip]
kripken Apr 11, 2020
b89815f
[ci skip]
kripken Apr 11, 2020
594fb6b
wrap [ci skip]
kripken Apr 11, 2020
09ccac9
notifyOOB [ci skip]
kripken Apr 11, 2020
ddde0f0
[ci skip]
kripken Apr 11, 2020
091213f
fix
kripken Apr 11, 2020
225c852
[ci skip]
kripken Apr 11, 2020
93eb265
fix
kripken Apr 11, 2020
b1e0da7
[ci skip]
kripken Apr 11, 2020
d006b8b
moar [ci skip]
kripken Apr 11, 2020
d243cf6
[ci skip]
kripken Apr 11, 2020
0659b54
apply [ci skip]
kripken Apr 11, 2020
cd42dc7
apply [ci skip]
kripken Apr 11, 2020
30ab1f0
test
kripken Apr 11, 2020
91dd33c
[ci skip]
kripken Apr 11, 2020
2008363
[ci skip]
kripken Apr 11, 2020
66c39c5
python3
kripken Apr 11, 2020
bfee084
[ci skip]
kripken Apr 11, 2020
e1266c5
[ci skip]
kripken Apr 11, 2020
1b6683b
[ci skip]
kripken Apr 11, 2020
cc14e69
[ci skip]
kripken Apr 11, 2020
96afbad
go
kripken Apr 11, 2020
0247207
Merge remote-tracking branch 'origin/atomic4' into atomic4
kripken Apr 11, 2020
d8e3a69
restore size
kripken Apr 11, 2020
70f6124
seeds [ci skip]
kripken Apr 11, 2020
06ffa35
back
kripken Apr 12, 2020
afb1990
[ci skip]
kripken Apr 12, 2020
94192aa
unsigned
kripken Apr 12, 2020
3e2af17
test
kripken Apr 12, 2020
48ce612
style
kripken Apr 12, 2020
7bb358b
tests
kripken Apr 12, 2020
7d6e545
fix warning
kripken Apr 12, 2020
abf1709
[ci skip]
kripken Apr 12, 2020
d0d9005
[ci skip]
kripken Apr 12, 2020
22a1934
[ci skip]
kripken Apr 12, 2020
85b9a39
fixes
kripken Apr 12, 2020
33212c9
more [ci skip]
kripken Apr 12, 2020
fd71421
style [ci skip]
kripken Apr 12, 2020
1d2cf75
[ci skip]
kripken Apr 12, 2020
89f829a
[ci skip]
kripken Apr 12, 2020
d5f6fde
[ci skip]
kripken Apr 12, 2020
a6b64a1
more [ci skip]
kripken Apr 12, 2020
f56102a
test [ci skip]
kripken Apr 12, 2020
45ce956
[ci skip]
kripken Apr 12, 2020
72e7597
[ci skip]
kripken Apr 12, 2020
5d07e34
[ci skip]
kripken Apr 12, 2020
d947ed4
[ci skip]
kripken Apr 12, 2020
65e31e3
style [ci skip]
kripken Apr 12, 2020
2da1a85
[ci skip]
kripken Apr 12, 2020
e09ca94
[ci skip]
kripken Apr 12, 2020
2a82b4b
wasm2js
kripken Apr 12, 2020
cc0d759
style [ci skip]
kripken Apr 12, 2020
3235411
Merge remote-tracking branch 'origin/atomic4' into refuzz
kripken Apr 12, 2020
16f0264
Merge remote-tracking branch 'origin/master' into refuzz
kripken Apr 12, 2020
6379f47
fix [ci skip]
kripken Apr 12, 2020
26e0fb3
more
kripken Apr 12, 2020
9c8f498
Merge remote-tracking branch 'origin/master' into refuzz
kripken Apr 12, 2020
6051c47
[ci skip]
kripken Apr 12, 2020
764a328
[ci skip]
kripken Apr 12, 2020
dce9c46
[ci skip]
kripken Apr 12, 2020
09abf7b
[ci skip]
kripken Apr 12, 2020
de84768
[ci skip]
kripken Apr 13, 2020
b24b2e2
[ci skip]
kripken Apr 13, 2020
f2aad14
[ci skip]
kripken Apr 13, 2020
6c529d6
[ci skip]
kripken Apr 13, 2020
c603bff
[ci skip]
kripken Apr 13, 2020
5040e0a
test [ci skip]
kripken Apr 13, 2020
788948b
[ci skip]
kripken Apr 13, 2020
7f51bfe
Merge remote-tracking branch 'origin/master' into refuzz
kripken Apr 13, 2020
5970c7e
[ci skip]
kripken Apr 13, 2020
56e2e76
Merge remote-tracking branch 'origin/master' into refuzz
kripken Apr 13, 2020
0538fd0
Merge remote-tracking branch 'origin/master' into refuzz
kripken Apr 13, 2020
73006a4
exec
kripken Apr 13, 2020
28a4193
fix
kripken Apr 13, 2020
59f1e0c
update test
kripken Apr 14, 2020
e1e45b0
remove unit tests as they fail on the bots; will investigate and foll…
kripken Apr 14, 2020
575ec16
more [ci skip]
kripken Apr 15, 2020
712b614
num => num_lines
kripken Apr 15, 2020
12ef3f2
Merge branch 'refuzz2' into refuzz
kripken Apr 15, 2020
d6d995b
more feedback
kripken Apr 15, 2020
b2bfa3c
fix
kripken Apr 15, 2020
95ee6c8
wasm2c fuzzing wip [ci skip]
kripken Apr 15, 2020
e30a42c
wasm2c fuzzing wip [ci skip]
kripken Apr 15, 2020
1d00799
[ci skip]
kripken Apr 15, 2020
aa0ab6d
more [ci skip]
kripken Apr 15, 2020
cc36b4b
more [ci skip]
kripken Apr 15, 2020
4eefb67
more [ci skip]
kripken Apr 15, 2020
63b7477
more [ci skip]
kripken Apr 15, 2020
ca11613
more [ci skip]
kripken Apr 15, 2020
ad60120
more [ci skip]
kripken Apr 15, 2020
f89f0dc
more [ci skip]
kripken Apr 15, 2020
1db10c8
more [ci skip]
kripken Apr 15, 2020
8d02b2c
more [ci skip]
kripken Apr 16, 2020
7bc435c
more [ci skip]
kripken Apr 16, 2020
524bd32
more [ci skip]
kripken Apr 16, 2020
f711379
[ci skip]
kripken Apr 16, 2020
84ee869
more [ci skip]
kripken Apr 16, 2020
2ff63f9
more [ci skip]
kripken Apr 16, 2020
a780987
[ci skip]
kripken Apr 16, 2020
67d25de
Merge remote-tracking branch 'origin/master' into wasm2c
kripken Apr 16, 2020
b2d7b32
fixes [ci skip]
kripken Apr 16, 2020
561dd26
style
kripken Apr 16, 2020
d55a7b1
fixes [ci skip]
kripken Apr 16, 2020
b1558e6
fix
kripken Apr 16, 2020
99f7c0f
nicer
kripken Apr 16, 2020
8ac8042
fixes
kripken Apr 16, 2020
ba396ee
Disable multivalue in fuzzer in a clearer way
kripken Apr 16, 2020
983e037
Merge remote-tracking branch 'origin/multifuzz' into wasm2c
kripken Apr 16, 2020
16ca12b
Merge remote-tracking branch 'origin/master' into wasm2c
kripken Apr 16, 2020
34b82e0
what year is this [ci skip]
kripken Apr 16, 2020
001ecc9
Merge remote-tracking branch 'origin/master' into wasm2c
kripken Apr 21, 2020
66c98ee
fix one clang-tidy issue that is unrelated to this PR
kripken Apr 21, 2020
69d93a0
more tidy attempts
kripken Apr 21, 2020
f417436
more tidying
kripken Apr 21, 2020
f204459
review feedback
kripken Apr 21, 2020
7e47ee9
more tidy
kripken Apr 21, 2020
c32f148
moar tidy
kripken Apr 21, 2020
4c076d4
Merge remote-tracking branch 'origin/master' into wasm2c
kripken Apr 22, 2020
7909e6e
Revert "moar tidy"
kripken Apr 22, 2020
a7b643c
Revert "more tidy"
kripken Apr 22, 2020
f7d1510
Revert "more tidy attempts"
kripken Apr 22, 2020
c8df1a4
review feedback
kripken Apr 22, 2020
119e399
Merge remote-tracking branch 'origin/master' into wasm2c
kripken Apr 22, 2020
32ede3a
comments
kripken Apr 22, 2020
169 changes: 114 additions & 55 deletions scripts/fuzz_opt.py
@@ -292,27 +292,80 @@ def count_runs(self):
 # Run VMs and compare results

 class VM:
-    def __init__(self, name, run, deterministic_nans, requires_legalization):
+    def __init__(self, name, run, can_run, can_compare_to_self, can_compare_to_others):
         self.name = name
         self.run = run
-        self.deterministic_nans = deterministic_nans
-        self.requires_legalization = requires_legalization
+        self.can_run = can_run
+        self.can_compare_to_self = can_compare_to_self
+        self.can_compare_to_others = can_compare_to_others


 class CompareVMs(TestCaseHandler):
     def __init__(self):
         super(CompareVMs, self).__init__()

-        def run_binaryen_interpreter(wasm):
+        def byn_run(wasm):
             return run_bynterp(wasm, ['--fuzz-exec-before'])

-        def run_v8(wasm):
+        def v8_run(wasm):
             run([in_bin('wasm-opt'), wasm, '--emit-js-wrapper=' + wasm + '.js'] + FEATURE_OPTS)
             return run_vm([shared.V8, wasm + '.js'] + shared.V8_OPTS + ['--', wasm])

+        def yes():
+            return True
+
+        def if_legal_and_no_nans():
+            return LEGALIZE and not NANS
+
+        def if_no_nans():
+            return not NANS
+
+        class Wasm2C(VM):
+            name = 'wasm2c'
+
+            def __init__(self):
+                # look for wabt in the path. if it's not here, don't run wasm2c
+                try:
+                    wabt_bin = run(['whereis', 'wasm2c'])
+                    # whereis returns wasm2c: PATH
+                    wabt_bin = wabt_bin.split()[-1]
+                    wabt_root = os.path.dirname(os.path.dirname(wabt_bin))
+                    self.wasm2c_dir = os.path.join(wabt_root, 'wasm2c')
+                except Exception as e:
+                    print('warning: no wabt found:', e)
+                    self.wasm2c_dir = None
+
+            def can_run(self):
+                if self.wasm2c_dir is None:
+                    return False
+                # if we legalize for JS, the ABI is not what C wants
+                if LEGALIZE:
+                    return False
+                # wasm2c doesn't support most features
+                return all([x in FEATURE_OPTS for x in ['--disable-exception-handling', '--disable-simd', '--disable-threads', '--disable-bulk-memory', '--disable-nontrapping-float-to-int', '--disable-tail-call', '--disable-sign-ext', '--disable-reference-types', '--disable-multivalue']])
+
+            def run(self, wasm):
+                run([in_bin('wasm-opt'), wasm, '--emit-wasm2c-wrapper=main.c'] + FEATURE_OPTS)
+                run(['wasm2c', wasm, '-o', 'wasm.c'])
+                compile_cmd = ['clang', 'main.c', 'wasm.c', os.path.join(self.wasm2c_dir, 'wasm-rt-impl.c'), '-I' + self.wasm2c_dir, '-lm', '-Werror']
+                run(compile_cmd)
+                return run_vm(['./a.out'])
+
+            def can_compare_to_self(self):
+                # The binaryen optimizer changes NaNs in the ways that wasm
+                # expects, but that's not quite what C has
+                return not NANS
+
+            def can_compare_to_others(self):
+                # C won't trap on OOB, and NaNs can differ from wasm VMs
+                return not OOB and not NANS
+
         self.vms = [
-            VM('binaryen interpreter', run_binaryen_interpreter, deterministic_nans=True, requires_legalization=False),
-            VM('d8', run_v8, deterministic_nans=False, requires_legalization=True),
+            VM('binaryen interpreter', byn_run, can_run=yes, can_compare_to_self=yes, can_compare_to_others=yes),
+            # with nans, VM differences can confuse us, so only very simple VMs can compare to themselves after opts in that case.
+            # if not legalized, the JS will fail immediately, so no point to compare to others
+            VM('d8', v8_run, can_run=yes, can_compare_to_self=if_no_nans, can_compare_to_others=if_legal_and_no_nans),
+            Wasm2C()
         ]

     def handle_pair(self, input, before_wasm, after_wasm, opts):
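Editorial sketch, not part of the PR: the `Wasm2C` constructor in this hunk finds the wabt checkout by shelling out to `whereis` and taking two `dirname` steps up from the binary. A portable way to do the same lookup might use `shutil.which` from the standard library, which returns `None` when the tool is not on the search path. The function name `find_wasm2c_dir` and its `path` parameter are hypothetical. The directory layout (binary in `<root>/bin/`, runtime sources in `<root>/wasm2c/`) is the assumption the diff already makes.

```python
import os
import shutil


def find_wasm2c_dir(tool='wasm2c', path=None):
    """Return <wabt root>/wasm2c if `tool` is on the search path, else None.

    Hypothetical helper: `path` overrides the PATH lookup, mainly for testing.
    """
    tool_path = shutil.which(tool, path=path)
    if tool_path is None:
        return None
    # Same layout assumption as the diff: the binary sits one directory below
    # the wabt root, and wasm-rt-impl.c lives in <root>/wasm2c/.
    wabt_root = os.path.dirname(os.path.dirname(os.path.abspath(tool_path)))
    return os.path.join(wabt_root, 'wasm2c')
```

A caller would treat a `None` result the same way the diff treats `self.wasm2c_dir = None`, by skipping the wasm2c VM.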
@@ -323,29 +376,32 @@ def handle_pair(self, input, before_wasm, after_wasm, opts):
     def run_vms(self, wasm):
         results = []
         for vm in self.vms:
-            results.append(fix_output(vm.run(wasm)))
+            # when a vm can't run, mark the result as None
+            if vm.can_run():
+                results.append(fix_output(vm.run(wasm)))
+            else:
+                results.append(None)

         # compare between the vms on this specific input

-        # NaNs are a source of nondeterminism between VMs; don't compare them.
-        if not NANS:
-            first = None
-            for i in range(len(results)):
-                # No legalization for JS means we can't compare JS to others, as any
-                # illegal export will fail immediately.
-                if LEGALIZE or not vm.requires_legalization:
-                    if first is None:
-                        first = i
-                    else:
-                        compare_between_vms(results[first], results[i], 'CompareVMs between VMs: ' + self.vms[first].name + ' and ' + self.vms[i].name)
+        first = None
+        for i in range(len(results)):
+            # No legalization for JS means we can't compare JS to others, as any
+            # illegal export will fail immediately.
+            vm = self.vms[i]
+            if vm.can_compare_to_others() and results[i] is not None:
Member

It seems simpler to me to keep track of the used vms by making each element of results a tuple (of a vm and its fixed output) rather than inserting holes into results. I think a list comprehension for this would be nice.

Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to use tuples, but am not quite sure where you wanted a list comprehension?
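
Editorial sketch, not part of the PR: one way to read the suggestion in this thread is to build `results` as `(vm, output)` pairs with a list comprehension, so VMs that cannot run are simply absent instead of leaving `None` holes. `FakeVM` and `compare_across_vms` are hypothetical names used only for illustration. The `fix_output` step from the diff is omitted, and the `compare` callback stands in for the script's `compare_between_vms`.

```python
class FakeVM:
    """Hypothetical stand-in for the VM class in the diff."""

    def __init__(self, name, output, runnable=True, comparable=True):
        self.name = name
        self._output = output
        self._runnable = runnable
        self._comparable = comparable

    def run(self, wasm):
        return self._output

    def can_run(self):
        return self._runnable

    def can_compare_to_others(self):
        return self._comparable


def run_vms(vms, wasm):
    # Pair each VM that can run with its output; VMs that cannot run are
    # left out, so no None placeholders are needed.
    return [(vm, vm.run(wasm)) for vm in vms if vm.can_run()]


def compare_across_vms(results, compare):
    # Compare the first comparable VM against every other comparable VM.
    comparable = [(vm, out) for vm, out in results if vm.can_compare_to_others()]
    for vm, out in comparable[1:]:
        first_vm, first_out = comparable[0]
        compare(first_out, out, 'CompareVMs between VMs: ' + first_vm.name + ' and ' + vm.name)
```

With pairs, the before/after comparison no longer needs to index back into `self.vms`, since each output carries its VM with it.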

+                if first is None:
+                    first = i
+                else:
+                    compare_between_vms(results[first], results[i], 'CompareVMs between VMs: ' + self.vms[first].name + ' and ' + vm.name)

         return results

     def compare_before_and_after(self, before, after):
         # compare each VM to itself on the before and after inputs
         for i in range(len(before)):
             vm = self.vms[i]
-            if vm.deterministic_nans:
+            if vm.can_compare_to_self() and before[i] is not None:
                 compare(before[i], after[i], 'CompareVMs between before and after: ' + vm.name)

     def can_run_on_feature_opts(self, feature_opts):
@@ -487,7 +543,7 @@ def can_run_on_feature_opts(self, feature_opts):


 # Do one test, given an input file for -ttf and some optimizations to run
-def test_one(random_input, opts):
+def test_one(random_input, opts, allow_autoreduce):
     randomize_pass_debug()
     randomize_feature_opts()
     randomize_fuzz_settings()
@@ -535,40 +591,41 @@ def write_commands_and_test(opts):
     try:
         write_commands_and_test(opts)
     except subprocess.CalledProcessError:
-        print('')
-        print('====================')
-        print('Found a problem! See "t.sh" for the commands, and "input.wasm" for the input. Auto-reducing to "reduced.wasm" and "tt.sh"...')
-        print('====================')
-        print('')
-        # first, reduce the fuzz opts: keep removing until we can't
-        while 1:
-            reduced = False
-            for i in range(len(opts)):
-                # some opts can't be removed, like --flatten --dfo requires flatten
-                if opts[i] == '--flatten':
-                    if i != len(opts) - 1 and opts[i + 1] in ('--dfo', '--local-cse', '--rereloop'):
-                        continue
-                shorter = opts[:i] + opts[i + 1:]
-                try:
-                    write_commands_and_test(shorter)
-                except subprocess.CalledProcessError:
-                    # great, the shorter one is good as well
-                    opts = shorter
-                    print('reduced opts to ' + ' '.join(opts))
-                    reduced = True
+        if allow_autoreduce:
+            print('')
+            print('====================')
+            print('Found a problem! See "t.sh" for the commands, and "input.wasm" for the input. Auto-reducing to "reduced.wasm" and "tt.sh"...')
+            print('====================')
+            print('')
+            # first, reduce the fuzz opts: keep removing until we can't
+            while 1:
+                reduced = False
+                for i in range(len(opts)):
+                    # some opts can't be removed, like --flatten --dfo requires flatten
+                    if opts[i] == '--flatten':
+                        if i != len(opts) - 1 and opts[i + 1] in ('--dfo', '--local-cse', '--rereloop'):
+                            continue
+                    shorter = opts[:i] + opts[i + 1:]
+                    try:
+                        write_commands_and_test(shorter)
+                    except subprocess.CalledProcessError:
+                        # great, the shorter one is good as well
+                        opts = shorter
+                        print('reduced opts to ' + ' '.join(opts))
+                        reduced = True
+                        break
+                if not reduced:
                     break
-            if not reduced:
-                break
-        # second, reduce the wasm
-        # copy a.wasm to a safe place as the reducer will use the commands on new inputs, and the commands work on a.wasm
-        shutil.copyfile('a.wasm', 'input.wasm')
-        # add a command to verify the input. this lets the reducer see that it is indeed working on the input correctly
-        commands = [in_bin('wasm-opt') + ' -all a.wasm'] + get_commands(opts)
-        write_commands(commands, 'tt.sh')
-        # reduce the input to something smaller with the same behavior on the script
-        subprocess.check_call([in_bin('wasm-reduce'), 'input.wasm', '--command=bash tt.sh', '-t', 'a.wasm', '-w', 'reduced.wasm'])
-        print('Finished reduction. See "tt.sh" and "reduced.wasm".')
-        raise Exception('halting after autoreduction')
+            # second, reduce the wasm
+            # copy a.wasm to a safe place as the reducer will use the commands on new inputs, and the commands work on a.wasm
+            shutil.copyfile('a.wasm', 'input.wasm')
+            # add a command to verify the input. this lets the reducer see that it is indeed working on the input correctly
+            commands = [in_bin('wasm-opt') + ' -all a.wasm'] + get_commands(opts)
+            write_commands(commands, 'tt.sh')
+            # reduce the input to something smaller with the same behavior on the script
+            subprocess.check_call([in_bin('wasm-reduce'), 'input.wasm', '--command=bash tt.sh', '-t', 'a.wasm', '-w', 'reduced.wasm'])
+            print('Finished reduction. See "tt.sh" and "reduced.wasm".')
+            raise Exception('halting after autoreduction')
     print('')

     # create a second wasm for handlers that want to look at pairs.
@@ -736,7 +793,9 @@ def randomize_opt_flags():
         opts = randomize_opt_flags()
         print('randomized opts:', ' '.join(opts))
         try:
-            total_wasm_size += test_one(raw_input_data, opts)
+            # don't autoreduce if we are given a specific case to test, as this
+            # is a reproduction of the test case, not the first finding of it
+            total_wasm_size += test_one(raw_input_data, opts, allow_autoreduce=given_seed is None)
         except KeyboardInterrupt:
             print('(stopping by user request)')
             break
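
Editorial sketch, not part of the PR: the option-reduction step in `test_one` ("keep removing until we can't") is a greedy loop that drops one option at a time and restarts whenever the shorter list still reproduces the failure. The version below isolates that idea under two assumptions. The failure check is a plain predicate, `still_fails`, instead of catching `subprocess.CalledProcessError`. The `--flatten` special case is generalized into a hypothetical `required_before` mapping.

```python
def reduce_opts(opts, still_fails, required_before=None):
    """Greedily drop options while still_fails(shorter_opts) stays True.

    required_before maps an option to the options that need it immediately
    before them, e.g. {'--flatten': ('--dfo', '--local-cse', '--rereloop')}.
    """
    required_before = required_before or {}
    while True:
        reduced = False
        for i in range(len(opts)):
            # Keep an option when the option right after it depends on it.
            following = opts[i + 1] if i + 1 < len(opts) else None
            if following in required_before.get(opts[i], ()):
                continue
            shorter = opts[:i] + opts[i + 1:]
            if still_fails(shorter):
                # The shorter list still shows the problem; restart from it.
                opts = shorter
                reduced = True
                break
        if not reduced:
            return opts
```

Each successful removal restarts the scan, so the loop ends only when no single option can be dropped. As in the original, this finds a locally minimal list, not necessarily the smallest possible one.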
1 change: 0 additions & 1 deletion src/tools/execution-results.h
@@ -18,7 +18,6 @@
 // Shared execution result checking code
 //

-#include "ir/import-utils.h"
 #include "shell-interface.h"
 #include "wasm.h"

17 changes: 16 additions & 1 deletion src/tools/wasm-opt.cpp
@@ -37,6 +37,7 @@
 #include "wasm-printing.h"
 #include "wasm-s-parser.h"
 #include "wasm-validator.h"
+#include "wasm2c-wrapper.h"

 #define DEBUG_TYPE "opt"

@@ -87,6 +88,7 @@ int main(int argc, const char* argv[]) {
   bool fuzzOOB = true;
   std::string emitJSWrapper;
   std::string emitSpecWrapper;
+  std::string emitWasm2CWrapper;
   std::string inputSourceMapFilename;
   std::string outputSourceMapFilename;
   std::string outputSourceMapUrl;
@@ -185,6 +187,14 @@ int main(int argc, const char* argv[]) {
          [&](Options* o, const std::string& arguments) {
            emitSpecWrapper = arguments;
          })
+    .add("--emit-wasm2c-wrapper",
+         "-esw",
+         "Emit a C wrapper file that can run the wasm after it is compiled "
+         "with wasm2c, useful for fuzzing",
+         Options::Arguments::One,
+         [&](Options* o, const std::string& arguments) {
+           emitWasm2CWrapper = arguments;
+         })
     .add("--input-source-map",
          "-ism",
          "Consume source map from the specified file",
@@ -293,13 +303,18 @@ int main(int argc, const char* argv[]) {
     outfile << generateJSWrapper(wasm);
     outfile.close();
   }
-
   if (emitSpecWrapper.size() > 0) {
     std::ofstream outfile;
     outfile.open(emitSpecWrapper, std::ofstream::out);
     outfile << generateSpecWrapper(wasm);
     outfile.close();
   }
+  if (emitWasm2CWrapper.size() > 0) {
+    std::ofstream outfile;
+    outfile.open(emitWasm2CWrapper, std::ofstream::out);
+    outfile << generateWasm2CWrapper(wasm);
+    outfile.close();
Member

I'm not sure why generating this wrapper makes sense as part of wasm-opt. It seems like a separate function to me.

Member Author

It could be a new tool I suppose, but we have 3 wrapper generators now (spec, js, wasm2c), and it's convenient to run them from wasm-opt so you can emit the wrapper as you generate the fuzz code, in a single invocation.

Member

idk, it seems similar to a lot of other auxiliary functionality in wasm-opt like the fuzzing stuff or the JS wrappers.

Member

Oh I see, there is a precedent here. In that case that's fine for now.

But maybe as a followup this should be a separate tool? So you would wasm-opt and then wasm-generate-fuzz-wrapper -type=wasm2c or something less clunky than that.

Member Author

Yeah, maybe that's better. Another option would be to make all 3 of those be passes.

+  }

   std::string firstOutput;
