-
Notifications
You must be signed in to change notification settings - Fork 753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fuzzing] wasm2c integration #2772
Conversation
std::ofstream outfile; | ||
outfile.open(emitWasm2CWrapper, std::ofstream::out); | ||
outfile << generateWasm2CWrapper(wasm); | ||
outfile.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why generating this wrapper makes sense as part of wasm-opt. It seems like a separate function to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be a new tool I suppose, but we have 3 wrapper generators now (spec, js, wasm2c), and it's convenient to run them from wasm-opt so you can emit the wrapper as you generate the fuzz code, in a single invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk, it seems similar to a lot of other auxiliary functionality in wasm-opt like the fuzzing stuff or the JS wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, there is a precedent here. In that case that fine for now.
But maybe as a followup this should be a separate tool? So you would wasm-opt
and then wasm-generate-fuzz-wrapper -type=wasm2c
or something less clunky than that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, maybe that's better. Another option would be to make all 3 of those be passes.
scripts/fuzz_opt.py
Outdated
# No legalization for JS means we can't compare JS to others, as any | ||
# illegal export will fail immediately. | ||
vm = self.vms[i] | ||
if vm.can_compare_to_others() and results[i] is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems simpler to me to keep track of the used vms by making each element of results
a tuple of a vm and its fixed output) rather than inserting holes into results
. I think a list comprehension for this would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to use tuples, but am not quite sure where you wanted a list comprehension?
std::ofstream outfile; | ||
outfile.open(emitWasm2CWrapper, std::ofstream::out); | ||
outfile << generateWasm2CWrapper(wasm); | ||
outfile.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk, it seems similar to a lot of other auxiliary functionality in wasm-opt like the fuzzing stuff or the JS wrappers.
src/tools/wasm2c-wrapper.hpp
Outdated
|
||
)"; | ||
|
||
for (auto& exp : wasm.exports) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also having trouble understanding what this loop is adding and why. More comments would be very welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that, I added a bunch more comments now.
Review comments addressed, but the clang-tidy stuff is an increasing rabbithole... the one file i added is now leading it to find interesting problems in many other files that existed before, but I guess it never scanned them until this PR. See for example the last commit for the stuff I need to do here, and probably a lot left: c32f148 What's the best thing to do here? not sure if I'm doing the right thing in trying to fix these... cc @aheejin |
Are we enforcing clang-tidy now? Before it was just clang-format. |
I think the intention was that were enforcing it, at least we were running that script. I did fix a couple possible bug in the script leading up to the github actions transition. @aheejin certainly believed it to be previously working. Do we not want to enforce it? Does it generate false positives? |
Yeah, that's basically it. It's easy to do and gives another source of C programs for fuzzing. And we have good tools around this for reduction on the wasm, options like no OOB and no NaNs, etc., already working on wasm. But also, this helps fuzz wasm2c itself. I think that has some cool use cases and it's nice to know it's been fuzzed before recommending it broadly I think. |
I can take a deeper look and run clang-tidy locally to see what's happening, but I think our current .clang-tidy effectively only enables Also, if code is in a header file, it is checked only when it is included in a source file IIRC. |
Sorry, I found that clang-tidy problem, I'd forgotten that it needs to compilation database in order to be useful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
The code assumes you build wabt with a build dir in the wabt root. So that if it finds wasm2c at |
My build dir is in the wabt root, but after I build I do |
Oh, yes, we need those runtime support files to build with. We can't copy them into binaryen here because they need to match what wabt emits. If there's an install step, perhaps it should install those then? |
Yeah, I guess it should... maybe to |
This adds support for fuzzing with wabt's wasm2c that @binji wrote.
Basically we compile the wasm to C, then compile the C to a native
executable with a custom
main()
to wrap around it. The executableshould then print exactly the same as that wasm when run in either
the binaryen interpreter or in a JS VM with our wrapper JS for that
wasm. In other words, compiling the wasm to C is another way to
run that wasm.
I ran this for many hours and could not find any bugs in wasm2c.
Nice work there!
The main reasons I want this are to fuzz wasm2c itself, and to
have another option for fuzzing emcc. For the latter, we do fuzz
wasm-opt quite a lot, but that doesn't fuzz the non-wasm-opt
parts of emcc. And using wasm2c for that is nice since the
starting point is always a wasm file, which means we
can use tools like
wasm-reduce
and so forth, which can beintegrated with this fuzzer.
This also:
Refactors the fuzzer harness a little to make it easier to
add more "VMs" to run wasms in.
Do not autoreduce when re-running a testcase, which I hit
while developing this.