-
Notifications
You must be signed in to change notification settings - Fork 758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reintroduce wasm-merge #5709
Reintroduce wasm-merge #5709
Conversation
The one maybe odd part there is the mixture of positional and non-positional items. Making one depend on the other is not something I'm familiar with in commandline tools. Or is there precedent? |
Another option for the commandline UI might be to forgo positional arguments entirely, as they are potentially confusing when mixed with others. We could have this:
Basically we could have |
One similar use of |
Code and tests LGTM. I believe the only remaining questions are 1) the CLI UI for naming modules, and 2) what options to provide and what to do by default for conflicting export names. |
Very good point... I guess there is precedent for such stuff. I don't really have a strong feeling of what's best for the CLI UI. Perhaps we can bikeshed this on the Monday meeting if there's time. For handling of export conflicts, do you agree we should have 3 modes like I was getting at before?
If so perhaps 2 should be the default, as the least surprising, as you've been suggesting? |
Yes! Sounds good.
That sounds great to me. |
Ok, 3 modes + new default are now implemented. |
src/tools/wasm-merge.cpp
Outdated
// this is useful when the first module is the main program and the others are | ||
// libraries of code that it uses, but that do not have any exports intended | ||
// to be used by anyone other than the main program. | ||
KeepOnlyFirstModuleExports, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think it would be more useful for this third mode to collect all the exports and use module order only to resolve conflicts. That would still expose any non-conflicting exports from later modules. The reason this is more useful is that exporting more things than the embedder expects does not cause linkage failures, so this would cover both the use cases covered by KeepOnlyFirstModuleExports
as well as use cases that need the other exports to be exposed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't catch this earlier when you were describing the 3 modes you had in mind!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. That would allow more things to work. But it would also add more effort for the user in the common case this is meant to address, which is a main module + side modules, since we never want the side module's exports to remain. But maybe that's just more work we should expect the user to do, hmm... they also likely would want to prune some main module exports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more thought this does seem better. Changed to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🎉 🎉 🎉
Co-authored-by: Thomas Lively <tlively@google.com>
We used to have a wasm-merge tool but removed it for a lack of use cases. Recently use cases have been showing up in the wasm GC space and elsewhere, as people are using more diverse toolchains together, for example a project might build some C++ code alongside some wasm GC code. Merging those wasm files together can allow for nice optimizations like inlining and better DCE etc., so it makes sense to have a tool for merging. Background: * Removal: WebAssembly#1969 * Requests: * wasm-merge - why it has been deleted WebAssembly#2174 * Compiling and linking wat files WebAssembly#2276 * wasm-link? WebAssembly#2767 This PR is a compete rewrite of wasm-merge, not a restoration of the original codebase. The original code was quite messy (my fault), and also, since then we've added multi-memory and multi-table which makes things a lot simpler. The linking semantics are as described in the "wasm-link" issue WebAssembly#2767 : all we do is merge normal wasm files together and connect imports and export. That is, we have a graph of modules and their names, and each import to a module name can be resolved to that module. Basically, like a JS bundler would do for JS, or, in other words, we do the same operations as JS code would do to glue wasm modules together at runtime, but at compile time. See the README update in this PR for a concrete example. There are no plans to do more than that simple bundling, so this should not really overlap with wasm-ld's use cases. This should be fairly fast as it works in linear time on the total input code. However, it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each module. An advantage to working on Binaryen IR is that we can easily do some global DCE after merging, and further optimizations are possible later.
We used to have a
wasm-merge
tool but removed it for a lack of use cases. Recentlyuse cases have been showing up in the wasm GC space and elsewhere, as people are
using more diverse toolchains together, for example a project might build some C++
code alongside some wasm GC code. Merging those wasm files together can allow
for nice optimizations like inlining and better DCE etc., so it makes sense to have a
tool for merging.
Removal: #1969
Requests:
This PR is a compete rewrite of wasm-merge, not a restoration of the original
codebase. The original code was quite messy (my fault), and also, since then
we've added multi-memory and multi-table which makes things a lot simpler.
The linking semantics are as described in the "wasm-link" issue #2767 : all we do
is merge normal wasm files together and connect imports and export. That is, we
have a graph of modules and their names, and each import to a module name can
be resolved to that module. Basically, like a JS bundler would do for JS, or, in other
words, we do the same operations as JS code would do to glue wasm modules
together at runtime, but at compile time. See the README update in this PR for a
concrete example.
There are no plans to do more than that simple bundling, so this should not
really overlap with
wasm-ld
's use cases.This should be fairly fast as it works in linear time on the total input code. However,
it won't be as fast as
wasm-ld
, of course, as it does build Binaryen IR for eachmodule. An advantage to working on Binaryen IR is we could optimize right after
merging, but in this first version no optimizations are done yet.
Perhaps this new tool should be given a new name, to differentiate it from the
old
wasm-merge
? Other name options might bewasm-link
orwasm-bundle
perhaps.