-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow character substitution of wasm symbol names #13477
Allow character substitution of wasm symbol names #13477
Conversation
3314918
to
3d9ecef
Compare
Ah, interesting idea... It is a nice approach I had not even considered. The more general alternative, however, is to use response files, |
Getting CMake to notice changes to config files like that is a pain in my experience. I managed to do it before by copying and modifying em_add_tracked_link_flag, but it's not fun at all. I like this PR, as sometimes you only have a small handful of functions. But would a PR to add something like em_add_tracked_link_flag for the ASYNCIFY options also be useful? |
If we think that response files are the correct solution, then I think we should disallow passing symbols directly at all to the function, as we should not be offering broken functionality to users. Though I think response files are an awkward way to solve the issue. If one has only a few functions that need ASYNCIFYing, they may not immediately realize why they would have to use response files, and then they won't do it, and will get cross-platform broken behavior.
Maybe it would help (could a more general option work as well? Or I wonder if recent CMake might have that kind of feature out of the box. Haven't looked into it in a while), but certainly not a solution here, since this issue is not CMake related, but happens already on command line. |
b2ece16
to
f1a28c3
Compare
@curiousdannii good point about noticing changes in build systems, yeah... |
f1a28c3
to
c2a2369
Compare
Long term I'd like to push folks towards using a file on disk here instead.. and supporting a simple "one symbol per line" file format. This seems like a reasonable interim helping hand though. |
@kripken I wonder if the test I added is somehow wrong use of the ASYNCIFY feature? I am getting |
Yes, I think that's an asyncify assertion. When you use the "only" list then only those methods are instrumented. I think adding |
…ound complex string escaping issues in operating system shells and build systems.
c2a2369
to
cc40bdb
Compare
I'd like to recommend the new "one-symbol-per-line" file format as an alternative that avoids the need to any kind of quoting: #14170. |
Allow character substitution of wasm symbol names to avoid working around complex string escaping issues in operating system shells and build systems.
Wasm toolchain compiled functions do not support name mangling. As result, the generated function names can contain difficult to deal with characters
,
&
and,
(and maybe others).When attempting to pass a complex signature to a function, such as
foo(char const*, int&)
to ASYNCIFY_ONLY/ASYNCIFY_ADD/ASYNCIFY_REMOVE, one might as a first try passHowever that has the problem that it passes the cmdline args
emcc
,-s
,ASYNCIFY_ONLY=[foo(char
,const*,
,int&)]
,...
from the terminal (as one would expect is sensible). Then, adding quotes, one getsOn Windows, the command line interpreter consumes the quotes
"
to denote a single cmdline argument element, Emscripten will not see them, but will receiveemcc
,-s
,ASYNCIFY_ONLY=[foo(char const*, int&)]
,...
.However that has the issue with
,
character in the function name, since it is treated as JSON array element separator. Hence emcc suggests to quote the arguments. Hence, one getsHowever that does not work either on Windows, because the
&
character works as a special delimiter character for issuing two commands from the command line. (echo a & echo b
printsa<newline>b
). On Windows&
cannot be escaped with\&
, nor with&&
, but the escape symbol is..."&"
. Hence one finally getswhich actually works, just barely, and I have no idea why or how. Maybe Windows command line interpreter first short-circuit scans for
"&"
->&
without paying attention to any reasonable left-to-right or right-to-left"
for any quote matching rules?.However, when one takes that syntax to any kind of build system (CMake, Ninja, MinGW Makefiles, Tundra, ...), things just spectacularly fall apart again. One needs to at least escape
"
with\"
, then double escape\
with\\
, leading into an utterly unreadable, and even worse, unmaintainable mess. And on top of that, the"&"
escaping no longer seems to work, but Windows sees it again as a two command delimiter.Finally, Linux and macOS have different terminal quoting rules, which leads to needing to start all of this from scratch on those OSes to come up with quoting and escaping rules that pass the terminals there. This can be observed e.g. at https://github.com/gabrielcuvillier/d3wasm/blob/ece7b9869eee5f4927b07efe3cf4146d6a562205/neo/CMakeLists.txt#L240 where two custom quoting rules have been devised for Windows vs Linux. CC @gabrielcuvillier
To avoid all of these issues, this PR bluntly adds single character substitution rules to avoid needing to fight build system and OS shell expansion rules:
.
&
->#
,
->?
This way one can call on command line
No quotes, no escapes, no bs, and guaranteed to work cross platforms. Very unreadable, but still more readable than a sea of
\\\\""
s, and safe and less brittle. Developing custom tooling in build systems for automated single character substitution and unsubstitution will be at least three orders of magnitudes easier than multicharacter backslash-double-quote escape-unescape matching quotes searching substitution.Are there other
-s
settings that take in raw wasm function names without mangling?