[feature] Roundtrip names with strange symbols in text format #617
Comments
How about allowing quoted names:
Good idea. Seems like a simple enough change, and matches what the text format already does for quoted strings. What do you think, @rossberg?
Identifiers are semantically relevant in the text format. Allowing random
strings would thus have undesirable implications. In particular, it would
pull Unicode into a central piece of the text semantics and get us into
the business of defining the right equivalence on arbitrary Unicode strings
or their (possibly malformed?) encodings. I'd rather not go there, IME it's
a rabbit hole.
Just for round-tripping the easier solution IMO would be the annotation
mechanism we discussed earlier. We could easily allow annotations of the
form (@name "...") on binders that you can fall back to. Unlike
identifiers, their role is limited to mapping the name section, so they
don't interfere with semantics. WDYT?
It does make sense to apply the same well-formed UTF-8 constraint as the import/export strings, but why would it be necessary to define equivalence as anything other than byte-wise comparison? If we allow imports/exports to be distinguished by equivalent UTF-8 strings, why not these names?
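To make the equivalence question concrete, here is a minimal Python sketch. The helper name `same_name_bytewise` is hypothetical and not taken from any Wasm tool; it only illustrates what byte-wise comparison of UTF-8 names means, and how it differs from a normalization-based equivalence.

```python
import unicodedata


def same_name_bytewise(a: str, b: str) -> bool:
    """Compare two names by their UTF-8 encodings only (hypothetical helper)."""
    return a.encode("utf-8") == b.encode("utf-8")


composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Byte-wise, these are two distinct names even though they render alike.
print(same_name_bytewise(composed, decomposed))

# A normalization-based equivalence would treat them as the same name.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Under byte-wise comparison the two spellings stay distinct, so no Unicode equivalence has to be defined; whether that is acceptable for identifiers is exactly what is being debated here.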
I agree w/ @AndrewScheidecker that this seems to be a similar situation to import/export names. That said, I also think that if we have the general mechanism for custom section annotations, that would work fine too. That seems like it requires more design work than extending the syntax for identifiers though.
@AndrewScheidecker, fair enough, but we would still introduce the situation where there are many different ways to spell the same identifier, e.g., using Unicode escapes, raw UTF-8 hex escapes, quotes vs no quotes, etc., which is undesirable IMO. Unlike import/export names, which are string labels for external interaction so that they have to be language-agnostic and universal (and don't have any meaning inside Wasm itself), free-form quoting is not something typically found for internal identifiers. I can see the temptation to view symbolic identifiers as a reflection of the name section, but that wasn't the intended purpose.
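The "many spellings" concern can be illustrated with a small Python sketch that decodes the body of a text-format string literal using the existing string escapes (`\t`, `\n`, `\r`, `\"`, `\'`, `\\`, two-digit hex bytes, and `\u{...}`). The function name `decode_wat_string` is an assumption made for illustration; this is not code from any existing tool, and it skips most validation.

```python
import string


def decode_wat_string(body: str) -> bytes:
    """Decode the contents of a WAT string literal, without the surrounding quotes."""
    simple = {"t": b"\t", "n": b"\n", "r": b"\r", '"': b'"', "'": b"'", "\\": b"\\"}
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            # Ordinary character: contributes its UTF-8 encoding.
            out += ch.encode("utf-8")
            i += 1
            continue
        nxt = body[i + 1 : i + 2]
        if nxt in simple:
            out += simple[nxt]
            i += 2
        elif nxt == "u" and body[i + 2 : i + 3] == "{":
            # \u{hex}: a Unicode code point, emitted as UTF-8.
            end = body.index("}", i + 3)
            out += chr(int(body[i + 3 : end], 16)).encode("utf-8")
            i = end + 1
        else:
            # \hh: one raw byte written as two hex digits.
            hex2 = body[i + 1 : i + 3]
            if len(hex2) != 2 or any(c not in string.hexdigits for c in hex2):
                raise ValueError(f"bad escape at offset {i}")
            out.append(int(hex2, 16))
            i += 3
    return bytes(out)


# Three spellings of the same name: raw UTF-8, a Unicode escape, raw hex bytes.
spellings = ["caf\u00e9", "caf\\u{e9}", "caf\\c3\\a9"]
print({decode_wat_string(s) for s in spellings})
```

All three spellings decode to the same byte sequence, so if quoted identifiers were compared after decoding, each of them would name the same binding.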
I think it's acceptable if the same identifier can be written multiple ways: e.g.
We want to disassemble names from languages with arbitrary syntax, and produce valid WAT syntax. The simplest way to do that is to allow arbitrary strings in WAT identifier syntax. The annotation proposal tries to avoid the issue by adding a name annotation that takes an arbitrary string, but as I mentioned here, that doesn't replace a good WAT identifier that can be used as an argument of
In addition to normal identifiers, support parsing identifiers of the format `$"..."`. This format is not yet allowed by the standard, but it is a popular proposed extension (see WebAssembly/spec#617 and WebAssembly/annotations#21). Binaryen has historically allowed a similar format and has supported arbitrary non-standard identifier characters, so it's much easier to support this extended syntax than to fix everything to use the restricted standard syntax.
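For the printing side that pairs with this parsing support, a minimal Python sketch of how a disassembler might choose between the plain `$name` form and the `$"..."` form is shown here. The function name `format_identifier` and the choice of which characters to escape are assumptions for illustration; this is not Binaryen's implementation.

```python
import string

# Characters allowed in a plain text-format identifier (the spec's idchar set).
IDCHARS = set(string.ascii_letters + string.digits + "!#$%&'*+-./:<=>?@\\^_`|~")


def format_identifier(name: str) -> str:
    """Render a name-section name as a text-format identifier (hypothetical helper)."""
    if name and all(c in IDCHARS for c in name):
        # Every character is a plain idchar, so no quoting is needed.
        return "$" + name
    parts = []
    for c in name:
        if c == '"':
            parts.append('\\"')
        elif c == "\\":
            parts.append("\\\\")
        elif ord(c) < 0x20 or ord(c) == 0x7F:
            # Control characters cannot appear raw inside a string literal.
            parts.append("\\u{%x}" % ord(c))
        else:
            parts.append(c)
    # Note: an empty name also lands here; whether `$""` should be accepted
    # is not addressed by this sketch.
    return '$"' + "".join(parts) + '"'


print(format_identifier("foo"))
print(format_identifier("operator new(unsigned long)"))
```

Names that already fit the standard identifier syntax keep their familiar spelling, and only the others fall back to the quoted form, which limits how often the extended syntax shows up in output.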
This is now supported with string-style identifiers; closing.
See WebAssembly/wabt#685 (comment). We currently generate a name section using the name provided like `$foo`. This doesn't work for all names that are allowed by the binary format. Should we have a way to represent these names in the text format?