-
Notifications
You must be signed in to change notification settings - Fork 694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand binary name section to include type, table, memory, global, and label names. #1064
Expand binary name section to include type, table, memory, global, and label names. #1064
Conversation
BinaryEncoding.md
Outdated
[function index space](Modules.md#function-index-space). This may include both | ||
module-defined or imported functions, but is only meaningful for module-defined | ||
functions. The `name_map` for a given function assigns names to labels within | ||
the function, with labels identified by the offset from the start of the function's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, using raw byte offsets pierces the abstraction level on which the rest of the binary format is defined and makes the mapping a bit less robust. I wonder if there is an alternative. For example, couldn't we just assume an implicit numbering of a function's structured control instructions in order of appearance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered an implicit numbering of the control flow operators first, but the benefit to using a byte offset is that it doesn't require tracking additional state when decoding the instructions to compute some implicit label index for the instruction being decoded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That depends on the implementation. If you have already constructed an AST and thrown away your pointers to the binary then it actually requires keeping more information around, and longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A third option would be to use an instruction index that can hopefully be reused for other debug information.
I'm fine with any of those options, but I think it should be consistent across the various types of debug metadata. #1051 (source maps) and #990 (standard call stack format) both use file offsets (though not necessarily function-relative). Perhaps there's an argument that those should use instruction indices instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dschuff Any thoughts on this (and its consistency with the source maps and call stack formatting PRs)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndrewScheidecker It feels like the purpose of the name section is assigning optional names to 'logical' entities in a module, which ought to work independent of the current representation of that module, be it binary encoding (with possibly different byte sequences encoding the same semantics), text format (where label names would also be used) or some data structure in memory (where the original encoding bytes might not be available anymore, or not yet, or possibly never at all).
As such the name section should probably use something that has meaning independent of any concrete sequence of bytes, either the 'new index space for labels' idea (most consistent with what the name section does for all other kinds of entities) or using an instruction index.
Adding names to arbitrary positions in the instruction sequence is probably a different feature from adding names to control flow labels, though.
I think you're right, there are discussions to be had about whether, in the case of Webassembly, either one or both of stack traces and source maps should rather refer to positions in a concrete file (byte or text offsets, which is what they traditionally do) or to logical instructions in a function instead. I think there's arguments for both approaches. That however seems like a very different thing from adding names to specific entities in the name section, which probably should depend neither on the former two nor on the concrete binary encoding.
e9e5a16
to
956f1c9
Compare
I updated this to index labels sequentially according to the order of the operators that introduce them. |
This generally lgtm. Could you put it on the agenda for the next CG meeting (not tomorrow's since that's too tight an addition, but the next one when it's posted)? Would you also be able to attend and present? |
I will be traveling Dec 11-25, so I probably cannot attend the next video call. I can still add it to the agenda if you'd like. |
@AndrewScheidecker you can put it in a section of "items to discuss at next meeting", noting your preference to be present. That way we can punt the item until you can attend. |
Can we name memory segments too? This would be useful for simplifying the static linking which requires naming of segments and currently does this using its own custom section. |
We don't currently reference memory or table segments by index, but the conditional segment initializer proposal will change that with the addition of the |
We're using twiggy to inspect our generated wasm images (<3 <3 to the rust team for writing it) and the non-code bits are a significant part of our image. This would help us a lot figure out what's taking all the space. |
How should I proceed with this PR? I don't think it's controversial, but is very stale now. FWIW, I have been maintaining a fork of LLD that writes this extended name section, with the exception of the label names (I think getting that information in the linker requires some larger changes). |
@AndrewScheidecker yes, this PR is too old to land as-is. I think it's useful and a reasonable extension of the current name section format. I think the right way to move forward is to make it a new proposal. Currently it would be phase 0, so we'd need to bring it to the CG to move it to phase 1. At that point we can fork the spec repo and make the change to appropriate place in the spec (https://github.com/WebAssembly/spec/blob/master/document/core/appendix/custom.rst) |
I've created a proposal repo here and hope to present it at the CG meeting tomorrow: https://github.com/AndrewScheidecker/wasm-extended-name-section I'll close this PR now, but if anybody would like to provide feedback here on the proposal repo, I'd appreciate it! |
https://github.com/AndrewScheidecker/wasm-extended-name-section says it is archived. What does that mean? |
The repo has moved here: https://github.com/WebAssembly/extended-name-section |
This is meant to resolve #750.