Description
Extracted from #14523.
This is a proposal for a partial reversal of #8262. The reasoning behind that proposal was:
- exported symbols with null bytes may not be representable in the target object format (for example: Any export called @"" is not exported, #12230)
- avoid bugs with third party tools which store zig identifiers as null-terminated strings
- allows for a more efficient representation of identifier names as strings within zig compilers
The problem we face today is the use case of serializing a key-value map into an anonymous struct literal. If any map key contains a null byte or has length zero, a syntax error occurs, making many kinds of maps unrepresentable. This is especially a problem with the introduction of the ZON format. Even JSON has the ability to represent null bytes in map keys. It would be a crime to introduce a new data exchange format and have it be worse than JSON at representing a mere map data structure.
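To make the JSON comparison concrete, here is a small sketch in Python (chosen only for illustration; it is not part of the proposal) showing that a map with an empty key and a key containing a null byte survives a JSON round trip. The key names and values are made up for the example.

```python
import json

# Two keys that cannot be written as field names if the language forbids
# empty identifiers and null bytes: the empty string, and a string with an
# embedded null byte.
original = {"": 1, "a\x00b": 2}

# JSON escapes the null byte as \u0000 inside the key string.
encoded = json.dumps(original)

# Decoding restores both keys exactly.
decoded = json.loads(encoded)
```

After running this, `decoded` compares equal to `original`, and `encoded` contains the escape sequence `\u0000` rather than a raw null byte.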
I will address all three points above now.
exported symbols with null bytes may not be representable in the target object format
We already need an additional layer of validation for target object formats, as noted in the original issue. This validation should be per-target. For example, if COFF rules are different than Mach-O rules, there should be a different set of compile errors for these formats.
Additionally, there could be a future object format, for a not-yet-existing operating system, that allows null bytes in symbols and length zero symbol names. Zig should support such a hypothetical target. Therefore it should not forbid this at the language layer.
avoid bugs with third party tools which store zig identifiers as null-terminated strings
Unfortunately, while addressing this is desirable, it takes a backseat to other concerns mentioned in this proposal.
allows for a more efficient representation of identifier names as strings within zig compilers
If the language allows empty identifier names and null bytes in identifier names, but no supported target object format does, then the Zig compiler can emit a suitable compile error when lowering the AST. Just because the language allows it does not mean compilation will succeed on every target.
This way the zig compiler can still use null-termination in the later stages of compilation, still have correct error reporting, and yet the language specification would not forbid this, allowing ZON to take advantage of this, as well as a hypothetical future object format.
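The per-target validation described above could look roughly like the following sketch. It is written in Python purely for illustration and does not reflect how the Zig compiler is implemented. The function name `check_export_name`, the format names, and the rule set are all hypothetical; real rules would have to come from each object format's specification.

```python
from typing import Optional

# Assumption for illustration: these formats store symbol names as
# null-terminated strings, so they share the same restrictions here.
NULL_TERMINATED_FORMATS = {"elf", "coff", "macho"}


def check_export_name(name: bytes, object_format: str) -> Optional[str]:
    """Return an error message, or None if the name is acceptable.

    The language layer accepts any name; only the chosen object format
    decides whether an exported symbol name is rejected.
    """
    if object_format in NULL_TERMINATED_FORMATS:
        if len(name) == 0:
            return f"{object_format}: exported symbol name must not be empty"
        if b"\x00" in name:
            return f"{object_format}: exported symbol name must not contain null bytes"
    # A format without these restrictions (for example a hypothetical
    # future one) accepts the name unchanged.
    return None
```

With these assumed rules, `check_export_name(b"", "elf")` returns an error message, while the same name under a hypothetical format such as `"future"` returns `None`.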
If this is accepted, implementing it will likely require only slight adjustments to the wording of the relevant compile errors.