The spec just says "the obvious" but assuming that the canonical form is a double-quoted string (which is not immediately obvious) and that all printable non-newline characters should be printed unescaped, what is the canonical form of a string that contains:
Newlines
C0 control codes
C1 control codes
Surrogate codes
All of these have multiple valid representations inside an nb-double-one-line.
The canonical form of a scalar node with tag tag:yaml.org,2002:str is the same as the formatted content of the node.
Whether a scalar is double-quoted, single-quoted, plain, or in block form is a presentational detail, as is escaping. For instance, the scalars 'foo' and "fo\x6f" have the same formatted content. Those two scalars are perfectly interchangeable for all purposes, regardless of the tag. An implementation may present scalars in whatever style and with whatever escaping it chooses.
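To make the interchangeability concrete, here is a minimal Python sketch. It is not a YAML parser: `double_quoted_content` and `single_quoted_content` are hypothetical helpers that decode only a small subset of escapes (`\xNN`, `\uNNNN`, `\UNNNNNNNN`, `\n`, `\t`, `\0`, `\\`, `\"`) and ignore line folding and surrogate pairs. It only illustrates that different presentations reduce to the same content.

```python
import re

# Hypothetical, simplified helpers; a real YAML loader handles many more cases.
_SIMPLE = {"n": "\n", "t": "\t", "0": "\0", "\\": "\\", '"': '"'}
_ESCAPE = re.compile(
    r'\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|[nt0\\"])'
)

def _decode(match):
    body = match.group(1)
    # Numeric escapes carry hex digits after the x/u/U marker.
    if body[0] in "xuU":
        return chr(int(body[1:], 16))
    return _SIMPLE[body]

def double_quoted_content(text):
    """Content of a double-quoted scalar, given the text between the quotes."""
    return _ESCAPE.sub(_decode, text)

def single_quoted_content(text):
    """Content of a single-quoted scalar; the only escape is a doubled quote."""
    return text.replace("''", "'")

# 'foo' and "fo\x6f" have the same content.
assert single_quoted_content("foo") == double_quoted_content(r"fo\x6f") == "foo"
```

Because both presentations produce the same sequence of characters, nothing downstream of the parser can tell which one appeared in the source.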
Note that scalar content cannot include surrogates. The content of a scalar is a sequence of zero or more Unicode characters. Surrogates are not Unicode characters: a high or low surrogate code point, taken on its own, does not correspond to any character. C0 and C1 control codes (including line feeds and carriage returns) are Unicode characters.
(Notwithstanding the above, a single astral character in a double-quoted scalar may be represented by two escape sequences specifying the code points of a surrogate pair. For instance, the scalar '𝄞' may also be presented as "\ud834\udd1e". In either case, the formatted content of the scalar is the single character 𝄞 (U+1D11E). This feature exists purely for JSON compatibility, because the same scalar could also be presented as "\U0001d11e", which avoids surrogates entirely.)
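The surrogate arithmetic can be checked with a short Python sketch. The function names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical, chosen for illustration. The last check uses Python's standard `json` module (a JSON decoder, not a YAML one) to show that the two-escape form decodes to a single character.

```python
import json

def to_surrogate_pair(code_point):
    """Split an astral code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not an astral code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def from_surrogate_pair(high, low):
    """Combine a high and a low surrogate back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

clef = "\U0001d11e"  # MUSICAL SYMBOL G CLEF
high, low = to_surrogate_pair(ord(clef))
assert (high, low) == (0xD834, 0xDD1E)
assert chr(from_surrogate_pair(high, low)) == clef

# The JSON-style pair of escapes decodes to the same single character.
assert json.loads('"\\ud834\\udd1e"') == clef
assert len(clef) == 1
```

The pair is only an encoding of one code point, so the decoded content never contains a surrogate.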