[RFC] Remove "normalized to NFKC" clause from the reference manual, section 3.1 #12388

@omasanori

From The Rust Reference Manual:

Rust input is interpreted as a sequence of Unicode codepoints encoded in UTF-8, normalized to Unicode normalization form NFKC.

However, NFKC requires transforming some characters into different ones, even inside strings and comments, so in those cases we get results that differ from what was written. Even NFC has some problems if we have to preserve text exactly.
(Yes, the word "different" is ambiguous: under NFKC the characters are treated as the same, but their glyphs differ, although this sometimes depends on the font.)
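
To make the concern concrete, here is a small sketch of what normalization does to text that could appear inside a string literal or comment. It uses Python only because its standard library ships `unicodedata.normalize`. The sample characters and the `nfkc` helper are illustrative choices, not taken from the Rust manual.

```python
import unicodedata

# Sample characters picked for illustration (an assumption, not from the manual).
samples = {
    "ligature fi": "\ufb01",            # U+FB01 LATIN SMALL LIGATURE FI
    "circled digit one": "\u2460",      # U+2460 CIRCLED DIGIT ONE
    "halfwidth katakana a": "\uff71",   # U+FF71 HALFWIDTH KATAKANA LETTER A
    "superscript two": "\u00b2",        # U+00B2 SUPERSCRIPT TWO
}


def nfkc(text: str) -> str:
    """Hypothetical helper: return the NFKC form of `text`."""
    return unicodedata.normalize("NFKC", text)


for name, original in samples.items():
    normalized = nfkc(original)
    # Every sample here is replaced by a different code point sequence.
    print(f"{name}: {original!r} -> {normalized!r}")

# NFC is milder, but still rewrites some characters, for example
# U+212B ANGSTROM SIGN becomes U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE.
angstrom = "\u212b"
print(unicodedata.normalize("NFC", angstrom) == "\u00c5")
```

If source text were normalized before lexing, a string literal containing any of these characters would silently hold the replaced form instead.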

I suggest removing the "normalized to NFKC" clause and leaving the input as it is, as Go does. From The Go Programming Language Specification:

The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points.
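
The distinction described in that quote can be checked directly. The following is a minimal sketch in Python, not Go or Rust code, and the variable names are hypothetical. It compares the precomposed and the combining form of "é" without any normalization, then shows that NFC would merge them.

```python
import unicodedata

precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT (two code points)

# Without canonicalization the two spellings are different strings.
distinct_as_written = precomposed != decomposed
print(distinct_as_written)
print(len(precomposed), len(decomposed))

# Under NFC the combining sequence is composed into the single code point,
# so the difference the author typed would be lost.
equal_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
print(equal_after_nfc)
```

Leaving the input untouched keeps the two spellings distinct, which is the behavior the proposal asks for.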
