diff --git a/spec/formatting.md b/spec/formatting.md
index c14145121..f04897565 100644
--- a/spec/formatting.md
+++ b/spec/formatting.md
@@ -502,7 +502,7 @@ Next, using `res`, resolve the preferential order for all message keys:
1. Let `key` be the `var` key at position `i`.
1. If `key` is not the catch-all key `'*'`:
1. Assert that `key` is a _literal_.
- 1. Let `ks` be the resolved value of `key`.
+ 1. Let `ks` be the resolved value of `key` in Unicode Normalization Form C.
1. Append `ks` as the last element of the list `keys`.
1. Let `rv` be the resolved value at index `i` of `res`.
1. Let `matches` be the result of calling the method MatchSelectorKeys(`rv`, `keys`)
@@ -516,6 +516,9 @@ The returned list MAY be empty.
The most-preferred key is first,
with each successive key appearing in order by decreasing preference.
+The resolved value of each _key_ MUST be in Unicode Normalization Form C ("NFC"),
+even if the _literal_ for the _key_ is not.
+
If calling MatchSelectorKeys encounters any error,
a _Bad Selector_ error is emitted
and an empty list is returned.
diff --git a/spec/syntax.md b/spec/syntax.md
index ea55af8a0..24ea52318 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -444,6 +444,12 @@ A _key_ can be either a _literal_ value or the "catch-all" key `*`.
The **_catch-all key_** is a special key, represented by `*`,
that matches all values for a given _selector_.
+The value of each _key_ MUST be treated as if it were in
+[Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC").
+Two _keys_ are considered equal if they are canonically equivalent strings,
+that is, if they consist of the same sequence of Unicode code points after
+Unicode Normalization Form C has been applied to both.
+
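A minimal sketch of the _key_ equality rule above, assuming Python's standard `unicodedata` module; the helper name `keys_equal` is hypothetical and is not part of the specification:

```python
import unicodedata


def keys_equal(a: str, b: str) -> bool:
    """Compare two key values as if both were in NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE as one code point
decomposed = "e\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

# The raw code point sequences differ, but the keys are canonically equivalent.
assert composed != decomposed
assert keys_equal(composed, decomposed)
```

An implementation could equally normalize each _key_ once when the _message_ is parsed and compare the stored values directly; the text only requires that the comparison behave as if NFC had been applied.
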
## Expressions
An **_expression_** is a part of a _message_ that will be determined
@@ -690,6 +696,20 @@ except for U+0000 NULL or the surrogate code points U+D800 through U+DFFF.
All code points are preserved.
+> [!IMPORTANT]
+> Most text, including that produced by common keyboards and input methods,
+> is already encoded in the canonical form known as
+> [Unicode Normalization Form C](https://unicode.org/reports/tr15) ("NFC").
+> A few languages, legacy character encoding conversions, or operating environments
+> can result in _literal_ values that are not in this form.
+> Some uses of _literals_ in MessageFormat,
+> notably as the value of _keys_,
+> apply NFC to the _literal_ value during processing or comparison.
+> While there is no requirement that the _literal_ value actually be entered
+> in a normalized form,
+> users are cautioned to employ the same character sequences
+> for equivalent values and, whenever possible, ensure _literals_ are in NFC.
+
A **_quoted literal_** begins and ends with U+007C VERTICAL BAR `|`.
The characters `\` and `|` within a _quoted literal_ MUST be
escaped as `\\` and `\|`.
@@ -714,21 +734,6 @@ number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "
### Names and Identifiers
-An **_identifier_** is a character sequence that
-identifies a _function_, _markup_, or _option_.
-Each _identifier_ consists of a _name_ optionally preceeded by
-a _namespace_.
-When present, the _namespace_ is separated from the _name_ by a
-U+003A COLON `:`.
-Built-in _functions_ and their _options_ do not have a _namespace_ identifier.
-
-The _namespace_ `u` (U+0075 LATIN SMALL LETTER U)
-is reserved for future standardization.
-
-_Function_ _identifiers_ are prefixed with `:`.
-_Markup_ _identifiers_ are prefixed with `#` or `/`.
-_Option_ _identifiers_ have no prefix.
-
A **_name_** is a character sequence used in an _identifier_
or as the name for a _variable_
or the value of an _unquoted literal_.
@@ -740,6 +745,20 @@ when matching _name_ or _identifier_ strings or _unquoted literal_ values.
_Variable_ _names_ are prefixed with `$`.
+Two _names_ are considered equal if they are canonically equivalent strings,
+that is, if they consist of the same sequence of Unicode code points after
+[Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC")
+has been applied to both.
+
+> [!NOTE]
+> Implementations are not required to normalize all _names_.
+> Comparisons of _name_ values only need be done "as-if" normalization
+> has occurred.
+> Since most text in the wild is already in NFC
+> and since checking for NFC is fast and efficient,
+> implementations can often substitute checking for actually applying normalization
+> to _name_ values.
+
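The "as-if" approach in the note above might be sketched as follows, assuming Python 3.8 or later (for `unicodedata.is_normalized`); the helpers `to_nfc` and `names_equal` are hypothetical names used only for illustration:

```python
import unicodedata


def to_nfc(value: str) -> str:
    """Return value in NFC, normalizing only when the quick check fails."""
    if unicodedata.is_normalized("NFC", value):
        return value  # already NFC: no new string is built
    return unicodedata.normalize("NFC", value)


def names_equal(a: str, b: str) -> bool:
    """Compare two names as if NFC had been applied to both."""
    return to_nfc(a) == to_nfc(b)


# "café" written with a precomposed é and with "e" + U+0301.
assert names_equal("caf\u00e9", "cafe\u0301")
assert not names_equal("cafe", "caf\u00e9")
```

Because most input is already in NFC, the check usually succeeds and the cost of building a normalized copy is paid only for the rare _name_ that needs it.
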
Valid content for _names_ is based on Namespaces in XML 1.0's
[NCName](https://www.w3.org/TR/xml-names/#NT-NCName).
This is different from XML's [Name](https://www.w3.org/TR/xml/#NT-Name)
@@ -751,6 +770,21 @@ Otherwise, the set of characters allowed in a _name_ is large.
> Such variables cannot be referenced in a _message_,
> but are not otherwise errors.
+An **_identifier_** is a character sequence that
+identifies a _function_, _markup_, or _option_.
+Each _identifier_ consists of a _name_ optionally preceded by
+a _namespace_.
+When present, the _namespace_ is separated from the _name_ by a
+U+003A COLON `:`.
+Built-in _functions_ and their _options_ do not have a _namespace_ identifier.
+
+The _namespace_ `u` (U+0075 LATIN SMALL LETTER U)
+is reserved for future standardization.
+
+_Function_ _identifiers_ are prefixed with `:`.
+_Markup_ _identifiers_ are prefixed with `#` or `/`.
+_Option_ _identifiers_ have no prefix.
+
Examples:
> A variable:
>```