Editorial: Modernise spec to use structured headers and correct number representations #822

anba · 2023-08-09T16:13:25Z

No description provided.

spec/segmenter.html

ptomato · 2023-08-09T16:22:12Z

Not a review, but I just wanted to say a heartfelt thank you for doing this ❤️

spec/negotiation.html

spec/numberformat.html

ryzokuken

Looks great overall, thanks a lot for this! I made a few comments but nothing too serious.

spec/displaynames.html

spec/listformat.html

spec/locale-sensitive-functions.html

spec/pluralrules.html

spec/numberformat.html

spec/negotiation.html

spec/locales-currencies-tz.html

See also <tc39/ecma402#822>.

ben-allen

One incredibly small nit -- there's a missing word ("according pattern" instead of "according to pattern") in FormatDateTimePattern.

spec/segmenter.html

spec/datetimeformat.html

ben-allen · 2023-09-13T08:31:32Z

@gibson042 Any objections to merging this one?

gibson042

This is magnificent! I have a number of suggestions, almost all of which cluster into integer vs. integral Number and how to reference tabular data.

spec/datetimeformat.html

spec/numberformat.html

spec/pluralrules.html

ben-allen · 2023-09-23T01:05:01Z

@gibson042 Do you think this one is mergeable as it stands?

gibson042 · 2023-09-24T06:40:28Z

Yes, but there would be substantial followup churn.

Add new `GetLocale{Language,Script,Region,Variants}` operations to retrieve the corresponding subtags from a locale tag. These new operations are used in `ApplyOptionsToTag`, `IsStructurallyValidLanguageTag`, and the `Intl.Locale.prototype` accessor functions.

- GetLocaleLanguage: Returns the longest prefix matching `unicode_language_subtag`. The previous definitions could be misinterpreted to match variant subtags that are longer than the language subtag. For example, in "en-basiceng" the longest substring matching `unicode_language_subtag` is "basiceng".
- GetLocaleScript: The previous definition from `Intl.Locale.prototype.script` is reused.
- GetLocaleRegion: Instead of using the previous definition from `Intl.Locale.prototype.region`, it was rewritten to match the definition of `GetLocaleScript` a bit more closely. To avoid confusing language and region subtags, the leading language subtag is removed before searching for `unicode_region_subtag`.
- GetLocaleVariants: Uses the suggestion from code review in tc39#822. The leading "-" character is removed for consistency with the other three new operations.

`get Intl.Locale.prototype.{language,script,region}` now simply call the new abstract operations to retrieve the subtags.

`ApplyOptionsToTag` uses the new operations to retrieve the subtags from the original language tag when the corresponding option is absent. The updated `languageId` is now constructed manually through string concatenation instead of subtag matching.

`IsStructurallyValidLanguageTag` now calls `GetLocaleVariants` to retrieve the variant subtags. The variable `lang` was renamed to `languageId` for consistency with the rest of the spec and because `lang` can more easily be misread as standing for "language".

`CanonicalizeUnicodeLocaleId` was changed to fix the incorrect redeclaration warning for `extension` from ecmarkup:

- Instead of using yet another way to retrieve the Unicode extension sequence, simply use the existing term "Unicode locale extension sequence". (The existing term already makes sure that substrings in private-use subtags are ignored, so we don't have to worry about `pu_extensions`.)
- "Unicode locale extension sequences" include the leading "-" character, so `newExtension` actually needs to be initialised with "-u".
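
The subtag extraction described in this commit message can be sketched roughly as follows. This is only an illustration, not the spec text: the function names merely echo the new abstract operations, `buildLanguageId` is a hypothetical stand-in for the string concatenation in `ApplyOptionsToTag`, and the sketch assumes the input is already a structurally valid tag of the form language, optional script, optional region, then variants.

```typescript
// Illustrative sketch only; an assumed reading of the commit message, not spec text.
// Assumes `tag` is a structurally valid Unicode locale identifier.

const LANGUAGE = /^(?:[a-z]{2,3}|[a-z]{5,8})$/i;          // unicode_language_subtag
const SCRIPT = /^[a-z]{4}$/i;                             // unicode_script_subtag
const REGION = /^(?:[a-z]{2}|[0-9]{3})$/i;                // unicode_region_subtag
const VARIANT = /^(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})$/i;  // unicode_variant_subtag

function getLocaleLanguage(tag: string): string {
  // Only the leading subtag is considered, so a later variant such as
  // "basiceng" can never be mistaken for the language.
  const language = tag.split("-")[0];
  if (!LANGUAGE.test(language)) {
    throw new RangeError(`no language subtag in "${tag}"`);
  }
  return language;
}

// Hypothetical helper: all subtags after the leading language subtag.
function subtagsAfterLanguage(tag: string): string[] {
  return tag.split("-").slice(1);
}

function getLocaleScript(tag: string): string | undefined {
  const [first] = subtagsAfterLanguage(tag);
  return first !== undefined && SCRIPT.test(first) ? first : undefined;
}

function getLocaleRegion(tag: string): string | undefined {
  // The language subtag is dropped first so that a two-letter language
  // is not confused with a two-letter region.
  const rest = subtagsAfterLanguage(tag);
  if (rest.length > 0 && SCRIPT.test(rest[0])) rest.shift();
  const [first] = rest;
  return first !== undefined && REGION.test(first) ? first : undefined;
}

function getLocaleVariants(tag: string): string | undefined {
  const rest = subtagsAfterLanguage(tag);
  if (rest.length > 0 && SCRIPT.test(rest[0])) rest.shift();
  if (rest.length > 0 && REGION.test(rest[0])) rest.shift();
  const variants: string[] = [];
  for (const subtag of rest) {
    if (!VARIANT.test(subtag)) break; // stops at extension singletons such as "u"
    variants.push(subtag);
  }
  // Assumption: variants are returned as one "-"-joined string without a leading "-".
  return variants.length > 0 ? variants.join("-") : undefined;
}

// Hypothetical: rebuild the language id by concatenation, taking each subtag
// from `options` when present and from the original tag otherwise.
function buildLanguageId(
  tag: string,
  options: { language?: string; script?: string; region?: string } = {},
): string {
  const parts = [
    options.language ?? getLocaleLanguage(tag),
    options.script ?? getLocaleScript(tag),
    options.region ?? getLocaleRegion(tag),
    getLocaleVariants(tag),
  ];
  return parts.filter((p): p is string => p !== undefined).join("-");
}
```

With this sketch, `getLocaleLanguage("en-basiceng")` yields "en" and `getLocaleVariants("en-basiceng")` yields "basiceng", which is the distinction the commit message calls out.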

anba · 2023-09-25T15:06:26Z

Fixed all review suggestions (except the suggestion to use Number values in internal slots).

I've also added three commits:

- 06ce846, which updates to the newest ecma262-biblio.
- 9a24c80, which updates to the newest ecmarkup. Updating ecmarkup to version 18 required three additional changes:
  1. The --css-out and --js-out options were removed from the build options; --assets-dir has to be used instead. (--assets-dir seems to default to the directory of the output file, so we could also simply not specify it.)
  2. Spec enums are now required to be in kebab-case. For example, ~more-precision~ instead of ~morePrecision~.
  3. "ecmarkup.css" and "ecmarkup.js" are now included automatically; explicitly including them in "spec/index.html" triggers a build error.
- 87cb0a7, which adds new abstract operations to extract language, script, and region subtags from locale tags. The new operations are used to address the previous review comments about working around incorrect ecmarkup redeclaration errors.
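
For the kebab-case requirement on spec enums mentioned above, the renaming amounts to a simple mechanical transformation. The helper below is a hypothetical sketch (`toKebabCase` is not part of ecmarkup) and assumes the old names are lower camelCase ASCII identifiers.

```typescript
// Hypothetical helper, not an ecmarkup API: converts a lower camelCase
// enum name into the kebab-case form that ecmarkup 18 requires.
function toKebabCase(name: string): string {
  // Each uppercase ASCII letter becomes "-" followed by its lowercase form.
  return name.replace(/[A-Z]/g, (ch) => "-" + ch.toLowerCase());
}

console.log(toKebabCase("morePrecision")); // prints "more-precision"
```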

(Note: Updating ecmarkup also leads to using the new font.)

gibson042

A few more suggestions, but this looks good to me. Thanks @anba!

spec/locale.html

spec/numberformat.html

anba · 2023-09-27T15:15:55Z

> A few more suggestions, but this looks good to me.

Thanks, all fixed now.

spec/locales-currencies-tz.html

…tl.PluralRules

…l slot

… references

…neIdentifier

anba · 2023-10-04T15:47:31Z

Updated again to resolve merge conflicts with latest changes to the main branch.

ben-allen · 2023-10-12T08:27:33Z

Thank you again so much for this!

…r representations (tc39#822) Sweeping changes across the entire spec to update ECMA-402 to use structured headers as used in ECMA-262. Also contains related refactoring, plus updates to represent numbers correctly and consistently.

…ct number representations (tc39#822)" This reverts commit 5afbb89. We reverted it to reuse the chunked commits anba initially posted.
