Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Modernise spec to use structured headers and correct number representations #822

Merged
merged 23 commits into from
Oct 12, 2023

Conversation

anba
Copy link
Contributor

@anba anba commented Aug 9, 2023

No description provided.

@ptomato
Copy link
Contributor

ptomato commented Aug 9, 2023

Not a review, but I just wanted to say a heartfelt thank you for doing this ❤️

spec/numberformat.html Outdated Show resolved Hide resolved
Copy link
Member

@ryzokuken ryzokuken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall, thanks a lot for this! I made a few comments but nothing too serious.

spec/displaynames.html Show resolved Hide resolved
spec/listformat.html Show resolved Hide resolved
spec/listformat.html Show resolved Hide resolved
spec/listformat.html Show resolved Hide resolved
spec/locale-sensitive-functions.html Show resolved Hide resolved
spec/pluralrules.html Show resolved Hide resolved
spec/numberformat.html Show resolved Hide resolved
spec/negotiation.html Show resolved Hide resolved
spec/locales-currencies-tz.html Outdated Show resolved Hide resolved
spec/locales-currencies-tz.html Show resolved Hide resolved
Copy link
Contributor

@ben-allen ben-allen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One incredibly small nit -- there's a missing word ("according pattern" instead of "according to pattern") in FormatDateTimePattern

spec/segmenter.html Show resolved Hide resolved
spec/segmenter.html Show resolved Hide resolved
spec/datetimeformat.html Show resolved Hide resolved
@ben-allen
Copy link
Contributor

@gibson042 Any objections to merging this one?

Copy link
Contributor

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is magnificent! I have a number of suggestions, almost all of which cluster into integer vs. integral Number and how to reference tabular data.

spec/datetimeformat.html Outdated Show resolved Hide resolved
spec/datetimeformat.html Show resolved Hide resolved
spec/datetimeformat.html Show resolved Hide resolved
spec/datetimeformat.html Show resolved Hide resolved
spec/datetimeformat.html Outdated Show resolved Hide resolved
spec/numberformat.html Outdated Show resolved Hide resolved
spec/numberformat.html Outdated Show resolved Hide resolved
spec/pluralrules.html Outdated Show resolved Hide resolved
spec/pluralrules.html Show resolved Hide resolved
spec/pluralrules.html Outdated Show resolved Hide resolved
@ben-allen
Copy link
Contributor

@gibson042 Do you think this one is mergeable as stands?

@gibson042
Copy link
Contributor

Yes, but there would be substantial followup churn.

anba added a commit to anba/ecma402 that referenced this pull request Sep 25, 2023
Add new `GetLocale{Language,Script,Region,Variants}` operations to
retrieve the corresponding subtags from a locale tag.

These new operations are used in `ApplyOptionsToTag`,
`IsStructurallyValidLanguageTag`, and the `Intl.Locale.prototype`
accessor functions.

GetLocaleLanguage:
Returns the longest prefix matching `unicode_language_subtag`. The
previous definitions could be misinterpreted to match variant subtags
whose length is larger than the language subtag. For example in "en-basiceng"`
the longest substring matching `unicode_language_subtag` is "basiceng".

GetLocaleScript:
The previous definition from `Intl.Locale.prototype.script` is reused.

GetLocaleRegion:
Instead of using the previous definition from
`Intl.Locale.prototype.region`, it was rewritten to match definition from
`GetLocaleScript` a bit more closely. To not confuse language and region
subtags, the leading language subtag is first removed before searching for
`unicode_region_subtag`.

GetLocaleVariants:
Uses the suggestion from code review in tc39#822. The leading "-" character
is removed for consistency with the other three new operations.

`get Intl.prototype.{language,script,region}` are now all simply calling
the new abstract operations to retrieve the subtags.

`ApplyOptionsToTag` uses the new operations to retrieve the subtags from
the original language tag when the corresponding option is absent. The
updated `languageId` is now manually constructed through string
concatenation instead of using subtag matching.

`IsStructurallyValidLanguageTag` now calls `GetLocaleVariants` to
retrieve the variant subtags. The variable `lang` was renamed to
`languageId` for consistency with the rest of the spec and because
`lang` can be more easily misinterpreted to stand for "language".

`CanonicalizeUnicodeLocaleId` was changed to fix the incorrect
redeclaration warning for `extension` from ecmarkup:
- Instead of using yet another way to retrieve the Unicode extension
  sequence, simply use the existing terms "Unicode locale extension
  sequence". (The existing term already makes sure that substrings in
  private-use subtags are ignored, so we don't have to worry about
  `pu_extensions`.
- "Unicode locale extension sequences" include the leading "-"
  character, so `newExtension` actually needs to be initialised with
  "-u".
@anba
Copy link
Contributor Author

anba commented Sep 25, 2023

Fixed all review suggestions (except the suggestion to use Number values in internal slots).

I've also updated

  • 06ce846 to the newest ecma262-biblio
  • 9a24c80 to the newest ecmarkup. Updating ecmarkup to version 18 required three additional changes:
    1. --css-out and --js-out options were removed from build options, instead --assets-dir has to be used. (--assets-dir seems to default to directory of the output file, so we could also simply not specify it.)
    2. Spec enums are now required to be in kebab-case. For example ~more-precision~ instead of ~morePrecision~.
    3. "ecmarkup.css" and "ecmarkup.js" are now automatically included, explicitly including them in "spec/index.html" triggers a build error.
  • 87cb0a7 which adds new abstract operations to extract language, script, and region subtags from locale tags. The new operations are used to address the previous review comments about working around incorrect ecmarkup redeclaration errors.

(Note: Updating ecmarkup also leads to using the new font.)

Copy link
Contributor

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more suggestions, but this looks good to me. Thanks @anba!

spec/locale.html Outdated Show resolved Hide resolved
spec/locale.html Outdated Show resolved Hide resolved
spec/locale.html Outdated Show resolved Hide resolved
spec/numberformat.html Outdated Show resolved Hide resolved
anba added a commit to anba/ecma402 that referenced this pull request Sep 27, 2023
Add new `GetLocale{Language,Script,Region,Variants}` operations to
retrieve the corresponding subtags from a locale tag.

These new operations are used in `ApplyOptionsToTag`,
`IsStructurallyValidLanguageTag`, and the `Intl.Locale.prototype`
accessor functions.

GetLocaleLanguage:
Returns the longest prefix matching `unicode_language_subtag`. The
previous definitions could be misinterpreted to match variant subtags
whose length is larger than the language subtag. For example in "en-basiceng"`
the longest substring matching `unicode_language_subtag` is "basiceng".

GetLocaleScript:
The previous definition from `Intl.Locale.prototype.script` is reused.

GetLocaleRegion:
Instead of using the previous definition from
`Intl.Locale.prototype.region`, it was rewritten to match definition from
`GetLocaleScript` a bit more closely. To not confuse language and region
subtags, the leading language subtag is first removed before searching for
`unicode_region_subtag`.

GetLocaleVariants:
Uses the suggestion from code review in tc39#822. The leading "-" character
is removed for consistency with the other three new operations.

`get Intl.prototype.{language,script,region}` are now all simply calling
the new abstract operations to retrieve the subtags.

`ApplyOptionsToTag` uses the new operations to retrieve the subtags from
the original language tag when the corresponding option is absent. The
updated `languageId` is now manually constructed through string
concatenation instead of using subtag matching.

`IsStructurallyValidLanguageTag` now calls `GetLocaleVariants` to
retrieve the variant subtags. The variable `lang` was renamed to
`languageId` for consistency with the rest of the spec and because
`lang` can be more easily misinterpreted to stand for "language".

`CanonicalizeUnicodeLocaleId` was changed to fix the incorrect
redeclaration warning for `extension` from ecmarkup:
- Instead of using yet another way to retrieve the Unicode extension
  sequence, simply use the existing terms "Unicode locale extension
  sequence". (The existing term already makes sure that substrings in
  private-use subtags are ignored, so we don't have to worry about
  `pu_extensions`.
- "Unicode locale extension sequences" include the leading "-"
  character, so `newExtension` actually needs to be initialised with
  "-u".
@anba
Copy link
Contributor Author

anba commented Sep 27, 2023

A few more suggestions, but this looks good to me.

Thanks, all fixed now.

anba added a commit to anba/ecma402 that referenced this pull request Sep 28, 2023
Add new `GetLocale{Language,Script,Region,Variants}` operations to
retrieve the corresponding subtags from a locale tag.

These new operations are used in `ApplyOptionsToTag`,
`IsStructurallyValidLanguageTag`, and the `Intl.Locale.prototype`
accessor functions.

GetLocaleLanguage:
Returns the longest prefix matching `unicode_language_subtag`. The
previous definitions could be misinterpreted to match variant subtags
whose length is larger than the language subtag. For example in "en-basiceng"`
the longest substring matching `unicode_language_subtag` is "basiceng".

GetLocaleScript:
The previous definition from `Intl.Locale.prototype.script` is reused.

GetLocaleRegion:
Instead of using the previous definition from
`Intl.Locale.prototype.region`, it was rewritten to match definition from
`GetLocaleScript` a bit more closely. To not confuse language and region
subtags, the leading language subtag is first removed before searching for
`unicode_region_subtag`.

GetLocaleVariants:
Uses the suggestion from code review in tc39#822. The leading "-" character
is removed for consistency with the other three new operations.

`get Intl.prototype.{language,script,region}` are now all simply calling
the new abstract operations to retrieve the subtags.

`ApplyOptionsToTag` uses the new operations to retrieve the subtags from
the original language tag when the corresponding option is absent. The
updated `languageId` is now manually constructed through string
concatenation instead of using subtag matching.

`IsStructurallyValidLanguageTag` now calls `GetLocaleVariants` to
retrieve the variant subtags. The variable `lang` was renamed to
`languageId` for consistency with the rest of the spec and because
`lang` can be more easily misinterpreted to stand for "language".

`CanonicalizeUnicodeLocaleId` was changed to fix the incorrect
redeclaration warning for `extension` from ecmarkup:
- Instead of using yet another way to retrieve the Unicode extension
  sequence, simply use the existing terms "Unicode locale extension
  sequence". (The existing term already makes sure that substrings in
  private-use subtags are ignored, so we don't have to worry about
  `pu_extensions`.
- "Unicode locale extension sequences" include the leading "-"
  character, so `newExtension` actually needs to be initialised with
  "-u".
anba added 20 commits October 4, 2023 17:18
Add new `GetLocale{Language,Script,Region,Variants}` operations to
retrieve the corresponding subtags from a locale tag.

These new operations are used in `ApplyOptionsToTag`,
`IsStructurallyValidLanguageTag`, and the `Intl.Locale.prototype`
accessor functions.

GetLocaleLanguage:
Returns the longest prefix matching `unicode_language_subtag`. The
previous definitions could be misinterpreted to match variant subtags
whose length is larger than the language subtag. For example in "en-basiceng"`
the longest substring matching `unicode_language_subtag` is "basiceng".

GetLocaleScript:
The previous definition from `Intl.Locale.prototype.script` is reused.

GetLocaleRegion:
Instead of using the previous definition from
`Intl.Locale.prototype.region`, it was rewritten to match definition from
`GetLocaleScript` a bit more closely. To not confuse language and region
subtags, the leading language subtag is first removed before searching for
`unicode_region_subtag`.

GetLocaleVariants:
Uses the suggestion from code review in tc39#822. The leading "-" character
is removed for consistency with the other three new operations.

`get Intl.prototype.{language,script,region}` are now all simply calling
the new abstract operations to retrieve the subtags.

`ApplyOptionsToTag` uses the new operations to retrieve the subtags from
the original language tag when the corresponding option is absent. The
updated `languageId` is now manually constructed through string
concatenation instead of using subtag matching.

`IsStructurallyValidLanguageTag` now calls `GetLocaleVariants` to
retrieve the variant subtags. The variable `lang` was renamed to
`languageId` for consistency with the rest of the spec and because
`lang` can be more easily misinterpreted to stand for "language".

`CanonicalizeUnicodeLocaleId` was changed to fix the incorrect
redeclaration warning for `extension` from ecmarkup:
- Instead of using yet another way to retrieve the Unicode extension
  sequence, simply use the existing terms "Unicode locale extension
  sequence". (The existing term already makes sure that substrings in
  private-use subtags are ignored, so we don't have to worry about
  `pu_extensions`.
- "Unicode locale extension sequences" include the leading "-"
  character, so `newExtension` actually needs to be initialised with
  "-u".
@anba
Copy link
Contributor Author

anba commented Oct 4, 2023

Updated again to resolve merge conflicts with latest changes to the main branch.

@ben-allen ben-allen merged commit 5afbb89 into tc39:master Oct 12, 2023
2 checks passed
@ben-allen
Copy link
Contributor

Thank you again so much for this!

ben-allen pushed a commit to ben-allen/ecma402 that referenced this pull request Oct 12, 2023
…r representations (tc39#822)

Sweeping changes across the entire spec to update ECMA-402 to use structured headers as used in ECMA-262

Also contains related refactoring, plus updates to represent numbers correctly/consistently
ryzokuken added a commit to ryzokuken/ecma402 that referenced this pull request Oct 12, 2023
…ct number representations (tc39#822)"

This reverts commit 5afbb89.

We reverted it to reuse the chunked commits anba initially posted.
ryzokuken pushed a commit to ryzokuken/ecma402 that referenced this pull request Oct 12, 2023
Add new `GetLocale{Language,Script,Region,Variants}` operations to
retrieve the corresponding subtags from a locale tag.

These new operations are used in `ApplyOptionsToTag`,
`IsStructurallyValidLanguageTag`, and the `Intl.Locale.prototype`
accessor functions.

GetLocaleLanguage:
Returns the longest prefix matching `unicode_language_subtag`. The
previous definitions could be misinterpreted to match variant subtags
whose length is larger than the language subtag. For example in "en-basiceng"`
the longest substring matching `unicode_language_subtag` is "basiceng".

GetLocaleScript:
The previous definition from `Intl.Locale.prototype.script` is reused.

GetLocaleRegion:
Instead of using the previous definition from
`Intl.Locale.prototype.region`, it was rewritten to match definition from
`GetLocaleScript` a bit more closely. To not confuse language and region
subtags, the leading language subtag is first removed before searching for
`unicode_region_subtag`.

GetLocaleVariants:
Uses the suggestion from code review in tc39#822. The leading "-" character
is removed for consistency with the other three new operations.

`get Intl.prototype.{language,script,region}` are now all simply calling
the new abstract operations to retrieve the subtags.

`ApplyOptionsToTag` uses the new operations to retrieve the subtags from
the original language tag when the corresponding option is absent. The
updated `languageId` is now manually constructed through string
concatenation instead of using subtag matching.

`IsStructurallyValidLanguageTag` now calls `GetLocaleVariants` to
retrieve the variant subtags. The variable `lang` was renamed to
`languageId` for consistency with the rest of the spec and because
`lang` can be more easily misinterpreted to stand for "language".

`CanonicalizeUnicodeLocaleId` was changed to fix the incorrect
redeclaration warning for `extension` from ecmarkup:
- Instead of using yet another way to retrieve the Unicode extension
  sequence, simply use the existing terms "Unicode locale extension
  sequence". (The existing term already makes sure that substrings in
  private-use subtags are ignored, so we don't have to worry about
  `pu_extensions`.
- "Unicode locale extension sequences" include the leading "-"
  character, so `newExtension` actually needs to be initialised with
  "-u".
ben-allen pushed a commit to tc39/proposal-intl-duration-format that referenced this pull request Oct 16, 2023
@anba anba deleted the structured branch June 20, 2024 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants