Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Teach length::Bag how to switch hour cycles #840

Merged
merged 21 commits into from
Jul 28, 2021

Conversation

gregtatum
Copy link
Member

@gregtatum gregtatum commented Jun 30, 2021

Edit: This is now ready for review. Please ignore the first commit as it is #841, which I plan on landing first.

Resolves #671

Time zones are going to be required for full support here, as the generated patterns do not use the time zones. I commented out a check for unsupported symbols temporarily to make everything work, but the results can be seen in the generated data. What you'll see here is that I opted for generated 2 new sets of patterns, one for h11/h12, and one for h23/h24. This is in addition to the existing styles. the existing time style represents the localizer's choice for the best hour cycle. The time_h11_h12 and time_h23_24 represent the patterns for those specific preferences. There is a bit of duplicated data doing it this way, but I felt it was more correct as it allows localizers to specify any preference for hour cycle, and it also allows for specific overrides.

There is an assumption that a h11 and h12 symbol can be swapped out correctly in a pattern, as well as a h23 and h24 can be swapped. However, h11 cannot be swapped with with h23 etc.

I do the pattern generation code in the provider to pre-generate the data, rather than requiring the pattern generator to be run at runtime for styles. I anticipate a small regression to memory for this approach, as a trade-off of a bigger one if this pattern manipulation was done at runtime.

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/datetime/src/provider/helpers.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@codecov-commenter
Copy link

codecov-commenter commented Jun 30, 2021

Codecov Report

Merging #840 (8e56e10) into main (cd4b7c5) will increase coverage by 0.53%.
The diff coverage is 95.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #840      +/-   ##
==========================================
+ Coverage   74.25%   74.78%   +0.53%     
==========================================
  Files         211      207       -4     
  Lines       12416    12507      +91     
==========================================
+ Hits         9219     9353     +134     
+ Misses       3197     3154      -43     
Impacted Files Coverage Δ
components/datetime/src/options/length.rs 25.00% <ø> (ø)
components/datetime/src/options/preferences.rs 58.33% <ø> (ø)
components/icu/src/lib.rs 100.00% <ø> (ø)
components/datetime/src/pattern/mod.rs 67.69% <83.33%> (+0.75%) ⬆️
...nents/datetime/src/pattern/transform_hour_cycle.rs 88.23% <88.23%> (ø)
provider/cldr/src/transform/dates/patterns.rs 85.20% <96.36%> (+5.20%) ⬆️
components/datetime/src/skeleton.rs 85.14% <97.05%> (+3.17%) ⬆️
components/datetime/src/provider/gregory.rs 64.00% <100.00%> (+0.98%) ⬆️
components/datetime/src/provider/helpers.rs 82.78% <100.00%> (+2.41%) ⬆️
provider/testdata/src/metadata.rs
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd4b7c5...8e56e10. Read the comment docs.

@coveralls
Copy link

coveralls commented Jun 30, 2021

Pull Request Test Coverage Report for Build fa138f514a7750f40317feec08c698f8a8540a5c-PR-840

  • 178 of 187 (95.19%) changed or added relevant lines in 6 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.4%) to 74.842%

Changes Missing Coverage Covered Lines Changed/Added Lines %
components/datetime/src/pattern/mod.rs 5 6 83.33%
components/datetime/src/skeleton.rs 66 68 97.06%
provider/cldr/src/transform/dates/patterns.rs 53 55 96.36%
components/datetime/src/pattern/transform_hour_cycle.rs 30 34 88.24%
Files with Coverage Reduction New Missed Lines %
components/datetime/src/options/length.rs 1 25.0%
Totals Coverage Status
Change from base Build cd4b7c536c0b573aa68510bc8e1776fa78b1c0fd: 0.4%
Covered Lines: 9484
Relevant Lines: 12672

💛 - Coveralls

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/datetime/src/fields/symbols.rs is now changed in the branch
  • components/datetime/src/options/components.rs is now changed in the branch
  • components/datetime/src/options/preferences.rs is now changed in the branch
  • components/datetime/src/pattern/mod.rs is different
  • components/datetime/src/pattern/parser.rs is now changed in the branch
  • components/datetime/src/provider/gregory.rs is now changed in the branch
  • components/datetime/src/provider/helpers.rs is different
  • components/datetime/src/skeleton.rs is different
  • components/datetime/tests/datetime.rs is different
  • components/datetime/tests/fixtures/tests/components-exact-matches.json is now changed in the branch
  • components/datetime/tests/fixtures/tests/lengths_with_zones_from_pdt.json is now changed in the branch
  • components/datetime/tests/fixtures/tests/skeletons.bin is now changed in the branch
  • provider/cldr/src/transform/dates/patterns.rs is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/ar-EG.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/ar.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/bn.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/ccp.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/en-001.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/en-ZA.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/en.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/es-AR.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/es.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/fr.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/ja.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/ru.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/sr-Cyrl.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/sr-Latn.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/sr.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/th.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/tr.json is now changed in the branch
  • provider/testdata/data/json/date_patterns/gregory@1/und.json is now changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/datetime/src/skeleton.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@gregtatum gregtatum marked this pull request as ready for review July 1, 2021 16:53
@gregtatum gregtatum requested review from sffc and zbraniecki as code owners July 1, 2021 16:53
@zbraniecki
Copy link
Member

Your new sets - time_h11_h12 and time_h23_h24 are always a pair of one duplicated and one non-duplicated. Could we skip the duplicated one?

@gregtatum
Copy link
Member Author

I took a stab at de-duplicating it. The risk is that lengths mix a h11h12 with a h23h24. In that case I have the provider transform panic. I checked the CLDR data, and no locales are currently doing this. It saves 120 bytes but adds a bit more complexity to the implementation.

commit total allocations max memory
de-duplicated 53,718 bytes (+197) 26,492 bytes (+197)
duplicated 53,838 bytes (+317) 26,612 bytes (+317)
main branch 53,521 bytes 26,295 bytes

components/datetime/src/provider/gregory.rs Outdated Show resolved Hide resolved
components/datetime/src/pattern/mod.rs Outdated Show resolved Hide resolved
components/datetime/src/pattern/mod.rs Outdated Show resolved Hide resolved
components/datetime/src/pattern/mod.rs Outdated Show resolved Hide resolved
components/datetime/src/skeleton.rs Show resolved Hide resolved
@gregtatum gregtatum requested a review from a team as a code owner July 2, 2021 16:50
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/datetime/src/fields/symbols.rs is no longer changed in the branch
  • components/datetime/src/options/components.rs is no longer changed in the branch
  • components/datetime/src/pattern/mod.rs is different
  • components/datetime/src/pattern/parser.rs is no longer changed in the branch
  • components/datetime/src/pattern/transform_hour_cycle.rs is different
  • components/datetime/src/skeleton.rs is different
  • components/datetime/tests/fixtures/tests/components-exact-matches.json is no longer changed in the branch
  • components/datetime/tests/fixtures/tests/lengths_with_zones_from_pdt.json is no longer changed in the branch
  • components/datetime/tests/fixtures/tests/skeletons.bin is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! I like how you were able to support the hour cycle preferences at runtime with a very small amount of code, front-loading most of the work to the transformer.

Just some nits.

components/datetime/src/pattern/mod.rs Outdated Show resolved Hide resolved
components/datetime/Cargo.toml Outdated Show resolved Hide resolved
@gregtatum gregtatum dismissed stale reviews from sffc and zbraniecki via 5963720 July 22, 2021 13:55
@gregtatum
Copy link
Member Author

I did an ugly fix due to merging in #832. The widths for hour cycles were expanding in ways I didn't want them to. The get_best_available_format_pattern wasn't really giving me the results I wanted, as it would respect my width choice. I am debating whether it would be better to have get_best_available_format_pattern not take fields: &[Field] to operate on, but rather a special subset of the fields, defined as fields: &[SkeletonField].

The skeleton fields do not match 1:1 with pattern fields, and I feel like I'm constantly designing around that. I wonder if I shouldn't file a follow-up issue to tackle this.

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/datetime/src/pattern/transform_hour_cycle.rs is different
  • components/datetime/src/provider/helpers.rs is different
  • components/datetime/src/skeleton.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@gregtatum
Copy link
Member Author

Apologies for the force push, but I consolidated my changes after the approval to make it easier to review. The changes are only e7ee7e3.

@gregtatum gregtatum requested a review from sffc July 22, 2021 15:50
sffc
sffc previously approved these changes Jul 22, 2021
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good; I defer to Zibi to review the details of the business logic

zbraniecki
zbraniecki previously approved these changes Jul 27, 2021
zbraniecki
zbraniecki previously approved these changes Jul 28, 2021
@gregtatum gregtatum merged commit 3d49bb1 into unicode-org:main Jul 28, 2021
@robertbastian
Copy link
Member

There is an assumption that a h11 and h12 symbol can be swapped out correctly in a pattern, as well as a h23 and h24 can be swapped. However, h11 cannot be swapped with with h23 etc.

This assumption seems to break in CLDR 43

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `H23H24`,
 right: `H11H12`: A locale contained a mix of coarse hour cycle types', provider/datagen/src/transform/cldr/datetime/patterns.rs:140:17

This was referenced Mar 14, 2023
@sffc
Copy link
Member

sffc commented Mar 16, 2023

Possibly related:

tc39/ecma402#758

@sffc
Copy link
Member

sffc commented Mar 20, 2023

@robertbastian What is the locale causing the error? What change in CLDR between 42 and 43 is causing the error? That information can help us decide the best approach to fixing it.

@robertbastian
Copy link
Member

Do we consider this a CLDR bug, or is this mixed assumption something we can drop?

@srl295
Copy link
Member

srl295 commented Mar 21, 2023

As with some other issues, byn and ssy aren’t at basic level yet. So in some sense, it’s not a bug, because the locale coverage mechanism has already warned you about the locale.

Looking at these specifically, the issue is that of the four timeFormats, two or three of them are unconfirmed—they need more votes. By default, unconfirmed are dropped from cldr-json. Therefore the patterns with H fall through from root.

I added a comment to https://unicode-org.atlassian.net/browse/CLDR-11836?focusedCommentId=168839 for this issue under logical grouping errors.

That said, this is also probably a reasonable assumption that hourcycles are consistent. I don't know off hand whether there's a separate test or ticket for that. That would be fine to file as a ticket, for next time (Cross reference the logical group issue)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hour Cycle Preferences ignored
8 participants