Modify patterns from a skeleton match to have the correct width #832

gregtatum · 2021-06-28T17:09:54Z

Resolves #584.

coveralls · 2021-06-28T17:31:41Z

Pull Request Test Coverage Report for Build a51364a4f74669190872dd6e0ea8e3bd91dac4bb-PR-832

20 of 23 (86.96%) changed or added relevant lines in 1 file are covered.
2 unchanged lines in 1 file lost coverage.
Overall coverage increased (+0.06%) to 74.938%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
components/datetime/src/skeleton.rs	20	23	86.96%

Files with Coverage Reduction	New Missed Lines	%
components/datetime/src/skeleton.rs	2	82.04%

Totals
Change from base Build 67bd340dd7cb6e1a958ceb36f8f1d4e73c63742e:	0.06%
Covered Lines:	9694
Relevant Lines:	12936

💛 - Coveralls

codecov-commenter · 2021-06-29T18:16:23Z

Codecov Report

Merging #832 (2ba7ad7) into main (67bd340) will increase coverage by 0.06%.
The diff coverage is 86.95%.

@@            Coverage Diff             @@
##             main     #832      +/-   ##
==========================================
+ Coverage   74.80%   74.86%   +0.06%     
==========================================
  Files         198      198              
  Lines       12762    12778      +16     
==========================================
+ Hits         9546     9566      +20     
+ Misses       3216     3212       -4

Impacted Files	Coverage Δ
components/datetime/src/options/components.rs	`76.72% <ø> (+0.86%)`	⬆️
components/datetime/src/skeleton.rs	`82.03% <86.95%> (+0.47%)`	⬆️
experimental/provider_ppucd/src/parse_ppucd.rs	`93.13% <0.00%> (+0.13%)`	⬆️
components/datetime/src/fields/symbols.rs	`68.82% <0.00%> (+0.76%)`	⬆️
components/datetime/src/provider/helpers.rs	`80.18% <0.00%> (+0.94%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 67bd340...2ba7ad7.

sffc

Not really requesting changes, but I have a question before I approve

sffc · 2021-07-20T23:53:33Z

components/datetime/src/skeleton.rs

+                // There's no match, or this is a string literal return the original item.
+                item.clone()
+            })
+            .collect::<Vec<PatternItem>>(),


Thought: I wish we could do this without alloc. I liked how before this function was basically a projection from &'a SkeletonsV1 to &'a Pattern. The extra allocation is going to hurt basically every benchmark we're after: memory usage, code size, and performance.

Suggestion: Could this be architected in a way that involves a map on the pattern itself? Like, you could attach a mapping function to the pattern, such that when you loop over the pattern while formatting, you run the PatternItems through the mapping function.

Alternatively, is this in the part of the code that we won't use at runtime after the new CLDR 70 skeleton algorithm is implemented? I don't care about the Vec if it's run in the transformer.
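To make the allocation concern and the mapping suggestion above concrete, here is a minimal sketch. `PatternItem` is a simplified, hypothetical stand-in rather than the crate's real type, and `adjust_eager` / `adjust_lazy` are illustrative names that do not appear in the PR. The eager version mirrors the `.collect::<Vec<PatternItem>>()` shape of the diff; the lazy version only rewrites items as they are iterated.

```rust
use std::borrow::Cow;

// Hypothetical, simplified stand-in for the crate's pattern item type.
#[derive(Clone, Debug, PartialEq)]
enum PatternItem {
    // A date/time field: a symbol such as 'M' and a width (repeat count).
    Field { symbol: char, width: u8 },
    // Literal text that is copied to the output unchanged.
    Literal(String),
}

// Eager approach: clones every item into a freshly allocated Vec.
fn adjust_eager(items: &[PatternItem], symbol: char, width: u8) -> Vec<PatternItem> {
    items
        .iter()
        .map(|item| match item {
            PatternItem::Field { symbol: s, .. } if *s == symbol => {
                PatternItem::Field { symbol, width }
            }
            // No match, or a string literal: return a copy of the original item.
            other => other.clone(),
        })
        .collect()
}

// Lazy approach: a mapping applied while iterating. Unchanged items are
// borrowed, and only a rewritten field is materialized (on the stack).
fn adjust_lazy<'a>(
    items: &'a [PatternItem],
    symbol: char,
    width: u8,
) -> impl Iterator<Item = Cow<'a, PatternItem>> + 'a {
    items.iter().map(move |item| match item {
        PatternItem::Field { symbol: s, width: w } if *s == symbol && *w != width => {
            Cow::Owned(PatternItem::Field { symbol, width })
        }
        other => Cow::Borrowed(other),
    })
}
```

A formatter could loop over `adjust_lazy(&items, 'M', 4)` directly and never build an intermediate `Vec`. Whether this shape fits the real formatting loop is an open question that this sketch does not answer.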

gregtatum · 2021-07-21

> The extra allocation is going to hurt basically every benchmark we're after.

Yes, I agree that the allocation isn't great, and it's what I tried to design around in the initial implementation.

> Suggestion: Could this be architected in a way that involves a map on the pattern itself?

The pattern must be mutated in order to support all of the features for DateTimeFormat. Beyond just the widths, there's also the hour cycle to consider, as well as append items support for things like time zones or week. The latter is even trickier, as a simple mapping function wouldn't work.

I would prefer to accept the perf/memory regression for 0.3 and follow up by exploring some approaches, especially as pattern mutation is a new requirement for other features such as the hour cycle and append items. Then for 0.4 I can track fixing it.

> Alternatively, is this in the part of the code that we won't use at runtime after the new CLDR 70 skeleton algorithm is implemented? I don't care about the Vec if it's run in the transformer.

I don't think we'll want a literal pattern for every width adjustment, as it will greatly affect the provider data. I'd say let's follow up by looking for perf wins here. I'm nervous about the trade-off between runtime characteristics and data payload size. I think it's something we should definitely try to work out.
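The distinction drawn in this reply can be sketched as follows: width and hour-cycle changes rewrite items one-for-one, while append items change the number of items, which a plain per-item mapping cannot express. The type and the function names (`set_width`, `set_hour_symbol`, `append_field`) are hypothetical and heavily simplified; real hour-cycle handling also involves day-period fields, and real append items are driven by locale data rather than a fixed separator.

```rust
// Hypothetical, simplified stand-in for the crate's pattern item type.
#[derive(Clone, Debug, PartialEq)]
enum PatternItem {
    Field { symbol: char, width: u8 },
    Literal(String),
}

// 1:1 rewrite: adjust the width of every field with the given symbol.
fn set_width(items: &mut [PatternItem], symbol: char, width: u8) {
    for item in items.iter_mut() {
        if let PatternItem::Field { symbol: s, width: w } = item {
            if *s == symbol {
                *w = width;
            }
        }
    }
}

// 1:1 rewrite: replace any hour symbol (h, H, K, k) with the requested one.
// Assumption: day-period handling is ignored in this sketch.
fn set_hour_symbol(items: &mut [PatternItem], hour_symbol: char) {
    for item in items.iter_mut() {
        if let PatternItem::Field { symbol, .. } = item {
            if matches!(*symbol, 'h' | 'H' | 'K' | 'k') {
                *symbol = hour_symbol;
            }
        }
    }
}

// Not 1:1: appending a field the matched pattern lacks (for example a time
// zone) adds new items, so the pattern's length changes.
fn append_field(items: &mut Vec<PatternItem>, separator: &str, symbol: char, width: u8) {
    items.push(PatternItem::Literal(separator.to_string()));
    items.push(PatternItem::Field { symbol, width });
}
```

The first two functions could in principle be expressed as a per-item mapping, whereas `append_field` needs an owned, growable pattern or some other way to splice in extra items.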

gregtatum · 2021-07-21

I filed #877 as a follow-up, if you are OK with moving forward with this patch.

sffc · 2021-07-21

Okay. I won't block this PR, but I'm a bit disappointed that we're going in this direction. I feel like we're repeating the problems of ICU4C rather than solving them. I hope we can follow up in #877.

gregtatum · 2021-07-21

Well, this at least gets us a baseline metric that we can track, and tests in place that make sure we are compliant with the behavior. The code is mutable, so we can course correct to fewer allocations. I also filed #879 to make sure we track this area better. I realize now that the benchmark is still needed.

Follow-up issue

gregtatum added 4 commits June 28, 2021 11:50

Remove outdated TODO comments

b460b86

Add a mechanism to map the Pattern

6f6ad42

Implement width expansion

aeed70b

Add more tests enumerating the options for month length expansion

72ca6de

gregtatum requested a review from sffc June 28, 2021 17:09

gregtatum requested a review from zbraniecki as a code owner June 28, 2021 17:09

Fix clippy errors

2ba7ad7

gregtatum mentioned this pull request Jul 2, 2021

Add initial support for timezones in component::Bag #845

Merged

gregtatum added a commit to gregtatum/icu4x that referenced this pull request Jul 2, 2021

SQUASHED unicode-org#832 - Ignore this, it's "Modify patterns from a skeleton match to have the correct width"

3319f89

gregtatum mentioned this pull request Jul 2, 2021

Correctly apply the hour cycle in the components::Bag #846

Merged

zbraniecki approved these changes Jul 2, 2021

gregtatum added a commit to gregtatum/icu4x that referenced this pull request Jul 12, 2021

SQUASHED unicode-org#832 - Ignore this, it's "Modify patterns from a skeleton match to have the correct width"

3881d35

sffc added this to the ICU4X 0.3 milestone Jul 15, 2021

sffc previously requested changes Jul 20, 2021

gregtatum mentioned this pull request Jul 21, 2021

Explore fixes to perf/memory regressions around cloning/mutating patterns in components::Bag #877

Open

gregtatum mentioned this pull request Jul 21, 2021

Create benchmarks for components::Bag #879

Closed

gregtatum merged commit 4052c2e into unicode-org:main Jul 21, 2021

gregtatum mentioned this pull request Jul 22, 2021

Teach length::Bag how to switch hour cycles #840

Merged

sffc left a comment

sffc Jul 20, 2021

gregtatum Jul 21, 2021

gregtatum Jul 21, 2021

sffc Jul 21, 2021

gregtatum Jul 21, 2021
