Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Variants to be a ShortVec<Variant> #1988

Merged
merged 2 commits into from
Jun 14, 2022
Merged

Conversation

yzhang1994
Copy link
Contributor

PR 1/2 for issue #1800

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not make ShortVec be public. I requested feedback from Zibi.

@zbraniecki
Copy link
Member

So, this will change Variants from Box<[Variant]> to Vec<Variant>. Is that what we want?

@sffc
Copy link
Member

sffc commented Jun 6, 2022

So, this will change Variants from Box<[Variant]> to Vec<Variant>. Is that what we want?

It changes to ShortVec<Variant>, not Vec<Variant>. ShortVec is optimized for vectors of length 0 and 1.

This is what we want because it enables us to const-construct more types of locales without needing a heap allocation.

@zbraniecki
Copy link
Member

SortVec is an enum that has Vec. Maybe what I'm asking is if we should consider having ShortStaticVec that has Empty, One, Box<[]> ?

@sffc
Copy link
Member

sffc commented Jun 6, 2022

SortVec is an enum that has Vec. Maybe what I'm asking is if we should consider having ShortStaticVec that has Empty, One, Box<[]> ?

This is internal, so we can switch in the future if a need arises.

That said, I recall similar questions coming up before, and my recollection of the conclusion was that we could not identify a concrete reason to favor Box<[T]> over Vec<T>. If there is evidence to the contrary, could you share it?

In any case, I hope that this can be in a follow-up.

@yzhang1994 yzhang1994 marked this pull request as ready for review June 8, 2022 23:55
@yzhang1994 yzhang1994 requested a review from nciric as a code owner June 8, 2022 23:55
@yzhang1994 yzhang1994 requested a review from sffc June 8, 2022 23:56
@Manishearth Manishearth removed the request for review from nciric June 14, 2022 22:50
@sffc sffc merged commit d094e04 into unicode-org:main Jun 14, 2022
@hsivonen
Copy link
Member

Looks like CI has been failing since this PR landed.

@hsivonen
Copy link
Member

The merge annotation here says "27 checks passed", but https://github.com/unicode-org/icu4x/commits/main shows a red x next to the changeset landed here.

@robertbastian
Copy link
Member

I added this test in #1950, and the CI on this PR had ran before that, so it only go caught by the post-merge CI.

That said, I added this accidentally, I only wanted to see how big this struct actually was. Now that it's there though I guess we can keep it.

Now, do we want to reduce the size of it though? Over 200 bytes is a lot for something that is usually less than 10 bytes of actual data.

@zbraniecki
Copy link
Member

Now, do we want to reduce the size of it though? Over 200 bytes is a lot for something that is usually less than 10 bytes of actual data.

I'd love to try to fix it. I think it's the hottest item in the whole ICU4X.

@zbraniecki
Copy link
Member

can we consider reverting this as well?

@sffc
Copy link
Member

sffc commented Jun 16, 2022

can we consider reverting this as well?

We haven't discussed stack size of locale as being an issue before. This PR changed the stack size from 184 (already quite big) to 216, which is only a 17% increase. Reverting this PR will not fix the underlying problem, which is a deeply nested struct layout that probably results in a lot of wasted bytes.

If we want to start measuring the stack size of Locale, and in general of other ICU4X types, then that should be the subject of another issue.

FYI, the purpose of this PR is to make more things in Locale be const-constructible, which is an impactful issue to solve because we can start writing locale!("ca-ES-valencia") and eventually locale!("en-u-ca-hebrew"). This benefits developer experience and i18n correctness since it encourages the use of extension keywords.

We're already optimizing for a lot of things (performance, code size, heap memory, developer experience, i18n correctness), so before we add another thing to that list, we should make sure we can justify why it is an impactful thing to measure.

@zbraniecki
Copy link
Member

The tensions we're evaluating here is on the triangular spectrum between memory, functionality and performance.

In case of Locale we have a bimodal distribution of how common parts of the Locale are used. We can use this information to inform where we place ourselves on the tradeoffs - although we should be careful not to create self-fulfilling prophecies. I believe that the ICU4C API and ECMA-402 API create quite a few so when looking at past experiences we should be careful when what is possible/easy happens to be also common.

Back to this issue. Large software (Firefox) will operate on large volume of locales - either directly around the codebase, or indirectly via ICU4X requests that internally will operate on them.
When I was starting the foundational component of ICU4X - unic-locale crate, I collected data on locales created during startup of Gecko+Firefox.
If I remember correctly there were close to 1000 locales created.

Almost all of them will be repetitive "en-US", "de" and "fr" types (so, actually LanguageIdentifiers), as they load a list of 100 locales available to negotiate against 1-3 that the user requested. The requested ones can get more sophisticated as unicode extensions may be stored in user preferences, but the 100-[1-3] ratio I think is going to be quite common.

My rule-of-thumb here is that there is an order of priorities when considering speed/size/functionality - Language, right after it Region, then much further down Script, and much much much further down Variants and Extensions.

If we want to introduce a single threshold, I think LSR deserves all the attention, and Variants+Extensions can be slow and assumed to be rare/exceptional.

For that reason I'm comfortable with focusing on const-constructed, hyper-optimized LSR components, and would prefer to minimize the cost of carrying variants+extensions around.

You are correct that we never specified that when discussing memory cost we mean heap or stack or both but in all cases, I'd like to try to make sure that when we create 1000 locales, we minimize the cost, both in heap and stack, and paying for variants, when they are exceptions, seems like tipping us out of the golden-spot that my mental model leads me to.

I'm comfortable with that being questioned, and maybe there is a particular value in having variants be const-constructed, but I'd like to avoid paying increased stack size of Locale for const-constructed subtags that are almost never used and are not required for optimal performance on hot path.

This feels like a wrong tradeoff outcome here, but my position is weakly held.
Does anyone offer different mental model, particular use case, or believes that the tradeoff should be accepted here?

@sffc
Copy link
Member

sffc commented Jun 16, 2022

FYI, the large majority of the stack size of Locale is extensions::Extensions. I added more size tests to demonstrate: #2078

It's true that this PR likely increases the stack size of Variants, which likely impacts LanguageIdentifier to a greater percentage degree than Locale. I may do a quick check to see if there's a way to mitigate that cost in ShortVec.

@sffc
Copy link
Member

sffc commented Jun 16, 2022

A quick investigation suggests that using Box<[T]> instead of Vec<T>, as @zbraniecki suggested, positively improves stack size (presumably by removing the capacity slot). We can get Locale down from 216 to 200 by making this change, or 50% of the regression introduced by using ShortVec.

@sffc
Copy link
Member

sffc commented Jun 16, 2022

I filed an issue explaining how we can fully eliminate the stack size regression without sacrificing functionality.

#2084

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants