-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change Variants to be a ShortVec<Variant> #1988
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not make ShortVec
be public. I requested feedback from Zibi.
So, this will change Variants from |
It changes to This is what we want because it enables us to const-construct more types of locales without needing a heap allocation. |
|
This is internal, so we can switch in the future if a need arises. That said, I recall similar questions coming up before, and my recollection of the conclusion was that we could not identify a concrete reason to favor In any case, I hope that this can be in a follow-up. |
Looks like CI has been failing since this PR landed. |
The merge annotation here says "27 checks passed", but https://github.com/unicode-org/icu4x/commits/main shows a red x next to the changeset landed here. |
I added this test in #1950, and the CI on this PR had ran before that, so it only go caught by the post-merge CI. That said, I added this accidentally, I only wanted to see how big this struct actually was. Now that it's there though I guess we can keep it. Now, do we want to reduce the size of it though? Over 200 bytes is a lot for something that is usually less than 10 bytes of actual data. |
I'd love to try to fix it. I think it's the hottest item in the whole ICU4X. |
can we consider reverting this as well? |
We haven't discussed stack size of locale as being an issue before. This PR changed the stack size from 184 (already quite big) to 216, which is only a 17% increase. Reverting this PR will not fix the underlying problem, which is a deeply nested struct layout that probably results in a lot of wasted bytes. If we want to start measuring the stack size of Locale, and in general of other ICU4X types, then that should be the subject of another issue. FYI, the purpose of this PR is to make more things in Locale be const-constructible, which is an impactful issue to solve because we can start writing We're already optimizing for a lot of things (performance, code size, heap memory, developer experience, i18n correctness), so before we add another thing to that list, we should make sure we can justify why it is an impactful thing to measure. |
The tensions we're evaluating here is on the triangular spectrum between memory, functionality and performance. In case of Locale we have a bimodal distribution of how common parts of the Locale are used. We can use this information to inform where we place ourselves on the tradeoffs - although we should be careful not to create self-fulfilling prophecies. I believe that the ICU4C API and ECMA-402 API create quite a few so when looking at past experiences we should be careful when what is possible/easy happens to be also common. Back to this issue. Large software (Firefox) will operate on large volume of locales - either directly around the codebase, or indirectly via ICU4X requests that internally will operate on them. Almost all of them will be repetitive "en-US", "de" and "fr" types (so, actually LanguageIdentifiers), as they load a list of 100 locales available to negotiate against 1-3 that the user requested. The requested ones can get more sophisticated as unicode extensions may be stored in user preferences, but the 100-[1-3] ratio I think is going to be quite common. My rule-of-thumb here is that there is an order of priorities when considering speed/size/functionality - Language, right after it Region, then much further down Script, and much much much further down Variants and Extensions. If we want to introduce a single threshold, I think LSR deserves all the attention, and Variants+Extensions can be slow and assumed to be rare/exceptional. For that reason I'm comfortable with focusing on const-constructed, hyper-optimized LSR components, and would prefer to minimize the cost of carrying variants+extensions around. You are correct that we never specified that when discussing memory cost we mean heap or stack or both but in all cases, I'd like to try to make sure that when we create 1000 locales, we minimize the cost, both in heap and stack, and paying for variants, when they are exceptions, seems like tipping us out of the golden-spot that my mental model leads me to. I'm comfortable with that being questioned, and maybe there is a particular value in having variants be const-constructed, but I'd like to avoid paying increased stack size of Locale for const-constructed subtags that are almost never used and are not required for optimal performance on hot path. This feels like a wrong tradeoff outcome here, but my position is weakly held. |
FYI, the large majority of the stack size of Locale is It's true that this PR likely increases the stack size of |
A quick investigation suggests that using |
I filed an issue explaining how we can fully eliminate the stack size regression without sacrificing functionality. |
PR 1/2 for issue #1800