-
Notifications
You must be signed in to change notification settings - Fork 0
WIP: getDefaultLocale and getDefaultTimeZone #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dilijev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments on individual commits. Looks like you're on the right track!
85be649 to
f28deb8
Compare
| const auto locale = _u("en-US"); | ||
| const size_t len = 5; | ||
| if (lpLocaleName) | ||
| UErrorCode error = U_ZERO_ERROR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use UErrorCode::U_ZERO_ERROR to be explicit, and consistent with other places (I've made this consistent in my PR as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that be using a C++ enum class? Are we still going forward with adding more C++ until we resolve the system ICU issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't require an enum class but does require a C++ compiler (which we do). In C++ all enum values are namespace-scoped to their enclosing enum.
Side note: I'm not sure we can resolve the system ICU issue (might not be possible and we just have to build it in any case) but for now we rely on the C++ APIs.
In summary IMO we should still be consistent and use the scope.
| int32_t written = uloc_toLanguageTag(uloc_getDefault(), bcp47, ULOC_FULLNAME_CAPACITY, true, &error); | ||
|
|
||
| // REVIEW (jahorto): if getting the language tag for NULL (default) fails, is there any backup? | ||
| if (!U_SUCCESS(error) || written <= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use U_FAILURE
| locale: resolved, | ||
| subTags: subTags, | ||
| localeWithoutSubtags: resolved | ||
| localeWithoutSubtags: withoutSubTags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I moved the resolved logic out into its own if/else block, resolved here includes the subtags. withoutSubTags is just whatever resolved was before the if block runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Thanks.
| // making these vars in JS allows us to more change how we | ||
| // determine the backing library | ||
| let isICU = !platform.winglob; | ||
| let isWinGlob = platform.winglob; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick (Don't change anything unless you've already got another change that will cause you to regenerate bytecode anyway)
Maybe be more explicit with these names since they will be unscoped in Intl.js (with anything like platform.isICU). Maybe platform.winglob was unfortunate naming for a boolean as well, but don't update that.
I'd suggest isPlatformUsingICU or something.
| locale = platform.getDefaultLocale(); | ||
| const match = platform.builtInRegexMatch(locale, /-u(-[^\-][^\-]?-[^\-]+)*(-co-[^\-]+).*/); | ||
| if (match) { | ||
| collation = callInstanceFunc(StringInstanceReplace, match[2], "-co-", ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than doing another string operation, write the regex so that it captures the part you want, like this: (-co-([^\-]+)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also what if you got a match but there was no matching group where you expected? -co- is forced to match, as well as the value, and we're assuming a well-formed locale from a system API, but you never know. Food for thought.
If you end up finding that there might a problem here: maybe add an extra check that the index is on the object before using the returned value as if it were a string (it might be undefined) if that group wasn't matched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, if match exists, match[2] must exist, since its not an optional capture group. It must be equal to the collation value, because it seems like the first capture group matches only the last instance of itself in the resulting match array (I changed the regex slightly to only capture the actual collation value):
> "en-US-u-co-thai-ca-calendar-so-methingelse-co-phonebk".match(/-u(-[^\-][^\-]?-[^\-]+)*-co-([^\-]+).*/)
[ '-u-nu-thai-ca-calendar-so-methingelse-co-phonebk',
'-so-methingelse',
'phonebk',
index: 5,
input: 'en-US-u-co-thai-ca-calendar-so-methingelse-co-phonebk' ]
> "en-US-u-nu-thai-co-phonebk-ca-calendar".match(/-u(-[^\-][^\-]?-[^\-]+)*-co-([^\-]+).*/)
[ '-u-nu-thai-co-phonebk-ca-calendar',
'-nu-thai',
'phonebk',
index: 5,
input: 'en-US-u-nu-thai-co-phonebk-ca-calendar' ]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to remove the -co-phonebk from that string after extracting it? In that case you'll also need to optionally capture any extensions before and after.
Keep in mind that there are other singletons followed by keys like the private use -x i.e. -x-aa-aavalue. The likelihood you'd be dealing with that in this scenario is pretty minimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My new version extracts "phonebk" and then will do locale = locale.replace(`-co-${match[2]}\`, ""). We may end up with lang-tag-u at the end if it was the only Unicode extension, but I believe we handle that later?
| } | ||
|
|
||
| if (collation === undefined || availableBcpTags === undefined) { return locale; } | ||
| if (collation === undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does removing availableBcpTags === undefined apply in general to ICU and WinGlob?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just moved this check below, availableBcpTags became collationMapForLocale because I thought it was a bit clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying!
| return locale + "-u-co-" + bcpTag; | ||
| const collationMapForLocale = reverseLocaleAcceptingCollationValues[locale]; | ||
| if (collationMapForLocale === undefined) { | ||
| // Assume the system wouldnt give us back a bad collation value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: apostophe in wouldn't
…p for tracing and finalizing Create macros that allow for CBase and other types of objects to use the templated ThreadAlloc's for allocation. Use the macro for WebGLActiveInfo. Convert AddRef/Release to RootAddRef/RootRelease calls Virtual all non-public ref counting methods on CBase and abandon if they are called neuter JSBind_* methods related to refcounting Implement a Trace that simply traces the Var (which we actually know will already be marked due to how this object is reachable) Add a basic leak detector to the new type (interim, more as a sanity check) Modify callers creating this type of object to just new into a raw pointer on the stack. Call Passivate (if exists) in Dispose Ensure pointer is 16-byte aligned so that it is marked from CEO Var Patch up the IRecyclerVisitedObject interface to not take Recycler, but instead encapsulate that inside FinalizableObject Remove assert that prevented interior marking when in parallel/concurrent and recycler state not set (recycler state currently only used by MemGC) Modify assert in CBase::Passivate that expects Var to be nullptr by opting out GC traced objects
I did a dumb and forgot to make the bytecode regen its own commit. I can fix that in a bit.