WIP: getDefaultLocale and getDefaultTimeZone #1

jackhorton · 2017-09-23T04:34:05Z

I did a dumb and forgot to make the bytecode regen its own commit. I can fix that in a bit.

dilijev

See comments on individual commits. Looks like you're on the right track!

dilijev · 2017-09-27T06:46:02Z

lib/Runtime/Base/WindowsGlobalizationAdapter.cpp

-        const auto locale = _u("en-US");
-        const size_t len = 5;
-        if (lpLocaleName)
+        UErrorCode error = U_ZERO_ERROR;


Use UErrorCode::U_ZERO_ERROR to be explicit, and consistent with other places (I've made this consistent in my PR as well).

Wouldn't that be using a C++ enum class? Are we still going forward with adding more C++ until we resolve the system ICU issue?

Doesn't require an enum class but does require a C++ compiler (which we do). In C++ all enum values are namespace-scoped to their enclosing enum.

Side note: I'm not sure we can resolve the system ICU issue (might not be possible and we just have to build it in any case) but for now we rely on the C++ APIs.

In summary IMO we should still be consistent and use the scope.

dilijev · 2017-09-27T06:46:22Z

lib/Runtime/Base/WindowsGlobalizationAdapter.cpp

+        int32_t written = uloc_toLanguageTag(uloc_getDefault(), bcp47, ULOC_FULLNAME_CAPACITY, true, &error);
+
+        // REVIEW (jahorto): if getting the language tag for NULL (default) fails, is there any backup?
+        if (!U_SUCCESS(error) || written <= 0)


Use U_FAILURE

dilijev · 2017-09-27T06:47:32Z

lib/Runtime/Library/InJavascript/Intl.js

+            locale: resolved,
            subTags: subTags,
-            localeWithoutSubtags: resolved
+            localeWithoutSubtags: withoutSubTags


Can you explain this change?

Since I moved the resolved logic out into its own if/else block, resolved here includes the subtags. withoutSubTags is just whatever resolved was before the if block runs.

Gotcha. Thanks.

dilijev · 2017-09-27T06:49:53Z

lib/Runtime/Library/InJavascript/Intl.js

+    // making these vars in JS allows us to more change how we
+    // determine the backing library
+    let isICU = !platform.winglob;
+    let isWinGlob = platform.winglob;


nitpick (Don't change anything unless you've already got another change that will cause you to regenerate bytecode anyway)

Maybe be more explicit with these names since they will be unscoped in Intl.js (with anything like platform.isICU). Maybe platform.winglob was unfortunate naming for a boolean as well, but don't update that.

I'd suggest isPlatformUsingICU or something.

dilijev · 2017-09-27T06:52:54Z

lib/Runtime/Library/InJavascript/Intl.js

+            locale = platform.getDefaultLocale();
+            const match = platform.builtInRegexMatch(locale, /-u(-[^\-][^\-]?-[^\-]+)*(-co-[^\-]+).*/);
+            if (match) {
+                collation = callInstanceFunc(StringInstanceReplace, match[2], "-co-", "");


Rather than doing another string operation, write the regex so that it captures the part you want, like this: (-co-([^\-]+)).

Also what if you got a match but there was no matching group where you expected? -co- is forced to match, as well as the value, and we're assuming a well-formed locale from a system API, but you never know. Food for thought.

If you end up finding that there might a problem here: maybe add an extra check that the index is on the object before using the returned value as if it were a string (it might be undefined) if that group wasn't matched.

so, if match exists, match[2] must exist, since its not an optional capture group. It must be equal to the collation value, because it seems like the first capture group matches only the last instance of itself in the resulting match array (I changed the regex slightly to only capture the actual collation value):

> "en-US-u-co-thai-ca-calendar-so-methingelse-co-phonebk".match(/-u(-[^\-][^\-]?-[^\-]+)*-co-([^\-]+).*/) [ '-u-nu-thai-ca-calendar-so-methingelse-co-phonebk', '-so-methingelse', 'phonebk', index: 5, input: 'en-US-u-co-thai-ca-calendar-so-methingelse-co-phonebk' ] > "en-US-u-nu-thai-co-phonebk-ca-calendar".match(/-u(-[^\-][^\-]?-[^\-]+)*-co-([^\-]+).*/) [ '-u-nu-thai-co-phonebk-ca-calendar', '-nu-thai', 'phonebk', index: 5, input: 'en-US-u-nu-thai-co-phonebk-ca-calendar' ]

Do you want to remove the -co-phonebk from that string after extracting it? In that case you'll also need to optionally capture any extensions before and after.

Keep in mind that there are other singletons followed by keys like the private use -x i.e. -x-aa-aavalue. The likelihood you'd be dealing with that in this scenario is pretty minimal.

My new version extracts "phonebk" and then will do locale = locale.replace(`-co-${match[2]}\`, ""). We may end up with lang-tag-u at the end if it was the only Unicode extension, but I believe we handle that later?

dilijev · 2017-09-27T06:56:23Z

lib/Runtime/Library/InJavascript/Intl.js

+        }

-        if (collation === undefined || availableBcpTags === undefined) { return locale; }
+        if (collation === undefined) {


Does removing availableBcpTags === undefined apply in general to ICU and WinGlob?

I just moved this check below, availableBcpTags became collationMapForLocale because I thought it was a bit clearer.

Thanks for clarifying!

dilijev · 2017-09-27T06:56:38Z

lib/Runtime/Library/InJavascript/Intl.js

-            return locale + "-u-co-" + bcpTag;
+        const collationMapForLocale = reverseLocaleAcceptingCollationValues[locale];
+        if (collationMapForLocale === undefined) {
+            // Assume the system wouldnt give us back a bad collation value


nit: apostophe in wouldn't

…p for tracing and finalizing Create macros that allow for CBase and other types of objects to use the templated ThreadAlloc's for allocation. Use the macro for WebGLActiveInfo. Convert AddRef/Release to RootAddRef/RootRelease calls Virtual all non-public ref counting methods on CBase and abandon if they are called neuter JSBind_* methods related to refcounting Implement a Trace that simply traces the Var (which we actually know will already be marked due to how this object is reachable) Add a basic leak detector to the new type (interim, more as a sanity check) Modify callers creating this type of object to just new into a raw pointer on the stack. Call Passivate (if exists) in Dispose Ensure pointer is 16-byte aligned so that it is marked from CEO Var Patch up the IRecyclerVisitedObject interface to not take Recycler, but instead encapsulate that inside FinalizableObject Remove assert that prevented interior marking when in parallel/concurrent and recycler state not set (recycler state currently only used by MemGC) Modify assert in CBase::Passivate that expects Var to be nullptr by opting out GC traced objects

Initial stab at ICU implementation of getDefaultLocale

74f68b7

dilijev reviewed Sep 23, 2017

View reviewed changes

JS implementation of getDefaultLocale

f28deb8

jackhorton force-pushed the icu-defaults branch from 85be649 to f28deb8 Compare September 25, 2017 16:12

jackhorton added 3 commits September 26, 2017 22:54

Full ICU defaultLocale support

f260ccf

Update bytecode

d9de6fc

Remove hard-coded default

46112c9

jackhorton closed this Sep 27, 2017

dilijev reviewed Sep 27, 2017

View reviewed changes

WIP: getDefaultLocale and getDefaultTimeZone #1

WIP: getDefaultLocale and getDefaultTimeZone #1

Uh oh!

Conversation

jackhorton commented Sep 23, 2017

Uh oh!

dilijev left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dilijev Sep 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jackhorton Sep 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

dilijev Sep 27, 2017 •

edited

Loading

jackhorton Sep 27, 2017 •

edited

Loading