Skip to content

Conversation

@jackhorton
Copy link
Owner

I did a dumb and forgot to make the bytecode regen its own commit. I can fix that in a bit.

Copy link

@dilijev dilijev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments on individual commits. Looks like you're on the right track!

@jackhorton jackhorton closed this Sep 27, 2017
const auto locale = _u("en-US");
const size_t len = 5;
if (lpLocaleName)
UErrorCode error = U_ZERO_ERROR;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use UErrorCode::U_ZERO_ERROR to be explicit, and consistent with other places (I've made this consistent in my PR as well).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that be using a C++ enum class? Are we still going forward with adding more C++ until we resolve the system ICU issue?

Copy link

@dilijev dilijev Sep 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't require an enum class but does require a C++ compiler (which we do). In C++ all enum values are namespace-scoped to their enclosing enum.

Side note: I'm not sure we can resolve the system ICU issue (might not be possible and we just have to build it in any case) but for now we rely on the C++ APIs.

In summary IMO we should still be consistent and use the scope.

int32_t written = uloc_toLanguageTag(uloc_getDefault(), bcp47, ULOC_FULLNAME_CAPACITY, true, &error);

// REVIEW (jahorto): if getting the language tag for NULL (default) fails, is there any backup?
if (!U_SUCCESS(error) || written <= 0)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use U_FAILURE

locale: resolved,
subTags: subTags,
localeWithoutSubtags: resolved
localeWithoutSubtags: withoutSubTags
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I moved the resolved logic out into its own if/else block, resolved here includes the subtags. withoutSubTags is just whatever resolved was before the if block runs.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Thanks.

// making these vars in JS allows us to more change how we
// determine the backing library
let isICU = !platform.winglob;
let isWinGlob = platform.winglob;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (Don't change anything unless you've already got another change that will cause you to regenerate bytecode anyway)

Maybe be more explicit with these names since they will be unscoped in Intl.js (with anything like platform.isICU). Maybe platform.winglob was unfortunate naming for a boolean as well, but don't update that.

I'd suggest isPlatformUsingICU or something.

locale = platform.getDefaultLocale();
const match = platform.builtInRegexMatch(locale, /-u(-[^\-][^\-]?-[^\-]+)*(-co-[^\-]+).*/);
if (match) {
collation = callInstanceFunc(StringInstanceReplace, match[2], "-co-", "");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than doing another string operation, write the regex so that it captures the part you want, like this: (-co-([^\-]+)).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what if you got a match but there was no matching group where you expected? -co- is forced to match, as well as the value, and we're assuming a well-formed locale from a system API, but you never know. Food for thought.

If you end up finding that there might a problem here: maybe add an extra check that the index is on the object before using the returned value as if it were a string (it might be undefined) if that group wasn't matched.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, if match exists, match[2] must exist, since its not an optional capture group. It must be equal to the collation value, because it seems like the first capture group matches only the last instance of itself in the resulting match array (I changed the regex slightly to only capture the actual collation value):

> "en-US-u-co-thai-ca-calendar-so-methingelse-co-phonebk".match(/-u(-[^\-][^\-]?-[^\-]+)*-co-([^\-]+).*/)
[ '-u-nu-thai-ca-calendar-so-methingelse-co-phonebk',
  '-so-methingelse',
  'phonebk',
  index: 5,
  input: 'en-US-u-co-thai-ca-calendar-so-methingelse-co-phonebk' ]
> "en-US-u-nu-thai-co-phonebk-ca-calendar".match(/-u(-[^\-][^\-]?-[^\-]+)*-co-([^\-]+).*/)
[ '-u-nu-thai-co-phonebk-ca-calendar',
  '-nu-thai',
  'phonebk',
  index: 5,
  input: 'en-US-u-nu-thai-co-phonebk-ca-calendar' ]

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to remove the -co-phonebk from that string after extracting it? In that case you'll also need to optionally capture any extensions before and after.

Keep in mind that there are other singletons followed by keys like the private use -x i.e. -x-aa-aavalue. The likelihood you'd be dealing with that in this scenario is pretty minimal.

Copy link
Owner Author

@jackhorton jackhorton Sep 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My new version extracts "phonebk" and then will do locale = locale.replace(`-co-${match[2]}\`, ""). We may end up with lang-tag-u at the end if it was the only Unicode extension, but I believe we handle that later?

}

if (collation === undefined || availableBcpTags === undefined) { return locale; }
if (collation === undefined) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does removing availableBcpTags === undefined apply in general to ICU and WinGlob?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved this check below, availableBcpTags became collationMapForLocale because I thought it was a bit clearer.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying!

return locale + "-u-co-" + bcpTag;
const collationMapForLocale = reverseLocaleAcceptingCollationValues[locale];
if (collationMapForLocale === undefined) {
// Assume the system wouldnt give us back a bad collation value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: apostophe in wouldn't

jackhorton pushed a commit that referenced this pull request Oct 16, 2017
…p for tracing and finalizing

Create macros that allow for CBase and other types of objects to use the templated ThreadAlloc's for allocation.
Use the macro for WebGLActiveInfo.
Convert AddRef/Release to RootAddRef/RootRelease calls
Virtual all non-public ref counting methods on CBase and abandon if they are called
neuter JSBind_* methods related to refcounting
Implement a Trace that simply traces the Var (which we actually know will already be marked due to how this object is reachable)
Add a basic leak detector to the new type (interim, more as a sanity check)
Modify callers creating this type of object to just new into a raw pointer on the stack.

Call Passivate (if exists) in Dispose
Ensure pointer is 16-byte aligned so that it is marked from CEO Var

Patch up the IRecyclerVisitedObject interface to not take Recycler, but instead encapsulate that inside FinalizableObject
Remove assert that prevented interior marking when in parallel/concurrent and recycler state not set (recycler state currently only used by MemGC)

Modify assert in CBase::Passivate that expects Var to be nullptr by opting out GC traced objects
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants