Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash if ICU default locale has BCP47 extensions. Fix ures_openDirect crash with NULL locale. #110

Merged
merged 2 commits into from
Aug 11, 2021

Conversation

jefgen
Copy link
Member

@jefgen jefgen commented Aug 10, 2021

Summary

This PR fixes two issues:

  • Fix crash if ICU's default locale has BCP47 Unicode Extensions.
  • Fix ures_openDirect crash with a NULL input locale ID.

It also adds a test case for ures_openDirect with a NULL input locale ID.

This change adds a new ICU-PATCH file for the changes, though hopefully this won't be needed and we can fix this issue upstream before ingesting a new version of ICU.

PR Checklist

  • I have verified that my change is specific to this fork and cannot be made upstream.
    • Note: I've filed upstream tickets for these two issues and will be making a PR for them later.
  • I am making a maintenance related change.
  • I am making a change that is related to usage internal to Microsoft.
  • I am making a change that is related to the Windows OS build of ICU.
  • CLA signed. If not, please see here to sign the CLA.

Detailed Description

ICU-21705 Calling ures_openDirect with NULL locale ID results in crash (undefined behavior)

Calling the public API ures_openDirect will crash if you call it with a NULL Locale ID -- even though this is permitted by the API docs.

The issue is that strcmp is called with the input locale ID (ie: null pointer), which is undefined behavior, and causes the crash.

This doesn't occur with ures_open, as that calls uloc_getBaseName() first, which gets the default locale via _canonicalize(), so that the call to strcmp has a locale ID (rather than a null pointer).

We need to check the input locale ID and adjust it accordingly if it is null (default locale) or a pointer to empty string (root locale).

ICU-21706 ICU4C test suite crashes if the default locale has any BCP47 Unicode extension tags on it (ex: "en-US-u-hc-12")

If ICU's default locale has any BCP47 Unicode extension tags on it (ex: "en-US-u-hc-12") then the ICU test suite will crash.

The problem is that the resbMutex is attempted to be locked twice, which leads to the crash/termination.

The issue occurs on the first call to ures_open(). This causes the ICU data file to be loaded. However, the default locale isn't yet cached in gDefaultLocale, so Locale::getDefault needs to query the host OS for the default locale.

If the default locale has any BCP47 Unicode extension tags, this causes _canonicalize to attempt to convert them to the legacy ICU style extensions by calling uloc_forLanguageTag. As part of this conversion, uloc_toLegacyKey will also attempt to load the ICU data file, in order to load the keyTypeData to map the extensions.

However, this is problematic, as it means that we're now trying to load the data file while trying to load the data file. In other words, it means that the resbMutex will attempted to be locked twice – leading to the crash.

However, we can avoid this by moving the query for the default locale to be outside of the mutex protected part of the code – which allows us to avoid the circular nature of the issue. We can query and store the ICU default locale, and then pass it to the findFirstExisting function inside the mutex protected part of the code, so it doesn’t need to query for the default locale itself.

jefgen added 2 commits August 10, 2021 13:58
…de Extensions, and fix ures_openDirect crash with NULL locale ID.

Add test case for ures_openDirect with NULL locale ID.
@jefgen jefgen requested review from erik0686, axelandrejs, daniel-ju and a team August 10, 2021 21:32
Copy link
Contributor

@erik0686 erik0686 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Jeff!

@jefgen jefgen merged commit 9bc7ac5 into master Aug 11, 2021
@jefgen jefgen deleted the user/jefgen/fix-mutex-issues-and-ures_openDirect branch August 11, 2021 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants