Fix crash if ICU default locale has BCP47 extensions. Fix ures_openDirect crash with NULL locale. #110
Summary
This PR fixes two issues:
- A crash if the ICU default locale has BCP47 Unicode extension tags.
- A `ures_openDirect` crash with a NULL input locale ID.

It also adds a test case for `ures_openDirect` with a NULL input locale ID.

This change adds a new ICU-PATCH file for the changes, though hopefully this won't be needed and we can fix this issue upstream before ingesting a new version of ICU.
PR Checklist
Detailed Description
ICU-21705 Calling ures_openDirect with NULL locale ID results in crash (undefined behavior)
Calling the public API `ures_openDirect` with a `NULL` locale ID crashes, even though the API docs permit this.

The issue is that `strcmp` is called with the input locale ID (i.e., a null pointer), which is undefined behavior and causes the crash.

This doesn't occur with `ures_open`, as that calls `uloc_getBaseName()` first, which gets the default locale via `_canonicalize()`, so that the call to `strcmp` has a locale ID (rather than a null pointer).

We need to check the input locale ID and adjust it accordingly if it is null (default locale) or a pointer to an empty string (root locale).
ICU-21706 ICU4C test suite crashes if the default locale has any BCP47 Unicode extension tags on it (ex: "en-US-u-hc-12")
If ICU's default locale has any BCP47 Unicode extension tags on it (ex: "en-US-u-hc-12") then the ICU test suite will crash.
The problem is that the code attempts to lock `resbMutex` twice, which leads to the crash/termination.

The issue occurs on the first call to `ures_open()`. This causes the ICU data file to be loaded. However, the default locale isn't yet cached in `gDefaultLocale`, so `Locale::getDefault` needs to query the host OS for the default locale.

If the default locale has any BCP47 Unicode extension tags, `_canonicalize` attempts to convert them to the legacy ICU-style extensions by calling `uloc_forLanguageTag`. As part of this conversion, `uloc_toLegacyKey` will also attempt to load the ICU data file in order to load the `keyTypeData` used to map the extensions.

This is problematic, as it means that we're now trying to load the data file while already loading the data file. In other words, the code attempts to lock `resbMutex` a second time, which leads to the crash.

We can avoid this by moving the query for the default locale outside of the mutex-protected part of the code, which breaks the cycle. We can query and store the ICU default locale, and then pass it to the `findFirstExisting` function inside the mutex-protected part of the code, so it doesn't need to query for the default locale itself.
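The reordering can be sketched in isolation as below. This is an illustrative model under stated assumptions, not ICU's code: `std::mutex` stands in for ICU's own mutex type, `queryDefaultLocale` and `openBundle` are hypothetical names, the two-argument `findFirstExisting` signature is simplified for illustration, and the returned locale string is an example value.

```cpp
#include <mutex>
#include <string>

// Stand-in for ICU's resbMutex. Like std::mutex, it is assumed to be
// non-recursive: locking it twice on one thread is an error.
static std::mutex resbMutex;
static int lockedQueries = 0;  // counts how often the query took the lock

// Hypothetical: resolving the default locale may itself need the lock
// (in the PR's scenario, because extension mapping loads ICU data).
static std::string queryDefaultLocale() {
    std::lock_guard<std::mutex> lock(resbMutex);
    ++lockedQueries;
    return "en_US@hours=h12";  // illustrative legacy-style locale ID
}

// Runs with resbMutex already held by the caller, so it must not call
// queryDefaultLocale(); the default locale is passed in instead.
static std::string findFirstExisting(const std::string& requested,
                                     const std::string& defaultLocale) {
    return requested.empty() ? defaultLocale : requested;
}

static std::string openBundle(const std::string& requested) {
    // Resolve the default locale BEFORE taking the lock.
    const std::string defaultLocale = queryDefaultLocale();

    std::lock_guard<std::mutex> lock(resbMutex);
    return findFirstExisting(requested, defaultLocale);
}
```

Calling `queryDefaultLocale()` from inside `findFirstExisting` would try to lock `resbMutex` while it is already held by the same thread, which is the double-lock described above; hoisting the call out of the locked region removes that path.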