@ilonatommy ilonatommy commented Oct 29, 2025

This PR:

  • Merges changes from the upstream repository.
  • Removes the UCARules exclusion from the filter files. This change affects only the non-hybrid ICU filter; the hybrid filters for mobile never excluded collation rules.
  • Refreshes the prebuilts.

Previous attempt: #340.
Runtime PR, related to the last revert of the upgrade: dotnet/runtime#93756.

Upgrade size tracking

Before

-rw-rw-rw-  1 vscode root  956416 Oct 29 12:10 icudt_CJK.dat
-rw-rw-rw-  1 vscode root 1526128 Oct 29 12:10 icudt.dat
-rw-rw-rw-  1 vscode root  550832 Oct 29 12:10 icudt_EFIGS.dat
-rw-rw-rw-  1 vscode root 1107168 Oct 29 12:10 icudt_no_CJK.dat

-rw-rw-rw-  1 vscode vscode  248840 Oct 29 12:10 icudt_CJK.dat.br
-rw-rw-rw-  1 vscode vscode  329770 Oct 29 12:10 icudt.dat.br
-rw-rw-rw-  1 vscode vscode  143983 Oct 29 12:10 icudt_EFIGS.dat.br
-rw-rw-rw-  1 vscode vscode  222250 Oct 29 12:10 icudt_no_CJK.dat.br

After

-rw-rw-rw-  1 vscode vscode 1245712 Oct 29 16:02 icudt_CJK.dat
-rw-rw-rw-  1 vscode vscode 1856624 Oct 29 15:59 icudt.dat
-rw-rw-rw-  1 vscode vscode  831504 Oct 29 16:01 icudt_EFIGS.dat
-rw-rw-rw-  1 vscode vscode 1431696 Oct 29 16:00 icudt_no_CJK.dat

-rw-rw-rw-  1 vscode vscode  301244 Oct 29 16:02 icudt_CJK.dat.br
-rw-rw-rw-  1 vscode vscode  390211 Oct 29 15:59 icudt.dat.br
-rw-rw-rw-  1 vscode vscode  195542 Oct 29 16:01 icudt_EFIGS.dat.br
-rw-rw-rw-  1 vscode vscode  279858 Oct 29 16:00 icudt_no_CJK.dat.br
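
To make the growth easier to read, here is a minimal Python sketch that computes the per-file delta from the listings above. The byte counts are copied from the "Before" and "After" sections; the names `BEFORE`, `AFTER`, and `size_growth` are made up for this illustration and are not part of the repository.

```python
# Sizes in bytes, copied from the "Before" and "After" listings above.
BEFORE = {
    "icudt_CJK.dat": 956416,
    "icudt.dat": 1526128,
    "icudt_EFIGS.dat": 550832,
    "icudt_no_CJK.dat": 1107168,
    "icudt_CJK.dat.br": 248840,
    "icudt.dat.br": 329770,
    "icudt_EFIGS.dat.br": 143983,
    "icudt_no_CJK.dat.br": 222250,
}
AFTER = {
    "icudt_CJK.dat": 1245712,
    "icudt.dat": 1856624,
    "icudt_EFIGS.dat": 831504,
    "icudt_no_CJK.dat": 1431696,
    "icudt_CJK.dat.br": 301244,
    "icudt.dat.br": 390211,
    "icudt_EFIGS.dat.br": 195542,
    "icudt_no_CJK.dat.br": 279858,
}


def size_growth(before, after):
    """Return {name: (delta_bytes, percent_growth)} for files present in both maps."""
    return {
        name: (after[name] - size, 100.0 * (after[name] - size) / size)
        for name, size in before.items()
        if name in after
    }


if __name__ == "__main__":
    for name, (delta, pct) in sorted(size_growth(BEFORE, AFTER).items()):
        print(f"{name:22} +{delta:>7} bytes ({pct:5.1f}%)")
```

By this calculation each uncompressed file grows by roughly 280,000 to 331,000 bytes, and each Brotli-compressed file by roughly 52,000 to 60,000 bytes.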

erik0686 and others added 30 commits October 6, 2021 15:06
…7 tag (#112)

* port library to work with ICU

clean and update

fix leaks

remove caching and unneeded code

handle memory allocation error

change to uprv malloc

change method usage

change UChar to char

use CharStrings and remove unused apis

address feedback

address feedback

fix mistakes

fix mistakes

fix mistakes

fix mistakes

fix mistakes

debugging

return calendar that was failing

address feedback and add calendar

remove unneeded method

create RAII objects for wchar t

add tests for library

test only on windows

address feedback

fix casting failures

remove two check pattern and fix tests

fix tests

fix ci builds

fix ci builds

address feedback

address feedback

fix quotes I missed

use char instead of strings

add test file to objects

fix typo sigh...

add fork for previous behaviour and add test

fix name of test

Add variable to uconfig

improve test case

* address feedback

* address feedback again

* address feedback pt 3

* fix macros

* refactor macro

* address minor nits

* fix brackets
)

I'm adding a patch file to be able to keep the changes done in #112. That change added the uprefs preference override library, which allows you to get the default locale as a BCP 47 tag.
data.

This change modifies the ICU tests to pass with the modified CLDR-MS
data. The modifications are generally for the following reasons:
 - Some of the data is different due to internal requirements.
 - Some are due to the extra locales that we pick up from CLDR Seed.
   These Seed locales have data quality issues, which causes the
   ICU tests to fail as they don't meet the ICU expectations.
 - Removal of the yue-* locales.
This adds special marker comments to the ICU headers in order to omit
things we don't want or can't expose in the Windows public SDK headers.
For the Windows OS build of ICU, we only have one data file
and we don't use the extended data at all. We make this function a no-op
in order to save a few cycles for perf, but more importantly so that
we don't try to load a versioned data file (ex: icudt68l.dat) after
already loading the non-versioned common data file.
Modify `make dist` paths for the MS-ICU GitHub path location, plus omit
the Doxygen docs as we don't currently build the docs.
 011-MSFT-Patch_change-tests-data-to-not-include-blocked-regions.patch
…he UCRT

This modifies the .vcxproj files for the "common" (uc) and "i18n" (in) libraries in
order to statically link the VCRuntime (libvcruntime.lib), VCStartup (libcmt.lib),
and STL/MSVCP (libcpmt.lib), but dynamically link to the UCRT (ucrt.lib).

The UCRT (C Runtime library: ucrtbase.dll) is included in Windows 10 and above as an OS
component. (See https://docs.microsoft.com/cpp/c-runtime-library/crt-library-features ).
On previous versions of Windows it is available via a Windows Update package.
(See the KB article here: https://support.microsoft.com/kb/2999226 ).

However, the vcruntime*.dll and msvcp*.dll files are only available via the
Visual C++ Redistributable (VC Redist), and the Redist must be installed manually.

This change avoids the need to install the VC Redist on Windows 10, and
previous Windows versions with the KB update for the UCRT, in order to use
the two icuuc*.dll and icuin*.dll libraries.
Modify the .vcxproj files to add the ICU major version number to the PDB
filename in order to match the DLL filenames.
Windows only searches for .exe executables by default, so we can't just
call "ant" and instead need to launch a cmd prompt to run it.

The """ changes are needed in order to double-quote the arguments
that get passed to the JVM. Without this hack the arguments don't get
quoted at all, which causes build failures as some of the variables
which get replaced have embedded spaces in them.
This change modifies the ICU tests to pass with extra locales from
CLDR-MS.
 BCP47 tag (#112)

 Currently, for many processes and tasks, ICU gets the default locale and caches it. This means that ICU will get something like "en-US", and that value will not change even if you change the language or region on your device.
Furthermore, ICU currently has no way of getting other globalization settings such as currency, calendar, hour cycle, first day of week, sorting method, and measurement system.
We have decided to add a way to solve these two problems.
By adding the uprefs library (only to the Windows implementation of uprv_getDefaultLocaleID()), we are adding the Uprefs_getBCP47Tag() internal API, which obtains a full, canonical and valid BCP47Tag containing all of the settings.

This means we also change the way we get the default locale. We go from getting only the locale and region, to getting the full thing.
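
As an illustration of what a "full" tag can look like, here is a minimal Python sketch that appends a Unicode `-u-` extension to a language-region tag. This is not the uprefs implementation, and `build_bcp47_tag` is a hypothetical helper. It assumes the settings listed above map to the standard Unicode extension keys `ca` (calendar), `co` (collation), `cu` (currency), `fw` (first day of week), `hc` (hour cycle), and `ms` (measurement system).

```python
def build_bcp47_tag(language_region, settings):
    """Append a Unicode "-u-" extension built from key/value settings.

    Hypothetical helper for illustration only. Keywords are emitted in
    alphabetical key order, which is the canonical order for Unicode
    locale extension keywords.
    """
    if not settings:
        return language_region
    keywords = "-".join(f"{key}-{value}" for key, value in sorted(settings.items()))
    return f"{language_region}-u-{keywords}"


if __name__ == "__main__":
    # Example values are assumptions, not output captured from uprefs.
    print(build_bcp47_tag("en-US", {"hc": "h12", "ca": "gregory", "cu": "usd"}))
```

With the example values, the helper yields `en-US-u-ca-gregory-cu-usd-hc-h12`, compared with the bare `en-US` that the previous behaviour would cache.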
* Fix CI check for version number match

* Normalize the ICU version for all scripts to use

* Move version normalization before syncing ICU version with other scripts

* Moving sanity check on ICU version number before normalization
rp9-next and others added 17 commits July 19, 2023 12:17
Bumps [guava](https://github.com/google/guava) from 30.0-jre to 32.0.0-jre.
- [Release notes](https://github.com/google/guava/releases)
- [Commits](https://github.com/google/guava/commits)

---
updated-dependencies:
- dependency-name: com.google.guava:guava
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…ales (#134)

* Replace NNBSP with regular space for date-time formats on English locales and fix test cases
* Disable the dynamic plug-in loading and update to v72.1.0.3

* Update changelog.md

Co-authored-by: Jeff Genovy <29107334+jefgen@users.noreply.github.com>

* Update changelog.md

Co-authored-by: Jeff Genovy <29107334+jefgen@users.noreply.github.com>

---------

Co-authored-by: Jeff Genovy <29107334+jefgen@users.noreply.github.com>
* Fixed CodeQL warnings: Too few arguments for formatting function

* fix CodeQL Warnings: Comparison of narrow type with wide type in loop condition

* Fix CodeQL Warnings: Important Severity, mixed issues

* Addressed review comments

* Addressed review comments on C Style casting

* Addressed Review Comments

* Trigger CLA
* Update ms-icu to support Unicode 15.1 character data

* Update changelog and version to 72.1.0.4
@ilonatommy ilonatommy self-assigned this Oct 29, 2025
@ilonatommy ilonatommy marked this pull request as draft October 29, 2025 17:01
@ilonatommy
Member Author

Testing with a sample app revealed an issue: the ICU call ulocdata_getCLDRVersion failed with error #2 'U_MISSING_RESOURCE_ERROR'. I'm checking that.

@ilonatommy
Member Author

> Testing with a sample app revealed an issue: the ICU call ulocdata_getCLDRVersion failed with error #2 'U_MISSING_RESOURCE_ERROR'. I'm checking that.

It was caused by my testing methodology: I used <BlazorIcuDataFileName>icudt_72_no_CJK.dat</BlazorIcuDataFileName> without updating the libs files (they were still on ICU 68).

After updating the whole runtime pack contents with the newly created artifacts, we still see a problem: the dotnet.native.wasm binary was compiled with ICU 68 symbols hardcoded (icudt68l-*).

I tried to repackage the ICU 72 data files with the p68-prefix to match the hardcoded symbols, but that is not enough: the ICU 72 data layout differs from the ICU 68 one, and we hit an unknown data format 45.6d.6f.6a ("Emoj"). This is because ICU 72 improved emoji support by introducing a new data format.
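
For reference, the four bytes in that error are plain ASCII. A minimal Python sketch of the decoding follows; `decode_data_format` is a hypothetical helper written for this comment, not an ICU API.

```python
def decode_data_format(dotted_hex):
    """Turn a dotted hex string such as "45.6d.6f.6a" into ASCII text."""
    return bytes(int(part, 16) for part in dotted_hex.split(".")).decode("ascii")


if __name__ == "__main__":
    print(decode_data_format("45.6d.6f.6a"))  # prints "Emoj"
```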

@akoeplinger, we have to merge this before it can be fully tested.

@ilonatommy ilonatommy marked this pull request as ready for review October 30, 2025 08:56
Member

@akoeplinger akoeplinger left a comment


LGTM apart from one comment, great job, thank you!

We should also get rid of icudt_hybrid.dat (it's no longer used), but that can be done in a separate PR.

@ilonatommy ilonatommy enabled auto-merge October 30, 2025 10:26
@ilonatommy ilonatommy merged commit 060989b into dotnet/main Oct 30, 2025
7 checks passed