Update .NET 7 Unicode data to version 14.0.0 #44423
@GrabYourPitchforks just checking in: are you planning to do this soon?
Moving this to 7.0 so that the dates line up correctly.
Now that we're within a month of Unicode 14.0's release, I gave https://unicode.org/versions/Unicode14.0.0/ another look. There's a new block, Arabic Extended-B, being added to the BMP. Our ingestion tools will automatically create a new API to support this block, so I opened #57609 to track the API review process for it. We're still waiting for the PDFs to be published in case there were any changes to Sec. 5.8 (which controls
@GrabYourPitchforks what remains to be done here?
The Unicode Standard version 14.0.0 is tentatively scheduled for release in September 2021. As usual, since the .NET runtime carries a copy of Unicode-derived data, we should update our data files to match version 14.0.0 when it's released.
This will affect the following APIs:
System.Globalization.StringInfo
System.Globalization.CharUnicodeInfo
System.Text.Encodings.Web.*
System.Text.Json.* (since it depends on System.Text.Encodings.Web)
For instructions on how to update the runtime-carried Unicode data files, consult the GenUnicodeProp docs and the STEW docs. Also update the UnicodeUcdVersion data throughout our .csproj files (see samples).
See #2378 for the changes we made for Unicode 13.0.0 in .NET 5.
We should also keep an eye out for any changes to UAX#29 that might be part of the Unicode 14.0.0 wave. Our tools will automatically pick up any changes to a code point's Grapheme_Cluster_Break property, but if the algorithm in Sec. 3.1.1 changes as part of Unicode 14.0.0 then we may need to update the logic in TextSegmentationUtility.cs.
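One quick way to tell which Unicode version a carried data set reflects is to probe a code point that first appears in that version. The sketch below is illustrative only: it uses Python's standard unicodedata module as a stand-in and is not part of the .NET ingestion tooling. The probe U+0870 is assumed to be a suitable candidate because it falls in the Arabic Extended-B block (U+0870..U+089F) mentioned in the comments, and the helper names is_assigned and data_version are hypothetical.

```python
import unicodedata

# Assumption: U+0870 lies in the Arabic Extended-B block (U+0870..U+089F),
# which is new in Unicode 14.0, so it is unassigned in older data sets.
PROBE_CODE_POINT = 0x0870


def is_assigned(code_point: int) -> bool:
    """Return True if this interpreter's Unicode data assigns the code point."""
    # General category "Cn" means "unassigned" in the Unicode Character Database.
    return unicodedata.category(chr(code_point)) != "Cn"


def data_version() -> tuple[int, ...]:
    """Parse unicodedata.unidata_version (a string such as "14.0.0") into ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("U+0870 assigned:", is_assigned(PROBE_CODE_POINT))
```

On an interpreter whose data predates 14.0.0 the probe should report the code point as unassigned. A comparable check against the .NET data could go through System.Globalization.CharUnicodeInfo, one of the APIs listed above.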