ヨリ #1027

Draft: wants to merge 9 commits into base: main

5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/DerivedAge.txt
@@ -1,5 +1,5 @@
# DerivedAge-17.0.0.txt

Check warning on line 1 in unicodetools/data/ucd/dev/DerivedAge.txt

GitHub Actions / Draft unless approved

Not in the 17.0 pipeline

While the Unicode Technical Committee has provisionally assigned these characters, they have not been accepted for Unicode 17.0, nor for any specific version of Unicode. The Age property values for new characters are likely incorrect right now. They will be recomputed after the UTC accepts their encoding and this pull request is updated for the target version.
# Date: 2025-01-27, 18:09:08 GMT
# Date: 2025-01-30, 20:54:04 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2095,6 +2095,7 @@
187F8..187FF ; 17.0 # [8] TANGUT IDEOGRAPH-187F8..TANGUT IDEOGRAPH-187FF
18D09..18D1E ; 17.0 # [22] TANGUT IDEOGRAPH-18D09..TANGUT IDEOGRAPH-18D1E
18D80..18DF2 ; 17.0 # [115] TANGUT COMPONENT-769..TANGUT COMPONENT-883
1B126 ; 17.0 # KATAKANA DIGRAPH YORI
1CCFA..1CCFC ; 17.0 # [3] SNAKE SYMBOL..NOSE SYMBOL
1CEBA..1CED0 ; 17.0 # [23] FRAGILE SYMBOL..LEUKOTHEA
1CEE0..1CEF0 ; 17.0 # [17] GEOMANTIC FIGURE POPULUS..MEDIUM SMALL WHITE CIRCLE WITH HORIZONTAL BAR
@@ -2116,6 +2117,6 @@
2B73A..2B73E ; 17.0 # [5] CJK UNIFIED IDEOGRAPH-2B73A..CJK UNIFIED IDEOGRAPH-2B73E
323B0..33479 ; 17.0 # [4298] CJK UNIFIED IDEOGRAPH-323B0..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 4836
# Total code points: 4837

# EOF
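
The bracketed counts and the "Total code points" footer in this hunk can be cross-checked by summing the sizes of the listed ranges. The following is a minimal sketch, not part of unicodetools: `count_code_points` and `LINE_RE` are hypothetical names, and the sample contains only the four lines visible in the hunk, so it shows why adding U+1B126 raises the total by exactly one rather than reproducing the full 4837.

```python
import re

# Hypothetical helper (an assumption, not unicodetools code): match
# "XXXX ; value" or "XXXX..YYYY ; value" at the start of a UCD data line.
LINE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;")

def count_code_points(lines):
    """Sum the number of code points covered by UCD-style data lines."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # comment, blank line, or diff hunk header
        first = int(match.group(1), 16)
        last = int(match.group(2) or match.group(1), 16)
        total += last - first + 1
    return total

sample = [
    "187F8..187FF ; 17.0 # [8] TANGUT IDEOGRAPH-187F8..TANGUT IDEOGRAPH-187FF",
    "18D09..18D1E ; 17.0 # [22] TANGUT IDEOGRAPH-18D09..TANGUT IDEOGRAPH-18D1E",
    "18D80..18DF2 ; 17.0 # [115] TANGUT COMPONENT-769..TANGUT COMPONENT-883",
    "1B126 ; 17.0 # KATAKANA DIGRAPH YORI",
]

# The three Tangut ranges cover 8 + 22 + 115 code points; the new line adds one.
print(count_code_points(sample[:3]), count_code_points(sample))
```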
20 changes: 13 additions & 7 deletions unicodetools/data/ucd/dev/DerivedCoreProperties.txt
@@ -1,5 +1,5 @@
# DerivedCoreProperties-17.0.0.txt
# Date: 2025-01-27, 18:09:11 GMT
# Date: 2025-01-30, 20:54:25 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1346,6 +1346,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
1AFF5..1AFFB ; Alphabetic # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; Alphabetic # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B122 ; Alphabetic # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
1B126 ; Alphabetic # Lo KATAKANA DIGRAPH YORI
1B132 ; Alphabetic # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; Alphabetic # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; Alphabetic # Lo KATAKANA LETTER SMALL KO
@@ -1471,7 +1472,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
30000..3134A ; Alphabetic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; Alphabetic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 147441
# Total code points: 147442

# ================================================

@@ -6934,6 +6935,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
1AFF5..1AFFB ; ID_Start # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; ID_Start # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B122 ; ID_Start # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
1B126 ; ID_Start # Lo KATAKANA DIGRAPH YORI
1B132 ; ID_Start # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; ID_Start # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; ID_Start # Lo KATAKANA LETTER SMALL KO
@@ -7044,7 +7046,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
30000..3134A ; ID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; ID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 145935
# Total code points: 145936

# ================================================

@@ -8332,6 +8334,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
1AFF5..1AFFB ; ID_Continue # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; ID_Continue # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B122 ; ID_Continue # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
1B126 ; ID_Continue # Lo KATAKANA DIGRAPH YORI
1B132 ; ID_Continue # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; ID_Continue # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; ID_Continue # Lo KATAKANA LETTER SMALL KO
@@ -8484,7 +8487,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
31350..33479 ; ID_Continue # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 149273
# Total code points: 149274

# ================================================

@@ -9169,6 +9172,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
1AFF5..1AFFB ; XID_Start # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; XID_Start # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B122 ; XID_Start # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
1B126 ; XID_Start # Lo KATAKANA DIGRAPH YORI
1B132 ; XID_Start # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; XID_Start # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; XID_Start # Lo KATAKANA LETTER SMALL KO
@@ -9279,7 +9283,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
30000..3134A ; XID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; XID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 145912
# Total code points: 145913

# ================================================

@@ -10568,6 +10572,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
1AFF5..1AFFB ; XID_Continue # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; XID_Continue # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B122 ; XID_Continue # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
1B126 ; XID_Continue # Lo KATAKANA DIGRAPH YORI
1B132 ; XID_Continue # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; XID_Continue # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; XID_Continue # Lo KATAKANA LETTER SMALL KO
@@ -10720,7 +10725,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
31350..33479 ; XID_Continue # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 149254
# Total code points: 149255

# ================================================

@@ -12805,6 +12810,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
1AFF5..1AFFB ; Grapheme_Base # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; Grapheme_Base # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B122 ; Grapheme_Base # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
1B126 ; Grapheme_Base # Lo KATAKANA DIGRAPH YORI
1B132 ; Grapheme_Base # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; Grapheme_Base # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; Grapheme_Base # Lo KATAKANA LETTER SMALL KO
@@ -13016,7 +13022,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
30000..3134A ; Grapheme_Base # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 157523
# Total code points: 157524

# ================================================

23 changes: 15 additions & 8 deletions unicodetools/data/ucd/dev/DerivedNormalizationProps.txt
@@ -1,5 +1,5 @@
# DerivedNormalizationProps-17.0.0.txt
# Date: 2025-01-27, 18:09:14 GMT
# Date: 2025-01-30, 20:54:29 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1663,6 +1663,7 @@ FFED..FFEE ; NFKD_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
11938 ; NFKD_QC; N # Mc DIVES AKURU VOWEL SIGN O
16121..16128 ; NFKD_QC; N # Mn [8] GURUNG KHEMA VOWEL SIGN U..GURUNG KHEMA VOWEL SIGN AU
16D68..16D6A ; NFKD_QC; N # Lo [3] KIRAT RAI VOWEL SIGN AI..KIRAT RAI VOWEL SIGN AU
1B126 ; NFKD_QC; N # Lo KATAKANA DIGRAPH YORI
1CCD6..1CCEF ; NFKD_QC; N # So [26] OUTLINED LATIN CAPITAL LETTER A..OUTLINED LATIN CAPITAL LETTER Z
1CCF0..1CCF9 ; NFKD_QC; N # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
1D15E..1D164 ; NFKD_QC; N # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
@@ -1754,7 +1755,7 @@ FFED..FFEE ; NFKD_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
1FBF0..1FBF9 ; NFKD_QC; N # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE
2F800..2FA1D ; NFKD_QC; N # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D

# Total code points: 17086
# Total code points: 17087

# ================================================

@@ -2074,6 +2075,7 @@ FFED..FFEE ; NFKC_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
10781..10785 ; NFKC_QC; N # Lm [5] MODIFIER LETTER SUPERSCRIPT TRIANGULAR COLON..MODIFIER LETTER SMALL B WITH HOOK
10787..107B0 ; NFKC_QC; N # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK
107B2..107BA ; NFKC_QC; N # Lm [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL
1B126 ; NFKC_QC; N # Lo KATAKANA DIGRAPH YORI
1CCD6..1CCEF ; NFKC_QC; N # So [26] OUTLINED LATIN CAPITAL LETTER A..OUTLINED LATIN CAPITAL LETTER Z
1CCF0..1CCF9 ; NFKC_QC; N # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
1D15E..1D164 ; NFKC_QC; N # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
@@ -2165,7 +2167,7 @@ FFED..FFEE ; NFKC_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
1FBF0..1FBF9 ; NFKC_QC; N # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE
2F800..2FA1D ; NFKC_QC; N # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D

# Total code points: 4965
# Total code points: 4966

# ================================================

@@ -2827,6 +2829,7 @@ FFE3 ; Expands_On_NFKD # Sk FULLWIDTH MACRON
11938 ; Expands_On_NFKD # Mc DIVES AKURU VOWEL SIGN O
16121..16128 ; Expands_On_NFKD # Mn [8] GURUNG KHEMA VOWEL SIGN U..GURUNG KHEMA VOWEL SIGN AU
16D68..16D6A ; Expands_On_NFKD # Lo [3] KIRAT RAI VOWEL SIGN AI..KIRAT RAI VOWEL SIGN AU
1B126 ; Expands_On_NFKD # Lo KATAKANA DIGRAPH YORI
1D15E..1D164 ; Expands_On_NFKD # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
1D1BB..1D1C0 ; Expands_On_NFKD # So [6] MUSICAL SYMBOL MINIMA..MUSICAL SYMBOL FUSA BLACK
1F100..1F10A ; Expands_On_NFKD # No [11] DIGIT ZERO FULL STOP..DIGIT NINE COMMA
@@ -2839,7 +2842,7 @@ FFE3 ; Expands_On_NFKD # Sk FULLWIDTH MACRON
1F213 ; Expands_On_NFKD # So SQUARED KATAKANA DE
1F240..1F248 ; Expands_On_NFKD # So [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557

# Total code points: 13410
# Total code points: 13411

# ================================================

@@ -2966,6 +2969,7 @@ FE74 ; Expands_On_NFKC # Lo ARABIC KASRATAN ISOLATED FORM
FE76..FE7F ; Expands_On_NFKC # Lo [10] ARABIC FATHA ISOLATED FORM..ARABIC SUKUN MEDIAL FORM
FEF5..FEFC ; Expands_On_NFKC # Lo [8] ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM..ARABIC LIGATURE LAM WITH ALEF FINAL FORM
FFE3 ; Expands_On_NFKC # Sk FULLWIDTH MACRON
1B126 ; Expands_On_NFKC # Lo KATAKANA DIGRAPH YORI
1D15E..1D164 ; Expands_On_NFKC # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
1D1BB..1D1C0 ; Expands_On_NFKC # So [6] MUSICAL SYMBOL MINIMA..MUSICAL SYMBOL FUSA BLACK
1F100..1F10A ; Expands_On_NFKC # No [11] DIGIT ZERO FULL STOP..DIGIT NINE COMMA
@@ -2977,7 +2981,7 @@ FFE3 ; Expands_On_NFKC # Sk FULLWIDTH MACRON
1F200..1F201 ; Expands_On_NFKC # So [2] SQUARE HIRAGANA HOKA..SQUARED KATAKANA KOKO
1F240..1F248 ; Expands_On_NFKC # So [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557

# Total code points: 1237
# Total code points: 1238

# ================================================

@@ -7214,6 +7218,7 @@ FFF0..FFF8 ; NFKC_CF; # Cn [9] <reserved-FFF0>..<reserved-FF
16EB6 ; NFKC_CF; 16ED1 # L& BERIA ERFE CAPITAL LETTER UI
16EB7 ; NFKC_CF; 16ED2 # L& BERIA ERFE CAPITAL LETTER WASSE
16EB8 ; NFKC_CF; 16ED3 # L& BERIA ERFE CAPITAL LETTER AY
1B126 ; NFKC_CF; 30E8 30EA # Lo KATAKANA DIGRAPH YORI
1BCA0..1BCA3 ; NFKC_CF; # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
1CCD6 ; NFKC_CF; 0061 # So OUTLINED LATIN CAPITAL LETTER A
1CCD7 ; NFKC_CF; 0062 # So OUTLINED LATIN CAPITAL LETTER B
@@ -9178,7 +9183,7 @@ E0080..E00FF ; NFKC_CF; # Cn [128] <reserved-E0080>..<reserved-E
E0100..E01EF ; NFKC_CF; # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; NFKC_CF; # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>

# Total code points: 10583
# Total code points: 10584

# ================================================

@@ -13377,6 +13382,7 @@ FFF0..FFF8 ; NFKC_SCF; # Cn [9] <reserved-FFF0>..<reserved-F
16EB6 ; NFKC_SCF; 16ED1 # L& BERIA ERFE CAPITAL LETTER UI
16EB7 ; NFKC_SCF; 16ED2 # L& BERIA ERFE CAPITAL LETTER WASSE
16EB8 ; NFKC_SCF; 16ED3 # L& BERIA ERFE CAPITAL LETTER AY
1B126 ; NFKC_SCF; 30E8 30EA # Lo KATAKANA DIGRAPH YORI
1BCA0..1BCA3 ; NFKC_SCF; # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
1CCD6 ; NFKC_SCF; 0061 # So OUTLINED LATIN CAPITAL LETTER A
1CCD7 ; NFKC_SCF; 0062 # So OUTLINED LATIN CAPITAL LETTER B
@@ -15341,7 +15347,7 @@ E0080..E00FF ; NFKC_SCF; # Cn [128] <reserved-E0080>..<reserved-
E0100..E01EF ; NFKC_SCF; # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; NFKC_SCF; # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>

# Total code points: 10545
# Total code points: 10546

# ================================================

@@ -16262,6 +16268,7 @@ FFF0..FFF8 ; Changes_When_NFKC_Casefolded # Cn [9] <reserved-FFF0>..<reserv
118A0..118BF ; Changes_When_NFKC_Casefolded # L& [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO
16E40..16E5F ; Changes_When_NFKC_Casefolded # L& [32] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAPITAL LETTER Y
16EA0..16EB8 ; Changes_When_NFKC_Casefolded # L& [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY
1B126 ; Changes_When_NFKC_Casefolded # Lo KATAKANA DIGRAPH YORI
1BCA0..1BCA3 ; Changes_When_NFKC_Casefolded # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
1CCD6..1CCEF ; Changes_When_NFKC_Casefolded # So [26] OUTLINED LATIN CAPITAL LETTER A..OUTLINED LATIN CAPITAL LETTER Z
1CCF0..1CCF9 ; Changes_When_NFKC_Casefolded # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
@@ -16363,6 +16370,6 @@ E0080..E00FF ; Changes_When_NFKC_Casefolded # Cn [128] <reserved-E0080>..<reser
E0100..E01EF ; Changes_When_NFKC_Casefolded # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; Changes_When_NFKC_Casefolded # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>

# Total code points: 10583
# Total code points: 10584

# EOF
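
The NFKC_CF and NFKC_SCF lines added above map U+1B126 to the sequence 30E8 30EA. As a rough illustration of how such a line reads, the sketch below splits it into its fields and decodes the mapping. `parse_mapping_line` is a hypothetical helper written for this note, not a unicodetools function, and it assumes the three-field layout seen in these hunks.

```python
def parse_mapping_line(line):
    """Split 'CP[..CP] ; PROPERTY; MAPPING # comment' into its parts.

    Returns ((first, last), property_name, mapped_string). An empty
    mapping field (as on the SHORTHAND FORMAT lines) yields "".
    """
    data = line.split("#", 1)[0]  # drop the trailing comment
    fields = [field.strip() for field in data.split(";")]
    first, _, last = fields[0].partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    mapped = "".join(chr(int(cp, 16)) for cp in fields[2].split())
    return (start, end), fields[1], mapped

code_points, prop, mapped = parse_mapping_line(
    "1B126 ; NFKC_CF; 30E8 30EA # Lo KATAKANA DIGRAPH YORI"
)
print(f"U+{code_points[0]:04X} {prop} -> {mapped}")
```

The decoded mapping is the two-character string U+30E8 U+30EA, which matches the pull request title.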
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/EastAsianWidth.txt
@@ -1,5 +1,5 @@
# EastAsianWidth-17.0.0.txt
# Date: 2025-01-27, 18:09:15 GMT
# Date: 2025-01-30, 20:54:30 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2395,6 +2395,7 @@ FFFD ; A # So REPLACEMENT CHARACTER
1AFFD..1AFFE ; W # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B0FF ; W # Lo [256] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER RE-2
1B100..1B122 ; W # Lo [35] HENTAIGANA LETTER RE-3..KATAKANA LETTER ARCHAIC WU
1B126 ; W # Lo KATAKANA DIGRAPH YORI
1B132 ; W # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; W # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; W # Lo KATAKANA LETTER SMALL KO
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/LineBreak.txt
@@ -1,5 +1,5 @@
# LineBreak-17.0.0.txt
# Date: 2025-01-27, 18:09:16 GMT
# Date: 2025-01-30, 20:54:33 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -3306,6 +3306,7 @@ FFFD ; AI # So REPLACEMENT CHARACTER
1AFFD..1AFFE ; AL # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B0FF ; ID # Lo [256] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER RE-2
1B100..1B122 ; ID # Lo [35] HENTAIGANA LETTER RE-3..KATAKANA LETTER ARCHAIC WU
1B126 ; ID # Lo KATAKANA DIGRAPH YORI
1B132 ; CJ # Lo HIRAGANA LETTER SMALL KO
1B150..1B152 ; CJ # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B155 ; CJ # Lo KATAKANA LETTER SMALL KO
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/NormalizationTest.txt
@@ -1,5 +1,5 @@
# NormalizationTest-17.0.0.txt
# Date: 2025-01-27, 18:09:23 GMT
# Date: 2025-01-30, 20:54:40 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -15240,6 +15240,7 @@ FFEE;FFEE;FFEE;25CB;25CB; # (○; ○; ○; ○; ○; ) HALFWIDTH WHITE CIRCLE
16D68;16D68;16D67 16D67;16D68;16D67 16D67; # (𖵨; 𖵨; 𖵨; 𖵨; 𖵨; ) KIRAT RAI VOWEL SIGN AI
16D69;16D69;16D63 16D67;16D69;16D63 16D67; # (𖵩; 𖵩; 𖵩; 𖵩; 𖵩; ) KIRAT RAI VOWEL SIGN O
16D6A;16D6A;16D63 16D67 16D67;16D6A;16D63 16D67 16D67; # (𖵪; 𖵪; 𖵪; 𖵪; 𖵪; ) KIRAT RAI VOWEL SIGN AU
1B126;1B126;1B126;30E8 30EA;30E8 30EA; # (𛄦; 𛄦; 𛄦; ヨリ; ヨリ; ) KATAKANA DIGRAPH YORI
1CCD6;1CCD6;1CCD6;0041;0041; # (𜳖; 𜳖; 𜳖; A; A; ) OUTLINED LATIN CAPITAL LETTER A
1CCD7;1CCD7;1CCD7;0042;0042; # (𜳗; 𜳗; 𜳗; B; B; ) OUTLINED LATIN CAPITAL LETTER B
1CCD8;1CCD8;1CCD8;0043;0043; # (𜳘; 𜳘; 𜳘; C; C; ) OUTLINED LATIN CAPITAL LETTER C
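
Each NormalizationTest.txt line carries five columns: source, NFC, NFD, NFKC, NFKD. The sketch below parses a line and compares it against Python's `unicodedata` module. The helper names are hypothetical, and one caveat applies: the runtime's Unicode database will normally predate this draft, so it does not know U+1B126, and only the already-encoded HALFWIDTH WHITE CIRCLE line from the hunk context is actually verified.

```python
import unicodedata

def parse_test_line(line):
    """Return the five columns (source, NFC, NFD, NFKC, NFKD) as strings."""
    columns = line.split("#", 1)[0].split(";")[:5]
    return ["".join(chr(int(cp, 16)) for cp in col.split()) for col in columns]

def check_line(line):
    """True if the runtime's normalizer agrees with all four expected forms."""
    source, nfc, nfd, nfkc, nfkd = parse_test_line(line)
    return (
        unicodedata.normalize("NFC", source) == nfc
        and unicodedata.normalize("NFD", source) == nfd
        and unicodedata.normalize("NFKC", source) == nfkc
        and unicodedata.normalize("NFKD", source) == nfkd
    )

# Context line from the hunk: a character that shipped long ago.
halfwidth = "FFEE;FFEE;FFEE;25CB;25CB; # HALFWIDTH WHITE CIRCLE"
# New line from this pull request (comment shortened for the example).
yori = "1B126;1B126;1B126;30E8 30EA;30E8 30EA; # KATAKANA DIGRAPH YORI"

print(check_line(halfwidth))
print(parse_test_line(yori)[3])  # expected NFKC form per the test line
```

Running `check_line(yori)` is left out on purpose: its result depends on whether the interpreter's Unicode data already includes the new character.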

Unchanged files with check annotations

# https://www.unicode.org/reports/tr39/#Identifier_Status_and_Type
# “Unassigned characters, private use characters, surrogates, non-whitespace control characters.”
\p{Identifier_Type=Not_Character} = [\p{gc=Cn}\p{gc=Co}\p{gc=Cs}\p{gc=Cc}-\p{White_Space}]

Check notice on line 9 in unicodetools/src/main/resources/org/unicode/text/UCD/SecurityInvariantTest.txt

GitHub Actions / Check security data invariants

Invariant test failure

Expected empty, got: 4837
[\u088F\u09FF\u0B53\u0B54\u0C5C\u0CDC\u1ACF-\u1ADD\u1AE0-\u1AEB\u2B96\uA7CE\uA7CF\uA7D2\uA7D4\uA7F1\uFBC3-\uFBD2\uFD90\uFD91\uFDC8-\uFDCE\U00010940-\U0001095C\U00010EC5-\U00010EC7\U00010ED0-\U00010ED8\U00010EFA\U00010EFB\U00011B60-\U00011B67\U00011DB0-\U00011DDB\U00011DE0-\U00011DE9\U00016D80-\U00016D9D\U00016DA0-\U00016DA9\U00016EA0-\U00016EB8\U00016EBB-\U00016ED3\U00016FF2-\U00016FF6\U000187F8-\U000187FF\U00018D09-\U00018D1E\U00018D80-\U00018DF2\U0001B126\U0001CCFA-\U0001CCFC\U0001CEBA-\U0001CED0\U0001CEE0-\U0001CEF0\U0001E6C0-\U0001E6DE\U0001E6E0-\U0001E6F5\U0001E6FE\U0001E6FF\U0001F6D8\U0001F777-\U0001F77A\U0001F8D0-\U0001F8D8\U0001FA54-\U0001FA57\U0001FA8A\U0001FA8E\U0001FAC8\U0001FACD\U0001FADD\U0001FAEA\U0001FAEF\U0001FBFA\U0002B73A-\U0002B73E\U000323B0-\U00033479]
In \p{Identifier_Type=Not_Character}
But Not In [\p{gc=Cn}\p{gc=Co}\p{gc=Cs}\p{gc=Cc}-\p{White_Space}]
088F # ARABIC LETTER NOON WITH RING ABOVE
09FF # BENGALI LETTER SANSKRIT BA
0B53..0B54 # [2] ORIYA SIGN DOT ABOVE..ORIYA SIGN DOUBLE DOT ABOVE
0C5C # TELUGU ARCHAIC SHRII
0CDC # KANNADA ARCHAIC SHRII
1ACF..1ADD # [15] COMBINING DOUBLE CARON..COMBINING DOT-AND-RING BELOW
1AE0..1AEB # [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
2B96 # EQUALS SIGN WITH INFINITY ABOVE
A7CE..A7CF # [2] LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE..LATIN SMALL LETTER PHARYNGEAL VOICED FRICATIVE
A7D2 # LATIN CAPITAL LETTER DOUBLE THORN
A7D4 # LATIN CAPITAL LETTER DOUBLE WYNN
A7F1 # MODIFIER LETTER CAPITAL S
FBC3..FBD2 # [16] ARABIC LIGATURE JALLA WA-ALAA..ARABIC LIGATURE ALAYHI AR-RAHMAH
FD90..FD91 # [2] ARABIC LIGATURE RAHMATU ALLAAHI ALAYH..ARABIC LIGATURE RAHMATU ALLAAHI ALAYHAA
FDC8..FDCE # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIGATURE KARRAMA ALLAAHU WAJHAH
10940..1095C # [29] SIDETIC LETTER N01..SIDETIC LETTER N29
10EC5..10EC7 # [3] ARABIC SMALL YEH BARREE WITH TWO DOTS BELOW..ARABIC LETTER YEH WITH FOUR DOTS BELOW
10ED0..10ED8 # [9] ARABIC BIBLICAL END OF VERSE..ARABIC LIGATURE NAWWARA ALLAAHU MARQADAH
10EFA..10EFB # [2] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW NOON
11B60..11B67 # [8] SHARADA VOWEL SIGN OE..SHARADA VOWEL SIGN CANDRA O
11DB0..11DDB # [44] TOLONG SIKI LETTER I..TOLONG SIKI UNGGA
11DE0..11DE9 # [10] TOLONG SIKI DIGIT ZERO..TOLONG SIKI DIGIT NINE
16D80..16D9D # [30] CHISOI LETTER A..CHISOI SIGN SISO
16DA0..16DA9 # [10] CHISOI DIGIT ZERO..CHISOI DIGIT NINE
16EA0..16EB8 # [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY
16EBB..16ED3 # [25] BERIA ERFE SMALL LETTER ARKAB..BERIA ERFE SMALL LETTER AY
16FF2..16FF6 # [5] CHINESE SMALL SIMPLIFIED ER..YANGQIN SIGN SLOW TWO BEATS
187F8..187FF # [8] TANGUT IDEOGRAPH-187F8..TANGUT IDEOGRAPH-187FF
18D09..18D1E # [22] TANGUT IDEOGRAPH-18D09..TANGUT IDEOGRAPH-18D1E
18D80..18DF2 # [115] TANGUT COMPONENT-769..TANGUT COMPONENT-883
1B126 # KATAKANA DIGRAPH YORI
1CCFA..1CCFC # [3] SNAKE SYMBOL..NOSE SYMBOL
1CEBA..1CED0 # [23] FRAGILE SYMBOL..LEUKOTHEA
1CEE0..1CEF0 # [17] GEOMANTIC FIGURE POPULUS..MEDIUM SMALL WHITE CIRCLE WITH HORIZONTAL BAR
1E6C0..1E6DE # [31] TAI YO LETTER LOW KO..TAI YO LETTER HIGH KVO
1E6E0..1E6F5 # [22] TAI YO LETTER AA..TAI YO SIGN OM
1E6FE..1E6FF # [2] TAI YO SYMBOL MUEANG..TAI YO XAM LAI
1F6D8 # LANDSLIDE
1F777..1F77A # [4] VESTA FORM TWO..PARTHENOPE FORM TWO
1F8D0..1F8D8 # [9] LONG RIGHTWARDS ARROW OVER LONG LEFTWARDS ARROW..LONG LEFT RIGHT ARROW WITH DEPENDENT LOBE
1FA54..1FA57 # [4] WHITE CHESS FERZ..BLACK CHESS ALFIL
1FA8A # TROMBONE
1FA8E
# “Multiple values are not assigned to characters with strong restrictions:
# Not_Character, Deprecated, Default_Ignorable, Not_NFKC.”
# For example, Default_Ignorable is trumped by unassigned and Deprecated.
\p{Identifier_Type=Default_Ignorable} = [\p{Default_Ignorable_Code_Point}-\p{gc=Cn}-\p{Deprecated}]
\p{Identifier_Type=Not_NFKC} = [\p{NFKC_QC=No}-\p{Deprecated}-\p{Default_Ignorable_Code_Point}]

Check notice on line 19 in unicodetools/src/main/resources/org/unicode/text/UCD/SecurityInvariantTest.txt

GitHub Actions / Check security data invariants

Invariant test failure

Expected empty, got: 2
[\uA7F1\U0001B126]
In [\p{NFKC_QC=No}-\p{Deprecated}-\p{Default_Ignorable_Code_Point}]
But Not In \p{Identifier_Type=Not_NFKC}
A7F1 # MODIFIER LETTER CAPITAL S
1B126 # KATAKANA DIGRAPH YORI
Let $Strongly_Restricted := [\p{Identifier_Type=Not_Character}\p{Identifier_Type=Deprecated}\p{Identifier_Type=Default_Ignorable}\p{Identifier_Type=Not_NFKC}]
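
The Not_Character invariant quoted earlier in this file compares a property set with the set expression `[\p{gc=Cn}\p{gc=Co}\p{gc=Cs}\p{gc=Cc}-\p{White_Space}]` and reports anything "In A But Not In B". The sketch below approximates that comparison in Python. It is not the unicodetools invariant runner, all names are hypothetical, and it rests on two assumptions: general categories come from the interpreter's own (older) Unicode data, and White_Space is only subtracted for control characters, since U+0009..U+000D and U+0085 are the only White_Space code points in the four listed categories.

```python
import unicodedata

# Assumption: the White_Space code points with gc=Cc are U+0009..U+000D and U+0085.
CC_WHITE_SPACE = set(range(0x09, 0x0E)) | {0x85}

def in_expected_not_character(cp):
    """Approximate membership in [\\p{gc=Cn}\\p{gc=Co}\\p{gc=Cs}\\p{gc=Cc}-\\p{White_Space}]."""
    category = unicodedata.category(chr(cp))
    if category in ("Cn", "Co", "Cs"):
        return True
    return category == "Cc" and cp not in CC_WHITE_SPACE

def invariant_violations(claimed, predicate):
    """Code points in `claimed` for which `predicate` is false ("In A But Not In B")."""
    return sorted(cp for cp in claimed if not predicate(cp))

# Hypothetical input: a stale Not_Character set that still lists U+0041.
stale_not_character = {0x0000, 0x0041, 0xE000}
print([f"{cp:04X}" for cp in invariant_violations(stale_not_character, in_expected_not_character)])
```

A failure like the 4837-character report follows the same pattern: the newly added characters now have an assigned general category in the dev data, while the Identifier_Type data still lists them as Not_Character.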