DEVANAGARI LETTER YET ANOTHER DA #1037

Draft: wants to merge 7 commits into base: main
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/DerivedAge.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedAge-17.0.0.txt

Check warning on line 1 in unicodetools/data/ucd/dev/DerivedAge.txt

GitHub Actions / Draft unless approved

Not in the 17.0 pipeline

While the Unicode Technical Committee has provisionally assigned these characters, they have not been accepted for Unicode 17.0, nor for any specific version of Unicode. The Age property values for new characters are likely incorrect right now. They will be recomputed after the UTC accepts their encoding and this pull request is updated for the target version.

# Date: 2025-01-27, 18:09:08 GMT
# Date: 2025-02-11, 18:10:55 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2084,6 +2084,7 @@
10EC5..10EC7 ; 17.0 # [3] ARABIC SMALL YEH BARREE WITH TWO DOTS BELOW..ARABIC LETTER YEH WITH FOUR DOTS BELOW
10ED0..10ED8 ; 17.0 # [9] ARABIC BIBLICAL END OF VERSE..ARABIC LIGATURE NAWWARA ALLAAHU MARQADAH
10EFA..10EFB ; 17.0 # [2] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW NOON
11B0A ; 17.0 # DEVANAGARI LETTER ALTERNATE DDDA
11B60..11B67 ; 17.0 # [8] SHARADA VOWEL SIGN OE..SHARADA VOWEL SIGN CANDRA O
11DB0..11DDB ; 17.0 # [44] TOLONG SIKI LETTER I..TOLONG SIKI UNGGA
11DE0..11DE9 ; 17.0 # [10] TOLONG SIKI DIGIT ZERO..TOLONG SIKI DIGIT NINE
@@ -2116,6 +2117,6 @@
2B73A..2B73E ; 17.0 # [5] CJK UNIFIED IDEOGRAPH-2B73A..CJK UNIFIED IDEOGRAPH-2B73E
323B0..33479 ; 17.0 # [4298] CJK UNIFIED IDEOGRAPH-323B0..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 4836
# Total code points: 4837

# EOF
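
The `Total code points` trailer moves from 4836 to 4837 because one single-code-point line (11B0A) joins the 17.0 section. The following is a minimal sketch of how such a trailer could be recomputed, assuming the `start..end ; value # comment` layout visible in the hunk above. The helper names `parse_line` and `count_code_points` are hypothetical and are not part of unicodetools; the sample holds only three of the 17.0 lines, so it does not reproduce the full 4837.

```python
import re

# Matches "XXXX ; value" or "XXXX..YYYY ; value"; the trailing "# comment" is ignored.
LINE_RE = re.compile(
    r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*([^#]+?)\s*(?:#.*)?$"
)

def parse_line(line):
    """Return (start, end, value) for a data line, or None for comments and blanks."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    start = int(m.group(1), 16)
    end = int(m.group(2), 16) if m.group(2) else start
    return start, end, m.group(3)

def count_code_points(lines, value):
    """Sum the sizes of all ranges whose property value equals `value`."""
    total = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[2] == value:
            total += parsed[1] - parsed[0] + 1
    return total

# Three lines copied from the hunk, plus a comment line that must be skipped.
sample = [
    "10EFA..10EFB  ; 17.0 #   [2] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW NOON",
    "11B0A         ; 17.0 #       DEVANAGARI LETTER ALTERNATE DDDA",
    "11B60..11B67  ; 17.0 #   [8] SHARADA VOWEL SIGN OE..SHARADA VOWEL SIGN CANDRA O",
    "# Total code points: 4837",
]
print(count_code_points(sample, "17.0"))
```

Running the same count over the complete 17.0 section before and after the change would be one way to confirm that the trailer differs by exactly one.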
23 changes: 15 additions & 8 deletions unicodetools/data/ucd/dev/DerivedCoreProperties.txt
@@ -1,5 +1,5 @@
# DerivedCoreProperties-17.0.0.txt
# Date: 2025-01-27, 18:09:11 GMT
# Date: 2025-02-11, 18:11:15 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1242,6 +1242,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
11A97 ; Alphabetic # Mc SOYOMBO SIGN VISARGA
11A9D ; Alphabetic # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; Alphabetic # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; Alphabetic # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; Alphabetic # Mn SHARADA VOWEL SIGN OE
11B61 ; Alphabetic # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; Alphabetic # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
@@ -1471,7 +1472,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
30000..3134A ; Alphabetic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; Alphabetic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 147441
# Total code points: 147442

# ================================================

@@ -6874,6 +6875,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
11A5C..11A89 ; ID_Start # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A9D ; ID_Start # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; ID_Start # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; ID_Start # Lo DEVANAGARI LETTER ALTERNATE DDDA
11BC0..11BE0 ; ID_Start # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
11C00..11C08 ; ID_Start # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C2E ; ID_Start # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
@@ -7044,7 +7046,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
30000..3134A ; ID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; ID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 145935
# Total code points: 145936

# ================================================

@@ -8206,6 +8208,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
11A98..11A99 ; ID_Continue # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
11A9D ; ID_Continue # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; ID_Continue # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; ID_Continue # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; ID_Continue # Mn SHARADA VOWEL SIGN OE
11B61 ; ID_Continue # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; ID_Continue # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
@@ -8484,7 +8487,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
31350..33479 ; ID_Continue # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 149273
# Total code points: 149274

# ================================================

@@ -9109,6 +9112,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
11A5C..11A89 ; XID_Start # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A9D ; XID_Start # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; XID_Start # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; XID_Start # Lo DEVANAGARI LETTER ALTERNATE DDDA
11BC0..11BE0 ; XID_Start # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
11C00..11C08 ; XID_Start # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C2E ; XID_Start # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
@@ -9279,7 +9283,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
30000..3134A ; XID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; XID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 145912
# Total code points: 145913

# ================================================

@@ -10442,6 +10446,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
11A98..11A99 ; XID_Continue # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
11A9D ; XID_Continue # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; XID_Continue # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; XID_Continue # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; XID_Continue # Mn SHARADA VOWEL SIGN OE
11B61 ; XID_Continue # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; XID_Continue # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
@@ -10720,7 +10725,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
31350..33479 ; XID_Continue # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 149254
# Total code points: 149255

# ================================================

@@ -12692,6 +12697,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
11A9E..11AA2 ; Grapheme_Base # Po [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
11AB0..11AF8 ; Grapheme_Base # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; Grapheme_Base # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; Grapheme_Base # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B61 ; Grapheme_Base # Mc SHARADA VOWEL SIGN OOE
11B65 ; Grapheme_Base # Mc SHARADA VOWEL SIGN SHORT O
11B67 ; Grapheme_Base # Mc SHARADA VOWEL SIGN CANDRA O
@@ -13016,7 +13022,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
30000..3134A ; Grapheme_Base # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 157523
# Total code points: 157524

# ================================================

@@ -13148,8 +13154,9 @@ ABED ; Grapheme_Link # Mn MEETEI MAYEK APUN IYEK
0C2A..0C39 ; InCB; Consonant # Lo [16] TELUGU LETTER PA..TELUGU LETTER HA
0C58..0C5A ; InCB; Consonant # Lo [3] TELUGU LETTER TSA..TELUGU LETTER RRRA
0D15..0D3A ; InCB; Consonant # Lo [38] MALAYALAM LETTER KA..MALAYALAM LETTER TTTA
11B0A ; InCB; Consonant # Lo DEVANAGARI LETTER ALTERNATE DDDA

# Total code points: 241
# Total code points: 242

# ================================================

3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/EastAsianWidth.txt
@@ -1,5 +1,5 @@
# EastAsianWidth-17.0.0.txt
# Date: 2025-01-27, 18:09:15 GMT
# Date: 2025-02-11, 18:11:20 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2240,6 +2240,7 @@ FFFD ; A # So REPLACEMENT CHARACTER
11AB0..11ABF ; N # Lo [16] CANADIAN SYLLABICS NATTILIK HI..CANADIAN SYLLABICS SPA
11AC0..11AF8 ; N # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; N # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; N # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; N # Mn SHARADA VOWEL SIGN OE
11B61 ; N # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; N # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/IndicSyllabicCategory.txt
@@ -1,5 +1,5 @@
# IndicSyllabicCategory-17.0.0.txt
# Date: 2025-01-27, 18:09:16 GMT
# Date: 2025-02-11, 18:11:22 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -972,6 +972,7 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
119AE..119D0 ; Consonant # Lo [35] NANDINAGARI LETTER KA..NANDINAGARI LETTER RRA
11A0B..11A32 ; Consonant # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
11A5C..11A83 ; Consonant # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
11B0A ; Consonant # Lo DEVANAGARI LETTER ALTERNATE DDDA
11C0E..11C2E ; Consonant # Lo [33] BHAIKSUKI LETTER KA..BHAIKSUKI LETTER HA
11C72..11C8F ; Consonant # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A
11D0C..11D30 ; Consonant # Lo [37] MASARAM GONDI LETTER KA..MASARAM GONDI LETTER TRA
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/LineBreak.txt
@@ -1,5 +1,5 @@
# LineBreak-17.0.0.txt
# Date: 2025-01-27, 18:09:16 GMT
# Date: 2025-02-11, 18:11:22 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -3121,6 +3121,7 @@ FFFD ; AI # So REPLACEMENT CHARACTER
11AB0..11ABF ; AL # Lo [16] CANADIAN SYLLABICS NATTILIK HI..CANADIAN SYLLABICS SPA
11AC0..11AF8 ; AL # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; BB # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; AL # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; CM # Mn SHARADA VOWEL SIGN OE
11B61 ; CM # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; CM # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/Scripts.txt
@@ -1,5 +1,5 @@
# Scripts-17.0.0.txt
# Date: 2025-01-27, 18:09:39 GMT
# Date: 2025-02-11, 18:11:50 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -987,8 +987,9 @@ A8FC ; Devanagari # Po DEVANAGARI SIGN SIDDHAM
A8FD..A8FE ; Devanagari # Lo [2] DEVANAGARI JAIN OM..DEVANAGARI LETTER AY
A8FF ; Devanagari # Mn DEVANAGARI VOWEL SIGN AY
11B00..11B09 ; Devanagari # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; Devanagari # Lo DEVANAGARI LETTER ALTERNATE DDDA

# Total code points: 164
# Total code points: 165

# ================================================

1 change: 1 addition & 0 deletions unicodetools/data/ucd/dev/UnicodeData.txt
@@ -21628,6 +21628,7 @@ FFFD;REPLACEMENT CHARACTER;So;0;ON;;;;;N;;;;;
11B07;DEVANAGARI SIGN WESTERN NINE-LIKE BHALE;Po;0;L;;;;;N;;;;;
11B08;DEVANAGARI SIGN REVERSED NINE-LIKE BHALE;Po;0;L;;;;;N;;;;;
11B09;DEVANAGARI SIGN MINDU;Po;0;L;;;;;N;;;;;
11B0A;DEVANAGARI LETTER ALTERNATE DDDA;Lo;0;L;;;;;N;;;;;
11B60;SHARADA VOWEL SIGN OE;Mn;0;NSM;;;;;N;;;;;
11B61;SHARADA VOWEL SIGN OOE;Mc;0;L;;;;;N;;;;;
11B62;SHARADA VOWEL SIGN UE;Mn;0;NSM;;;;;N;;;;;
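
The single added `UnicodeData.txt` record is the source for several of the derived lines elsewhere in this diff (`Lo` in DerivedGeneralCategory, `0` in DerivedCombiningClass, `L` in DerivedBidiClass). As a rough illustration, the sketch below splits the record into its 15 semicolon-separated fields. The field labels are informal names chosen for this example, and `parse_unicode_data` is a hypothetical helper, not a unicodetools function.

```python
# Informal labels for the 15 fields of a UnicodeData.txt record (assumed naming).
FIELDS = [
    "code", "name", "gc", "ccc", "bidi_class", "decomposition",
    "decimal", "digit", "numeric", "bidi_mirrored",
    "unicode_1_name", "iso_comment", "upper", "lower", "title",
]

def parse_unicode_data(line):
    """Split one UnicodeData.txt record into a dict keyed by field label."""
    parts = line.rstrip("\n").split(";")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))

record = parse_unicode_data("11B0A;DEVANAGARI LETTER ALTERNATE DDDA;Lo;0;L;;;;;N;;;;;")
print(record["gc"], record["ccc"], record["bidi_class"], record["bidi_mirrored"])
```

The empty fields mean the character has no decomposition, no numeric value, and no simple case mappings, which matches its treatment as an ordinary letter in the other files.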
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/VerticalOrientation.txt
@@ -1,5 +1,5 @@
# VerticalOrientation-17.0.0.txt
# Date: 2025-01-29
# Date: 2025-02-11, 18:11:53 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2072,6 +2072,7 @@ FFFC..FFFD ; U # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARA
11AB0..11ABF ; U # Lo [16] CANADIAN SYLLABICS NATTILIK HI..CANADIAN SYLLABICS SPA
11AC0..11AF8 ; R # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; R # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; R # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; R # Mn SHARADA VOWEL SIGN OE
11B61 ; R # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; R # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/auxiliary/SentenceBreakProperty.txt
@@ -1,5 +1,5 @@
# SentenceBreakProperty-17.0.0.txt
# Date: 2025-01-27, 18:09:39 GMT
# Date: 2025-02-11, 18:11:51 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2490,6 +2490,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
11A5C..11A89 ; OLetter # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A9D ; OLetter # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; OLetter # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; OLetter # Lo DEVANAGARI LETTER ALTERNATE DDDA
11BC0..11BE0 ; OLetter # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
11C00..11C08 ; OLetter # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C2E ; OLetter # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
@@ -2622,7 +2623,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
30000..3134A ; OLetter # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; OLetter # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 141520
# Total code points: 141521

# ================================================

5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/auxiliary/WordBreakProperty.txt
@@ -1,5 +1,5 @@
# WordBreakProperty-17.0.0.txt
# Date: 2025-01-27, 18:09:43 GMT
# Date: 2025-02-11, 18:11:53 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1233,6 +1233,7 @@ FFDA..FFDC ; ALetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
11A5C..11A89 ; ALetter # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A9D ; ALetter # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; ALetter # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; ALetter # Lo DEVANAGARI LETTER ALTERNATE DDDA
11BC0..11BE0 ; ALetter # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
11C00..11C08 ; ALetter # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C2E ; ALetter # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
@@ -1383,7 +1384,7 @@ FFDA..FFDC ; ALetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
1F150..1F169 ; ALetter # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
1F170..1F189 ; ALetter # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z

# Total code points: 34004
# Total code points: 34005

# ================================================

5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/extracted/DerivedBidiClass.txt
@@ -1,5 +1,5 @@
# DerivedBidiClass-17.0.0.txt
# Date: 2025-01-27, 18:09:10 GMT
# Date: 2025-02-11, 18:11:13 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1010,6 +1010,7 @@ FFDA..FFDC ; L # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER
11A9E..11AA2 ; L # Po [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
11AB0..11AF8 ; L # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; L # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; L # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B61 ; L # Mc SHARADA VOWEL SIGN OOE
11B65 ; L # Mc SHARADA VOWEL SIGN SHORT O
11B67 ; L # Mc SHARADA VOWEL SIGN CANDRA O
@@ -1234,7 +1235,7 @@ FFDA..FFDC ; L # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER
F0000..FFFFD ; L # Co [65534] <private-use-F0000>..<private-use-FFFFD>
100000..10FFFD; L # Co [65534] <private-use-100000>..<private-use-10FFFD>

# The above property value applies to 810584 code points not listed here.
# The above property value applies to 810583 code points not listed here.
# Total code points: 1095402

# ================================================
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/extracted/DerivedCombiningClass.txt
@@ -1,5 +1,5 @@
# DerivedCombiningClass-17.0.0.txt
# Date: 2025-01-27, 18:09:10 GMT
# Date: 2025-02-11, 18:11:14 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1724,6 +1724,7 @@ FFFC..FFFD ; 0 # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
11A9E..11AA2 ; 0 # Po [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
11AB0..11AF8 ; 0 # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; 0 # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; 0 # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; 0 # Mn SHARADA VOWEL SIGN OE
11B61 ; 0 # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; 0 # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
@@ -2095,7 +2096,7 @@ E0100..E01EF ; 0 # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
F0000..FFFFD ; 0 # Co [65534] <private-use-F0000>..<private-use-FFFFD>
100000..10FFFD; 0 # Co [65534] <private-use-100000>..<private-use-10FFFD>

# The above property value applies to 816745 code points not listed here.
# The above property value applies to 816744 code points not listed here.
# Total code points: 1113143

# ================================================
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/extracted/DerivedEastAsianWidth.txt
@@ -1,5 +1,5 @@
# DerivedEastAsianWidth-17.0.0.txt
# Date: 2025-01-27, 18:09:12 GMT
# Date: 2025-02-11, 18:11:17 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1760,6 +1760,7 @@ FFFC ; N # So OBJECT REPLACEMENT CHARACTER
11A9E..11AA2 ; N # Po [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
11AB0..11AF8 ; N # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B00..11B09 ; N # Po [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
11B0A ; N # Lo DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; N # Mn SHARADA VOWEL SIGN OE
11B61 ; N # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; N # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
@@ -2144,7 +2145,7 @@ FFFC ; N # So OBJECT REPLACEMENT CHARACTER
E0001 ; N # Cf LANGUAGE TAG
E0020..E007F ; N # Cf [96] TAG SPACE..CANCEL TAG

# The above property value applies to 760566 code points not listed here.
# The above property value applies to 760565 code points not listed here.
# Total code points: 792267

# ================================================
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedGeneralCategory-17.0.0.txt
# Date: 2025-01-27, 18:09:13 GMT
# Date: 2025-02-11, 18:11:17 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -516,7 +516,7 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
11A48..11A4F ; Cn # [8] <reserved-11A48>..<reserved-11A4F>
11AA3..11AAF ; Cn # [13] <reserved-11AA3>..<reserved-11AAF>
11AF9..11AFF ; Cn # [7] <reserved-11AF9>..<reserved-11AFF>
11B0A..11B5F ; Cn # [86] <reserved-11B0A>..<reserved-11B5F>
11B0B..11B5F ; Cn # [85] <reserved-11B0B>..<reserved-11B5F>
11B68..11BBF ; Cn # [88] <reserved-11B68>..<reserved-11BBF>
11BE2..11BEF ; Cn # [14] <reserved-11BE2>..<reserved-11BEF>
11BFA..11BFF ; Cn # [6] <reserved-11BFA>..<reserved-11BFF>
@@ -754,7 +754,7 @@ E01F0..EFFFF ; Cn # [65040] <reserved-E01F0>..<noncharacter-EFFFF>
FFFFE..FFFFF ; Cn # [2] <noncharacter-FFFFE>..<noncharacter-FFFFF>
10FFFE..10FFFF; Cn # [2] <noncharacter-10FFFE>..<noncharacter-10FFFF>

# Total code points: 814697
# Total code points: 814696

# ================================================
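
Assigning 11B0A removes it from the reserved (`Cn`) range, so that range shrinks from the front and the `Cn` total drops by one. A small sketch of that bookkeeping, using only the numbers shown in the hunks above (`range_size` is a hypothetical helper):

```python
def range_size(spec):
    """Number of code points in 'XXXX' or 'XXXX..YYYY' (hexadecimal, inclusive)."""
    start, _, end = spec.partition("..")
    first = int(start, 16)
    last = int(end, 16) if end else first
    return last - first + 1

old_reserved = range_size("11B0A..11B5F")  # reserved range before the change
new_reserved = range_size("11B0B..11B5F")  # reserved range after 11B0A is assigned
print(old_reserved, new_reserved)

# The Cn trailer should move by the same amount.
print(814697 - (old_reserved - new_reserved))
```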

@@ -2623,6 +2623,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
11A5C..11A89 ; Lo # [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A9D ; Lo # SOYOMBO MARK PLUTA
11AB0..11AF8 ; Lo # [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; Lo # DEVANAGARI LETTER ALTERNATE DDDA
11BC0..11BE0 ; Lo # [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
11C00..11C08 ; Lo # [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C2E ; Lo # [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
@@ -2738,7 +2739,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
30000..3134A ; Lo # [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..33479 ; Lo # [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 141081
# Total code points: 141082

# ================================================

9 changes: 5 additions & 4 deletions unicodetools/data/ucd/dev/extracted/DerivedLineBreak.txt
@@ -1,5 +1,5 @@
# DerivedLineBreak-17.0.0.txt
# Date: 2025-01-27, 18:09:13 GMT
# Date: 2025-02-11, 18:11:18 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -70,8 +70,8 @@ E000..F8FF ; XX # Co [6400] <private-use-E000>..<private-use-F8FF>
F0000..FFFFD ; XX # Co [65534] <private-use-F0000>..<private-use-FFFFD>
100000..10FFFD; XX # Co [65534] <private-use-100000>..<private-use-10FFFD>

# The above property value applies to 757136 code points not listed here.
# Total code points: 894604
# The above property value applies to 757135 code points not listed here.
# Total code points: 894603

# ================================================

@@ -1397,6 +1397,7 @@ FFED..FFEE ; AL # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CIRCLE
11A5C..11A89 ; AL # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A9D ; AL # Lo SOYOMBO MARK PLUTA
11AB0..11AF8 ; AL # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
11B0A ; AL # Lo DEVANAGARI LETTER ALTERNATE DDDA
11BC0..11BE0 ; AL # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
11BE1 ; AL # Po SUNUWAR SIGN PVO
11C00..11C08 ; AL # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
@@ -1642,7 +1643,7 @@ FFED..FFEE ; AL # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CIRCLE
1FB94..1FBEF ; AL # So [92] LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK..TOP LEFT JUSTIFIED LOWER RIGHT QUARTER BLACK CIRCLE
1FBFA ; AL # So ALARM BELL SYMBOL

# Total code points: 26987
# Total code points: 26988

# ================================================

5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/extracted/DerivedName.txt
@@ -1,5 +1,5 @@
# DerivedName-17.0.0.txt
# Date: 2025-01-27, 18:09:14 GMT
# Date: 2025-02-11, 18:11:18 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -32293,6 +32293,7 @@ FFFD ; REPLACEMENT CHARACTER
11B07 ; DEVANAGARI SIGN WESTERN NINE-LIKE BHALE
11B08 ; DEVANAGARI SIGN REVERSED NINE-LIKE BHALE
11B09 ; DEVANAGARI SIGN MINDU
11B0A ; DEVANAGARI LETTER ALTERNATE DDDA
11B60 ; SHARADA VOWEL SIGN OE
11B61 ; SHARADA VOWEL SIGN OOE
11B62 ; SHARADA VOWEL SIGN UE
@@ -45870,6 +45871,6 @@ E01ED ; VARIATION SELECTOR-254
E01EE ; VARIATION SELECTOR-255
E01EF ; VARIATION SELECTOR-256

# Total code points: 159834
# Total code points: 159835

# EOF
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# DEVANAGARI LETTER ALTERNATE DDDA (11B0A)
# https://github.com/unicode-org/utc-release-management/issues/182

# Names always differ.
# Age always differs since these tests are comparing additions to pre-existing characters.
Ignoring Name Age:

# Ignore the security and IDNA properties, as these are not yet included for provisionally assigned characters.
Ignoring Confusable_MA Identifier_Status Identifier_Type Idn_Status Idn_Mapping Idn_2008:

Ignoring Block:
Propertywise [
\x{0926} द \N{DEVANAGARI LETTER DA}
\x{0921} ड \N{DEVANAGARI LETTER DDA}
\x{097E} ॾ \N{DEVANAGARI LETTER DDDA}
\x{11B0A} \N{DEVANAGARI LETTER ALTERNATE DDDA}
] AreAlike
end Ignoring;

end Ignoring;

end Ignoring;
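
The `Propertywise [...] AreAlike` block asserts that the listed characters agree on every property that is not explicitly ignored. The sketch below imitates that idea with Python's `unicodedata` module, under two loud assumptions: it checks only the four properties that module exposes here, and it covers only the three already-encoded letters, because 11B0A is not in any released Unicode database. `differing_properties` is a hypothetical name, and this is not how unicodetools implements the test.

```python
import unicodedata

# A small subset of the properties the real test compares (assumption for this sketch).
PROPERTIES = {
    "General_Category": unicodedata.category,
    "Bidi_Class": unicodedata.bidirectional,
    "Canonical_Combining_Class": unicodedata.combining,
    "East_Asian_Width": unicodedata.east_asian_width,
}

def differing_properties(chars, ignoring=()):
    """Return the names of properties on which the characters do not all agree."""
    diffs = []
    for name, getter in PROPERTIES.items():
        if name in ignoring:
            continue
        if len({getter(c) for c in chars}) > 1:
            diffs.append(name)
    return diffs

# Look the existing letters up by name, as the test file does with \N{...}.
names = [
    "DEVANAGARI LETTER DA",
    "DEVANAGARI LETTER DDA",
    "DEVANAGARI LETTER DDDA",
]
chars = [unicodedata.lookup(n) for n in names]
print(differing_properties(chars))
```

An empty list means the characters are alike on the checked properties. The nested `Ignoring` clauses in the test file play the role of the `ignoring` argument here.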

Unchanged files with check annotations

# https://www.unicode.org/reports/tr39/#Identifier_Status_and_Type
# “Unassigned characters, private use characters, surrogates, non-whitespace control characters.”
\p{Identifier_Type=Not_Character} = [\p{gc=Cn}\p{gc=Co}\p{gc=Cs}\p{gc=Cc}-\p{White_Space}]
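
The invariant above says that the set of characters with `Identifier_Type=Not_Character` must equal the union of `Cn`, `Co`, `Cs`, and `Cc` minus `White_Space`. A minimal sketch of that two-sided set comparison follows, assuming Python's `unicodedata` for General_Category and a hard-coded list of the `Cc` code points that have `White_Space`. The function names are hypothetical, and the "stale" input is invented purely to show what a violation looks like.

```python
import unicodedata

# Cc code points that also have White_Space=Yes: TAB..CR and NEL.
CC_WHITE_SPACE = set(range(0x09, 0x0E)) | {0x85}

def expected_not_character(cp):
    """True if cp is in [Cn Co Cs Cc - White_Space] per the local Unicode database."""
    gc = unicodedata.category(chr(cp))
    if gc in ("Cn", "Co", "Cs"):
        return True
    return gc == "Cc" and cp not in CC_WHITE_SPACE

def invariant_violations(not_character, code_points):
    """Compare a claimed Not_Character set with the expected one over code_points.

    Returns (in_claimed_only, in_expected_only); both lists are empty when the
    invariant holds.
    """
    claimed = set(not_character)
    expected = {cp for cp in code_points if expected_not_character(cp)}
    return sorted(claimed - expected), sorted(expected - claimed)

# Invented example: the claimed set still lists U+0041, which is an assigned letter.
stale_claim = {0x00, 0x41}
print(invariant_violations(stale_claim, [0x00, 0x09, 0x41]))
```

In the check annotation on this line the mismatch is one-sided ("In ... But Not In ..."), and the reported count of 4837 equals the 17.0 total in `DerivedAge.txt`. That is consistent with security data that has not yet been regenerated for the provisionally assigned characters, though the page itself does not state the cause.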

Check notice on line 9 in unicodetools/src/main/resources/org/unicode/text/UCD/SecurityInvariantTest.txt

GitHub Actions / Check security data invariants

Invariant test failure

Expected empty, got: 4837 [\u088F\u09FF\u0B53\u0B54\u0C5C\u0CDC\u1ACF-\u1ADD\u1AE0-\u1AEB\u2B96\uA7CE\uA7CF\uA7D2\uA7D4\uA7F1\uFBC3-\uFBD2\uFD90\uFD91\uFDC8-\uFDCE\U00010940-\U0001095C\U00010EC5-\U00010EC7\U00010ED0-\U00010ED8\U00010EFA\U00010EFB\U00011B0A\U00011B60-\U00011B67\U00011DB0-\U00011DDB\U00011DE0-\U00011DE9\U00016D80-\U00016D9D\U00016DA0-\U00016DA9\U00016EA0-\U00016EB8\U00016EBB-\U00016ED3\U00016FF2-\U00016FF6\U000187F8-\U000187FF\U00018D09-\U00018D1E\U00018D80-\U00018DF2\U0001CCFA-\U0001CCFC\U0001CEBA-\U0001CED0\U0001CEE0-\U0001CEF0\U0001E6C0-\U0001E6DE\U0001E6E0-\U0001E6F5\U0001E6FE\U0001E6FF\U0001F6D8\U0001F777-\U0001F77A\U0001F8D0-\U0001F8D8\U0001FA54-\U0001FA57\U0001FA8A\U0001FA8E\U0001FAC8\U0001FACD\U0001FADD\U0001FAEA\U0001FAEF\U0001FBFA\U0002B73A-\U0002B73E\U000323B0-\U00033479]
In \p{Identifier_Type=Not_Character} But Not In [\p{gc=Cn}\p{gc=Co}\p{gc=Cs}\p{gc=Cc}-\p{White_Space}]
088F # ARABIC LETTER NOON WITH RING ABOVE
09FF # BENGALI LETTER SANSKRIT BA
0B53..0B54 # [2] ORIYA SIGN DOT ABOVE..ORIYA SIGN DOUBLE DOT ABOVE
0C5C # TELUGU ARCHAIC SHRII
0CDC # KANNADA ARCHAIC SHRII
1ACF..1ADD # [15] COMBINING DOUBLE CARON..COMBINING DOT-AND-RING BELOW
1AE0..1AEB # [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
2B96 # EQUALS SIGN WITH INFINITY ABOVE
A7CE..A7CF # [2] LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE..LATIN SMALL LETTER PHARYNGEAL VOICED FRICATIVE
A7D2 # LATIN CAPITAL LETTER DOUBLE THORN
A7D4 # LATIN CAPITAL LETTER DOUBLE WYNN
A7F1 # MODIFIER LETTER CAPITAL S
FBC3..FBD2 # [16] ARABIC LIGATURE JALLA WA-ALAA..ARABIC LIGATURE ALAYHI AR-RAHMAH
FD90..FD91 # [2] ARABIC LIGATURE RAHMATU ALLAAHI ALAYH..ARABIC LIGATURE RAHMATU ALLAAHI ALAYHAA
FDC8..FDCE # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIGATURE KARRAMA ALLAAHU WAJHAH
10940..1095C # [29] SIDETIC LETTER N01..SIDETIC LETTER N29
10EC5..10EC7 # [3] ARABIC SMALL YEH BARREE WITH TWO DOTS BELOW..ARABIC LETTER YEH WITH FOUR DOTS BELOW
10ED0..10ED8 # [9] ARABIC BIBLICAL END OF VERSE..ARABIC LIGATURE NAWWARA ALLAAHU MARQADAH
10EFA..10EFB # [2] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW NOON
11B0A # DEVANAGARI LETTER ALTERNATE DDDA
11B60..11B67 # [8] SHARADA VOWEL SIGN OE..SHARADA VOWEL SIGN CANDRA O
11DB0..11DDB # [44] TOLONG SIKI LETTER I..TOLONG SIKI UNGGA
11DE0..11DE9 # [10] TOLONG SIKI DIGIT ZERO..TOLONG SIKI DIGIT NINE
16D80..16D9D # [30] CHISOI LETTER A..CHISOI SIGN SISO
16DA0..16DA9 # [10] CHISOI DIGIT ZERO..CHISOI DIGIT NINE
16EA0..16EB8 # [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY
16EBB..16ED3 # [25] BERIA ERFE SMALL LETTER ARKAB..BERIA ERFE SMALL LETTER AY
16FF2..16FF6 # [5] CHINESE SMALL SIMPLIFIED ER..YANGQIN SIGN SLOW TWO BEATS
187F8..187FF # [8] TANGUT IDEOGRAPH-187F8..TANGUT IDEOGRAPH-187FF
18D09..18D1E # [22] TANGUT IDEOGRAPH-18D09..TANGUT IDEOGRAPH-18D1E
18D80..18DF2 # [115] TANGUT COMPONENT-769..TANGUT COMPONENT-883
1CCFA..1CCFC # [3] SNAKE SYMBOL..NOSE SYMBOL
1CEBA..1CED0 # [23] FRAGILE SYMBOL..LEUKOTHEA
1CEE0..1CEF0 # [17] GEOMANTIC FIGURE POPULUS..MEDIUM SMALL WHITE CIRCLE WITH HORIZONTAL BAR
1E6C0..1E6DE # [31] TAI YO LETTER LOW KO..TAI YO LETTER HIGH KVO
1E6E0..1E6F5 # [22] TAI YO LETTER AA..TAI YO SIGN OM
1E6FE..1E6FF # [2] TAI YO SYMBOL MUEANG..TAI YO XAM LAI
1F6D8 # LANDSLIDE
1F777..1F77A # [4] VESTA FORM TWO..PARTHENOPE FORM TWO
1F8D0..1F8D8 # [9] LONG RIGHTWARDS ARROW OVER LONG LEFTWARDS ARROW..LONG LEFT RIGHT ARROW WITH DEPENDENT LOBE
1FA54..1FA57 # [4] WHITE CHESS FERZ..BLACK CHESS ALFIL
1FA8A # TROM
# “Multiple values are not assigned to characters with strong restrictions:
# Not_Character, Deprecated, Default_Ignorable, Not_NFKC.”
# For example, Default_Ignorable is trumped by unassigned and Deprecated.
\p{Identifier_Type=Default_Ignorable} = [\p{Default_Ignorable_Code_Point}-\p{gc=Cn}-\p{Deprecated}]
\p{Identifier_Type=Not_NFKC} = [\p{NFKC_QC=No}-\p{Deprecated}-\p{Default_Ignorable_Code_Point}]

Check notice on line 19 in unicodetools/src/main/resources/org/unicode/text/UCD/SecurityInvariantTest.txt

GitHub Actions / Check security data invariants

Invariant test failure

Expected empty, got: 1 [\uA7F1]
In [\p{NFKC_QC=No}-\p{Deprecated}-\p{Default_Ignorable_Code_Point}] But Not In \p{Identifier_Type=Not_NFKC}
A7F1 # MODIFIER LETTER CAPITAL S
Let $Strongly_Restricted := [\p{Identifier_Type=Not_Character}\p{Identifier_Type=Deprecated}\p{Identifier_Type=Default_Ignorable}\p{Identifier_Type=Not_NFKC}]