Commit 4e5c6fc

Modifier ᶁꞕᶇᶊᶎ (#887)
* UnicodeData.txt lines from L2/24-144
* lb=AL like U+1DAA and U+1DB5
* Latin
* Other_Lowercase. Somehow U+1DAA and U+1DB5 are not Diacritic, so keep the new ones consistent with that…
* Regenerate UCD
* DoNotEmit.txt lines from L2/24-144
* Make them Diacritic after all considering unicode-org/properties#315
* Regenerate UCD
* Comparison test
* Ignore Unicode_1_Name
* Ignore IDNA2008_Category
* The hand-merging of DoNotEmit.txt will continue until morale improves.
* Do not ignore Diacritic
1 parent 09d18e4 commit 4e5c6fc

21 files changed: 136 additions and 76 deletions

unicodetools/data/ucd/dev/DerivedAge.txt

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 # DerivedAge-18.0.0.txt
-# Date: 2025-11-23, 08:15:51 GMT
+# Date: 2025-11-24, 16:11:16 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2142,12 +2142,12 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
 1DF1F..1DF24 ; 18.0 # [6] LATIN SMALL LETTER D-ETH DIGRAPH..LATIN SMALL LETTER T-THETA DIGRAPH
 1DF2B..1DF56 ; 18.0 # [44] LATIN SMALL LETTER DEZH DIGRAPH WITH CURL..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; 18.0 # [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; 18.0 # [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; 18.0 # [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1F7DB ; 18.0 # BULLET IN DOUBLE CIRCLE
 1F7F1..1F7FF ; 18.0 # [15] CIRCLE WITH DOUBLE VERTICAL AND HORIZONTAL LINE..RHOMBUS
 2B81E ; 18.0 # CJK UNIFIED IDEOGRAPH-2B81E
 3D000..3FC3F ; 18.0 # [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 11843
+# Total code points: 11848

 # EOF
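
The bracketed counts and the updated total in this hunk follow from simple range arithmetic. The sketch below is only an illustration and is not part of the commit or of unicodetools; the names `RANGE_LINE` and `count_code_points` are hypothetical. It assumes each data line starts with either a single code point or a `first..last` range in uppercase hexadecimal, followed by a semicolon, as in the lines shown above.

```python
import re

# Hypothetical helper, not taken from unicodetools.
# Matches "XXXX ;" or "XXXX..YYYY ;" at the start of a UCD data line.
RANGE_LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;")


def count_code_points(line: str) -> int:
    """Return how many code points a single UCD data line covers."""
    match = RANGE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a UCD data line: {line!r}")
    first = int(match.group(1), 16)
    # A line without ".." describes exactly one code point.
    last = int(match.group(2), 16) if match.group(2) else first
    return last - first + 1


# The removed and added lines from the DerivedAge.txt hunk.
OLD_LINE = "1DFFA..1DFFF ; 18.0 # [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK"
NEW_LINE = "1DFF5..1DFFF ; 18.0 # [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK"

added = count_code_points(NEW_LINE) - count_code_points(OLD_LINE)
print(count_code_points(OLD_LINE), count_code_points(NEW_LINE), added)
print(11843 + added)  # old total from the hunk plus the newly covered code points
```

Extending the range start from U+1DFFA to U+1DFF5 covers five more code points, which matches the change in the "Total code points" line from 11843 to 11848.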

unicodetools/data/ucd/dev/DerivedCoreProperties.txt

Lines changed: 19 additions & 19 deletions
@@ -1,5 +1,5 @@
 # DerivedCoreProperties-18.0.0.txt
-# Date: 2025-11-23, 08:16:13 GMT
+# Date: 2025-11-24, 16:11:39 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1395,7 +1395,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
 1DF0A ; Alphabetic # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
 1DF0B..1DF56 ; Alphabetic # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; Alphabetic # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; Alphabetic # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; Alphabetic # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E000..1E006 ; Alphabetic # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
 1E008..1E018 ; Alphabetic # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
 1E01B..1E021 ; Alphabetic # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
@@ -1477,7 +1477,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
 31350..33479 ; Alphabetic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; Alphabetic # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 159227
+# Total code points: 159232

 # ================================================

@@ -2181,11 +2181,11 @@ FF41..FF5A ; Lowercase # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH L
 1DF4E..1DF50 ; Lowercase # L& [3] LATIN SMALL LETTER BARRED N..LATIN SMALL LETTER TURNED R WITH STROKE
 1DF52..1DF56 ; Lowercase # L& [5] LATIN SMALL LETTER BARRED V..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; Lowercase # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; Lowercase # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; Lowercase # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E030..1E06D ; Lowercase # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE
 1E922..1E943 ; Lowercase # L& [34] ADLAM SMALL LETTER ALIF..ADLAM SMALL LETTER SHA

-# Total code points: 2683
+# Total code points: 2688

 # ================================================

@@ -3038,14 +3038,14 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 1DF00..1DF09 ; Cased # L& [10] LATIN SMALL LETTER FENG DIGRAPH WITH TRILL..LATIN SMALL LETTER T WITH HOOK AND RETROFLEX HOOK
 1DF0B..1DF56 ; Cased # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; Cased # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; Cased # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; Cased # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E030..1E06D ; Cased # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE
 1E900..1E943 ; Cased # L& [68] ADLAM CAPITAL LETTER ALIF..ADLAM SMALL LETTER SHA
 1F130..1F149 ; Cased # So [26] SQUARED LATIN CAPITAL LETTER A..SQUARED LATIN CAPITAL LETTER Z
 1F150..1F169 ; Cased # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
 1F170..1F189 ; Cased # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z

-# Total code points: 4725
+# Total code points: 4730

 # ================================================

@@ -3551,7 +3551,7 @@ FFF9..FFFB ; Case_Ignorable # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLI
 1DA9B..1DA9F ; Case_Ignorable # Mn [5] SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL MODIFIER-6
 1DAA1..1DAAF ; Case_Ignorable # Mn [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16
 1DFD1..1DFF2 ; Case_Ignorable # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; Case_Ignorable # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; Case_Ignorable # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E000..1E006 ; Case_Ignorable # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
 1E008..1E018 ; Case_Ignorable # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
 1E01B..1E021 ; Case_Ignorable # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
@@ -3579,7 +3579,7 @@ E0001 ; Case_Ignorable # Cf LANGUAGE TAG
 E0020..E007F ; Case_Ignorable # Cf [96] TAG SPACE..CANCEL TAG
 E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 2849
+# Total code points: 2854

 # ================================================

@@ -7035,7 +7035,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 1DF0A ; ID_Start # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
 1DF0B..1DF56 ; ID_Start # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; ID_Start # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; ID_Start # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; ID_Start # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E030..1E06D ; ID_Start # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE
 1E100..1E12C ; ID_Start # Lo [45] NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PUACHUE HMONG LETTER W
 1E137..1E13D ; ID_Start # Lm [7] NYIAKENG PUACHUE HMONG SIGN FOR PERSON..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
@@ -7103,7 +7103,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 31350..33479 ; ID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; ID_Start # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 157719
+# Total code points: 157724

 # ================================================

@@ -8456,7 +8456,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
 1DF0A ; ID_Continue # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
 1DF0B..1DF56 ; ID_Continue # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; ID_Continue # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; ID_Continue # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; ID_Continue # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E000..1E006 ; ID_Continue # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
 1E008..1E018 ; ID_Continue # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
 1E01B..1E021 ; ID_Continue # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
@@ -8548,7 +8548,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
 3D000..3FC3F ; ID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
 E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 161065
+# Total code points: 161070

 # ================================================

@@ -9280,7 +9280,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
 1DF0A ; XID_Start # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
 1DF0B..1DF56 ; XID_Start # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; XID_Start # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; XID_Start # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; XID_Start # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E030..1E06D ; XID_Start # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE
 1E100..1E12C ; XID_Start # Lo [45] NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PUACHUE HMONG LETTER W
 1E137..1E13D ; XID_Start # Lm [7] NYIAKENG PUACHUE HMONG SIGN FOR PERSON..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
@@ -9348,7 +9348,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
 31350..33479 ; XID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; XID_Start # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 157696
+# Total code points: 157701

 # ================================================

@@ -10702,7 +10702,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
 1DF0A ; XID_Continue # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
 1DF0B..1DF56 ; XID_Continue # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; XID_Continue # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; XID_Continue # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; XID_Continue # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E000..1E006 ; XID_Continue # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
 1E008..1E018 ; XID_Continue # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
 1E01B..1E021 ; XID_Continue # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
@@ -10794,7 +10794,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
 3D000..3FC3F ; XID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
 E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 161046
+# Total code points: 161051

 # ================================================

@@ -12966,7 +12966,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 1DF0A ; Grapheme_Base # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
 1DF0B..1DF56 ; Grapheme_Base # L& [76] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN LETTER GLOTTAL STOP WITH DOUBLE STROKE
 1DFD1..1DFF2 ; Grapheme_Base # Lm [34] MODIFIER LETTER SMALL CAPITAL P..MODIFIER LETTER SMALL T WITH CURL
-1DFFA..1DFFF ; Grapheme_Base # Lm [6] MODIFIER LETTER SMALL C WITH HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
+1DFF5..1DFFF ; Grapheme_Base # Lm [11] MODIFIER LETTER SMALL D WITH PALATAL HOOK..MODIFIER LETTER SMALL T WITH HOOK AND RETROFLEX HOOK
 1E030..1E06D ; Grapheme_Base # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE
 1E100..1E12C ; Grapheme_Base # Lo [45] NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PUACHUE HMONG LETTER W
 1E137..1E13D ; Grapheme_Base # Lm [7] NYIAKENG PUACHUE HMONG SIGN FOR PERSON..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
@@ -13095,7 +13095,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; Grapheme_Base # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 169325
+# Total code points: 169330

 # ================================================
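
As a quick consistency check on the regenerated files, each "Total code points" line shown in the hunks above should grow by the same amount, since each affected property picks up the same newly covered code points. The sketch below is not part of the commit or of unicodetools; the table `TOTALS` and the function `deltas` are hypothetical names, and the numbers are copied by hand from the diff lines above.

```python
# Hypothetical consistency check, with values transcribed from the hunks above.
# Each entry maps a property (or the DerivedAge 18.0 block) to (old total, new total).
TOTALS = {
    "DerivedAge 18.0": (11843, 11848),
    "Alphabetic": (159227, 159232),
    "Lowercase": (2683, 2688),
    "Cased": (4725, 4730),
    "Case_Ignorable": (2849, 2854),
    "ID_Start": (157719, 157724),
    "ID_Continue": (161065, 161070),
    "XID_Start": (157696, 157701),
    "XID_Continue": (161046, 161051),
    "Grapheme_Base": (169325, 169330),
}


def deltas(totals: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Return the change in total code points for each property."""
    return {name: new - old for name, (old, new) in totals.items()}


for name, delta in deltas(TOTALS).items():
    print(f"{name}: +{delta}")

# Every property shown in this diff should gain the same number of code points.
assert len(set(deltas(TOTALS).values())) == 1
```

All ten totals increase by five, which agrees with the range start moving from U+1DFFA to U+1DFF5 in each property listing.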
