Remove or correct non-NFD confusables #1238
|
Sorry, this is a little hard to review, but be assured that I triple-checked it myself very thoroughly. confusablesSummary.txt is easier to check than the other files, but it's still pretty long. Anyway, please take a look. |
ce6b7d3 to
a931024
|
Converting to draft to fix Java style issues. |
a931024 to
22058ca
|
Java style fixed. Ready for review. |
josh-hadley
left a comment
@roozbehp, apologies for taking a long time to get this reviewed. It was tricky as advertised, but all looks good. The process did make me want to see if a one-time revamp of confusables-source.txt to force most or all entries to our new preferred format universally would be worthwhile (presumably making reviews/maintenance like this a little less taxing).
|
Note that we need to add confusable changes to the Migration section of each release. That is, if skeleton(X) changes from Y to Z, implementations that store mappings to skeletons need to update them. (We had some breakages of indexes in production software with the U17 integrations; luckily they were caught by unit tests.) |
We're going to have a lot of confusable changes in Unicode 18.0, but this specific pull request should not affect the skeleton of any string, since a conformant implementation would not use any of the data removed in this pull request: the removed source code points simply don't occur in the NFD form that the algorithm applies before looking up the prototypes. |
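To illustrate the argument above, here is a minimal Python sketch. It is a simplified stand-in for the skeleton algorithm (NFD, map each code point to its prototype, NFD again) and omits any other steps of the full algorithm. The mapping tables `FULL` and `TRIMMED` are hypothetical toy data, not real confusables entries, and the function names are made up for this example.

```python
import unicodedata


def nfd(s: str) -> str:
    """Canonical decomposition of s."""
    return unicodedata.normalize("NFD", s)


def skeleton(s: str, prototypes: dict[str, str]) -> str:
    """Simplified skeleton: NFD, map each code point, NFD again."""
    mapped = "".join(prototypes.get(ch, ch) for ch in nfd(s))
    return nfd(mapped)


def unreachable_sources(prototypes: dict[str, str]) -> list[str]:
    """Source code points that change under NFD.

    They can never appear in the NFD string that the lookup step sees,
    so their entries are never consulted.
    """
    return sorted(src for src in prototypes if nfd(src) != src)


# Hypothetical toy data: U+0430 is NFD-stable, U+00E0 decomposes to a + U+0300.
FULL = {"\u0430": "a", "\u00E0": "Z"}
# Same table with the non-NFD source removed.
TRIMMED = {src: tgt for src, tgt in FULL.items() if nfd(src) == src}

sample = "\u0430\u00E0"
# The entry for U+00E0 is dead: both tables give the same skeleton.
same = skeleton(sample, FULL) == skeleton(sample, TRIMMED)
```

Under these assumptions, `unreachable_sources(FULL)` lists only U+00E0, and dropping that entry leaves every skeleton unchanged, whatever target the entry pointed to.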
|
I'm not worried about dropping all the non-NFD forms. I just want to make
sure that we alert people of changes, and that anytime we had X -> Y
before, we have nfd(X) -> nfd(Y) after (when nfd(X) is a single code point),
and that you include a one-time test that this is true.
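The one-time test requested above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual test: `missing_nfd_mappings` is a hypothetical helper, the `old` and `new` tables are assumed to be plain source-to-target dictionaries, and the sample entries are toy data with placeholder targets.

```python
import unicodedata


def nfd(s: str) -> str:
    """Canonical decomposition of s."""
    return unicodedata.normalize("NFD", s)


def missing_nfd_mappings(old: dict[str, str], new: dict[str, str]):
    """Return old entries X -> Y whose NFD counterpart is absent from new.

    Only sources whose NFD form is a single code point are checked;
    for those, new must map nfd(X) to exactly nfd(Y).
    """
    problems = []
    for src, target in old.items():
        src_nfd = nfd(src)
        if len(src_nfd) != 1:
            continue  # NFD form has several code points; not a valid source
        expected = nfd(target)
        actual = new.get(src_nfd)
        if actual != expected:
            problems.append((src, src_nfd, expected, actual))
    return problems


# Toy data: U+2126 OHM SIGN has the single-code-point NFD form U+03A9,
# while U+00E0 decomposes into two code points and is skipped.
old = {"\u2126": "X", "\u00E0": "Y"}
new_ok = {"\u03A9": "X"}
new_bad = {}
```

With this toy data, `missing_nfd_mappings(old, new_ok)` is empty, while `missing_nfd_mappings(old, new_bad)` reports the U+2126 entry as having no NFD counterpart.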
|
See https://github.com/unicode-org/properties/issues/486