Lower right of 舞 #116
Comments
It is the same component, but it is customarily written in different forms because of an inconsistency in Japanese kanji standardization. There is no semantic difference, and the distinction is unifiable for the purposes of ISO/IEC 10646 standardization.
If you want to decompose glyphs exactly as they look in various standards, you may want to check out yi-bai/ids, which decomposes characters down to the stroke level and has data indicating stroke joining behaviour.
OK, but the requirements and details of that specification aren't documented here. If the data is required to fit that spec, at least document it.
Unfortunately the maintainer has not been able to update the repository :/ This repository is the main data source used for the IRG IDS algorithm, though the decomposition data is also useful for other purposes. That's why the IDSs used are more vague.
This? https://github.com/yi-bai/ids It contains some information on this particular character:
I'm not sure what that (.) all means yet, and the above doesn't accord with my own findings, but thank you for the pointer.
I don't have information except that @kawabata contributed to a project in January 2021 so I assume he is in good health.
Was that kawabata's purpose in making the repository? It seems to be undocumented, queries to the mailing list went unanswered, and so on. If this repository is intended for your purpose, then it should at least say so. I will leave this bug report open for the time being, pending guidance "from above".
I believe this repository was born before it was used for IRG standardization; however, I am not sure, because I joined IRG much later than Kawabata-san. If you refer to the IRG working documents (https://appsrv.cse.cuhk.edu.hk/~irg/irg/irg56/IRG56.htm), you can see that this repo is listed as the official IDS equivalence database for conducting CJK Unification. The decomposition strategies for IDS data to be used for CJK standardization purposes are specified in IRGN1183 in IRG#25, written by @kawabata himself: https://appsrv.cse.cuhk.edu.hk/~irg/irg/irg25/IRG25.htm. Refer to paragraph 2.5 of the decomposition strategy, which would be relevant to this case:
Recently, though, maintenance of the IDS checks for IRG's standardization purposes has been passed to @yi-bai because @kawabata is busy. He maintains a proprietary format for IDSes. You may want to consult with him to see if he wants to increase his coverage for other locales.
The lower right corner of 舞 seems to diverge between Chinese and Japanese.
Japanese seems to write 舞 with the lower right as four strokes:
https://kakijun.jp/page/15116200.html
But the 㐄 element seems to be three strokes in the same Japanese sources:
https://kakijun.jp/page/masu06200.html
In ids.txt they are unified into a single component, but there seems to be an actual difference.
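For anyone who wants to check this kind of thing against the data, here is a rough Python sketch of how one might look up whether a character's IDS expands to a given component. It assumes a tab-separated layout (code point, character, then one or more IDS fields, optionally tagged with a bracketed region marker such as `[J]`). The sample lines are made up for illustration and are not copied from ids.txt, the fullwidth `？` is only a placeholder for the upper part of 舞, and `parse_ids_lines` and `components` are hypothetical helper names.

```python
# Ideographic Description Characters U+2FF0..U+2FFB (the layout operators).
IDC = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}

# Illustrative sample only: NOT copied from ids.txt. The tab-separated
# layout is an assumption, and the fullwidth question mark is a placeholder.
SAMPLE = (
    "# hypothetical sample data\n"
    "U+821B\t舛\t⿰夕㐄\n"
    "U+821E\t舞\t⿱？舛[J]\n"
)

def parse_ids_lines(text):
    """Map each character to its list of IDS strings, region tags stripped."""
    table = {}
    for line in text.splitlines():
        if not line or line.startswith("#") or line.startswith(";"):
            continue  # skip blanks and comment lines
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a data line in the assumed layout
        # Drop a trailing region tag such as "[J]" from each IDS field.
        table[fields[1]] = [f.split("[")[0] for f in fields[2:]]
    return table

def components(char, table, seen=None):
    """Collect every component reachable by recursively expanding char."""
    seen = set() if seen is None else seen
    for ids in table.get(char, []):
        for c in ids:
            # Skip layout operators, self-references, and repeats.
            if c in IDC or c == char or c in seen:
                continue
            seen.add(c)
            components(c, table, seen)
    return seen

table = parse_ids_lines(SAMPLE)
print("㐄" in components("舞", table))  # True for this sample data
```

With unified data like this, the lookup can only say that 㐄 is present. It cannot tell the three-stroke form from the four-stroke form, which is the difference this issue is about.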