臼 again. #180

fontfish · 2019-01-31T02:43:21Z

Hello!

GitHub is not my forte and I don't understand how it works very well, so please forgive me if I'm making mistakes. I'm aware that I'm dredging up issues #94 and #121, but I think it would make more sense to reflect what seems to be the common stroke order for 臼. Essentially, this means swapping kanji/081fc.svg with kanji/081fc-HzFst.svg.

All the more recent and “official-seeming” documentation I can find on the topic lists the two central horizontal strokes as 3 and 5. Some examples:
https://kanji.jitenon.jp/kanjid/1978.html
漢語林第二版
Kodansha Kanji Learner's Dictionary Revised and Expanded (post 2010 edition)

Some sources I can find for writing it with the central horizontal strokes as 4 and 5, not including those that use KanjiVG (perhaps a testament to its usefulness):
New Japanese-English Character Dictionary (over 20 years old now)
https://kakijun.jp/page/usu200.html (as an acceptable variant)
https://漢字筆順.com/c008/0365.html (as an acceptable variant)

Again, my aim is simply to bring attention to the fact that it might be best to keep things in line with what appears to be a kind of consensus on the stroke order rather than to simply dredge up old issues.

For the sake of consistency, a list of kanji containing 臼. I don't know how many of them are in KanjiVG.
https://kanji.jitenon.jp/kousei/list.php?data=81fc

As an extra note, should the “element” field in the svg reflect the components used in the writing of the kanji or should it show the radicals/tsukuri of the kanji? In short, is it worth me opening an issue about the elements of 勇 being マ and 男 rather than 甬 and 力 (its original components), or is it considered correct as-is? (I'm not entirely sure what I can make of this looking at modern Japanese sources either, to be honest, so am happy to leave it if you think that best.)

wtn · 2021-04-20T19:01:39Z

> All the more recent and “official-seeming” documentation I can find on the topic lists the two central horizontal strokes as 3 and 5.

Yes; 漢検 dictionary also agrees.

fontfish · 2021-04-21T22:47:48Z

Thanks for the comment! Unfortunately, I'm not sure how to change this myself, or even whether I should, though I do still think the default form should match that recommended by dictionaries and educational organisations in Japan.

fontfish · 2021-04-22T02:50:50Z

Having a go at making the changes in a fork!

benkasminbullock · 2022-03-25T03:11:14Z

> As an extra note, should the “element” field in the svg reflect the components used in the writing of the kanji or should it show the radicals/tsukuri of the kanji? In short, is it worth me opening an issue about the elements of 勇 being マ and 男 rather than 甬 and 力 (its original components), or is it considered correct as-is? (I'm not entirely sure what I can make of this looking at modern Japanese sources either, to be honest, so am happy to leave it if you think that best.)

This is specifically about Japan so the Japanese format should be used, and it's a graphical resource rather than an etymological resource, so there is no point adding the Chinese format, regardless of whether it is the original. If you want to check, a good place to go is the IDS repository. For this character, we have

U+52C7 勇 ⿱甬力[GTV] ⿱⿱龴田力[JK]

which means that GTV (Mainland China, Taiwan/Hong Kong, and Vietnam) use the former format, and JK (Japan and Korea) use the latter format.
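For anyone who wants to check other characters the same way, here is a minimal sketch of splitting such a line into per-region decompositions. It assumes whitespace-separated fields, each ending in a bracketed list of region letters as in the line quoted above; the function name `parse_ids_line` is made up for illustration and is not part of any IDS tooling.

```python
import re

def parse_ids_line(line):
    """Split one IDS entry into (codepoint, character, {region: sequence})."""
    codepoint, char, *fields = line.split()
    by_region = {}
    for field in fields:
        # Assumed field shape: "<sequence>[<region letters>]".
        match = re.fullmatch(r"(.+?)\[([A-Z]+)\]", field)
        if match is None:
            continue  # fields without a region tag are skipped in this sketch
        sequence, regions = match.groups()
        for region in regions:
            by_region[region] = sequence
    return codepoint, char, by_region

codepoint, char, by_region = parse_ids_line("U+52C7 勇 ⿱甬力[GTV] ⿱⿱龴田力[JK]")
print(by_region["J"])  # the decomposition tagged for Japan
```

Looking up `by_region["J"]` gives the form that a Japan-oriented resource would follow, while `by_region["G"]` gives the one tagged for Mainland China.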

This is one of the problems caused by Han unification, which was an effort to fit all characters into 16 bits by unifying Japanese and Chinese characters together depending on their "origin". The 16-bit goal has since been abandoned by Unicode.

benkasminbullock · 2022-04-13T02:37:46Z

Sorry that got into a muddle. This should be done with #295.

fontfish · 2022-04-16T15:02:40Z

Thank you for the edits and explanation, and my apologies for my very slow reply.

Regarding 勇, what you say about this being a graphical resource makes sense. The real issue there may be how dictionaries using this information choose to present it, which is up to them to consider. Japanese dictionaries that I have checked list only 力 under the 部首 field, then list 甬 and 力 under the etymology/character explanation.

Thanks again.

SlugFiller · 2022-05-04T04:42:13Z

As I've just been bitten by this one, I want to take the opportunity to ask: should this also affect 諛 (U+8ADB)? It appears to contain the same radical, suggesting strokes 10 and 11 should be swapped.

benkasminbullock · 2022-05-04T05:03:10Z

How did you get bitten? I want you to report anything as an issue if there is a problem.

As for 諛, yes, it is wrong. That seems to have been caused by an incorrect value of 𦥑 for kvg:element on the group with ID number kvg:08adb-g4, hence it was missed by the script when I did the overall change. What I'll do with that one is to fix the element value & run the script again. If that doesn't work I'll just edit it with a text editor.

Let me know if you find any more like that.
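A rough sketch of how one might look for other groups labelled with a look-alike element such as 𦥑, so they can be reviewed by hand. This is not the script mentioned above. The sample SVG is a hypothetical, heavily shortened stand-in rather than the real contents of 08adb.svg, the namespace URI is assumed to be the one KanjiVG uses for `kvg:` attributes, and `groups_with_element` is an illustrative name.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
KVG_NS = "http://kanjivg.tagaini.net"  # assumed namespace for kvg: attributes

# Hypothetical stand-in for a KanjiVG file: group structure only, no stroke paths.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" xmlns:kvg="http://kanjivg.tagaini.net">
  <g id="kvg:08adb" kvg:element="諛">
    <g id="kvg:08adb-g1" kvg:element="言"/>
    <g id="kvg:08adb-g4" kvg:element="𦥑"/>
  </g>
</svg>"""

def groups_with_element(svg_text, wanted):
    """Return the ids of all <g> groups whose kvg:element is in `wanted`."""
    root = ET.fromstring(svg_text)
    return [
        group.get("id")
        for group in root.iter(f"{{{SVG_NS}}}g")
        if group.get(f"{{{KVG_NS}}}element") in wanted
    ]

# Groups tagged with the look-alike element rather than 臼 itself.
print(groups_with_element(SAMPLE, {"𦥑"}))
```

Real KanjiVG files may declare the `kvg` prefix differently from this sample, so the parsing step may need adjusting before this runs against the repository.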

SlugFiller · 2022-05-04T05:50:26Z

> How did you get bitten? I want you to report anything as an issue if there is a problem.

As I've previously mentioned, I'm creating a visual indexing method for kanji that associates every stroke with an English character. I was indexing based on the "latest" release, where 臼's stroke order is indexed "qrosfc". After updating to the pre-release, I noticed the strokes no longer matched, and had to reindex it to "qrfofc", as well as going back and re-indexing a few dozen kanji containing the pattern. That's where I noticed 諛's index "ifsfkocqrosfcvj" still matches the stroke order diagram, even though it should logically need changing to "ifsfkocqrfofcvj", assuming the same pattern.
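A small sketch of that check, using only the two index strings quoted above. The dictionary layout and the name `stale_entries` are illustrative assumptions, not the actual index format.

```python
# Index strings for 臼 before and after the stroke order change, as quoted above.
OLD_PATTERN = "qrosfc"
NEW_PATTERN = "qrfofc"

def stale_entries(index):
    """Map each kanji whose index still contains the old pattern to its updated index."""
    return {
        kanji: strokes.replace(OLD_PATTERN, NEW_PATTERN)
        for kanji, strokes in index.items()
        if OLD_PATTERN in strokes
    }

# Hypothetical two-entry index: 臼 already updated, 諛 not yet.
index = {"臼": "qrfofc", "諛": "ifsfkocqrosfcvj"}
print(stale_entries(index))
```

A plain substring match can also hit the same letter sequence spanning two unrelated components, so each result still needs checking against the stroke order diagram.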

The correct pattern was already present in 嫂, 搜, and 鑿. And I was honestly wondering why the two patterns.

I'm fundamentally using KanjiVG as an "authoritative source" on stroke orders, so any errors are a hard hit.

> Let me know if you find any more like that.

I've so far indexed ~5800 kanji of my target JIS X 0208's ~6300, so about 90% chance that's all of them. But I'll keep in mind to watch out for this pattern in the remaining kanji.

benkasminbullock · 2022-05-05T05:46:48Z

> I'm fundamentally using KanjiVG as an "authoritative source" on stroke orders, so any errors are a hard hit.

Unfortunately KanjiVG isn't an authoritative source, but hopefully with enough people reporting problems we can get it better. One thing I've tried to do is to remove some of the claims about KanjiVG giving the correct stroke order of kanji from the documentation and other pages. It's a best effort thing really. Since I started working on this repository in March, I've found some errors in Kanjidic, some errors in this repository, and so on and so forth.

> I've so far indexed ~5800 kanji of my target JIS X 0208's ~6300, so about 90% chance that's all of them. But I'll keep in mind to watch out for this pattern in the remaining kanji.

Thank you. This issue should be fixed in the repo now:

b7c9365

I'm going to be cautious about making a new release, since I'm still not sure that the one I did before was OK. You can always just copy the repo data into your distribution XML file though. I've fixed up those Python scripts too, so you can probably make your own distribution-a-like files from the repo yourself.

SlugFiller · 2022-05-05T07:07:15Z

> Unfortunately KanjiVG isn't an authoritative source, but hopefully with enough people reporting problems we can get it better.

I don't have much in terms of an alternative. Presumably, there's some document out there giving the officially approved stroke orders for each kanji. But not something I can easily reference, search, embed, and produce cross-sections of.

If there is something decently accessible, I can always compare it to my index and report any errors I find.

> You can always just copy the repo data into your distribution XML file though. I've fixed up those Python scripts too, so you can probably make your own distribution-a-like files from the repo yourself.

I should probably rewrite my dependency on KanjiVG as a git submodule anyway. It's better for GitHub release, and would make updating to latest easier. It shouldn't be too difficult.

benkasminbullock added the "Stroke order" label (The order of strokes in the character) Mar 25, 2022

benkasminbullock mentioned this issue Apr 12, 2022: Fix usu #290 (Closed)

benkasminbullock closed this as completed Apr 13, 2022

benkasminbullock added the "Duplicate" label (A bug reported more than once) May 4, 2022

benkasminbullock reopened this May 4, 2022

benkasminbullock closed this as completed May 5, 2022
