Fix Unicode normalization on macOS #1063

Closed

@Snack-X Snack-X commented Dec 1, 2016

This is a fix for #586.

Incorrect rendering before normalization (current behavior):
[Screenshot: 2016-12-01 4 27 56]

and correct rendering with normalization:
[Screenshot: 2016-12-01 4 28 19]

I only just discovered this nice terminal and haven't looked at the codebase deeply.
So adding normalization here may not be appropriate. There may be unexpected side effects.
Honestly, I'm not sure.
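
To make the idea concrete, here is a minimal sketch of what normalizing text before display could look like. It is not the diff from this PR: `normalizeForDisplay` is a hypothetical helper name, and the only real API used is the standard `String.prototype.normalize`. One known limitation of this approach: if the output arrives in chunks and a chunk boundary falls between a base character and its combining mark, normalizing each chunk separately will not compose them.

```javascript
// Hypothetical helper (not from the Hyper codebase): compose decomposed
// sequences (NFD) into precomposed characters (NFC) before rendering.
function normalizeForDisplay(text) {
  return text.normalize("NFC");
}

// U+307B (ほ) followed by U+309A (combining handakuten), as macOS file
// names often store it, becomes the single code point U+307D (ぽ).
const decomposed = "\u307b\u309a";
const composed = normalizeForDisplay(decomposed);

console.log(decomposed.length, composed.length);
console.log(composed === "\u307d");
```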

@albinekb
Contributor

this is a good read about it: http://unicode.org/faq/normalization.html

@chabou
Collaborator

chabou commented Aug 14, 2017

Sorry for the delay.

@Snack-X thank you so much for your PR. Can you attach a txt file with your ls output in order to reproduce it (and test it with xterm, our hterm replacement)?

@Snack-X
Author

Snack-X commented Aug 16, 2017

This is an old issue, but the problem still remains. The original PR was made before the 1.0.0 release. Time has passed since then, and this PR won't be compatible with the recent codebase.

Anyway, you can easily reproduce the same problem with reverse xxd.


First, the hexadecimal values of each string:

> Buffer.from("にっぽん", "utf8")
<Buffer e3 81 ab e3 81 a3 e3 81 bd e3 82 93>
> Buffer.from("にっぽん".normalize("NFD"), "utf8")
<Buffer e3 81 ab e3 81 a3 e3 81 bb e3 82 9a e3 82 93>
> Buffer.from("ニッポン", "utf8")
<Buffer e3 83 8b e3 83 83 e3 83 9d e3 83 b3>
> Buffer.from("ニッポン".normalize("NFD"), "utf8")
<Buffer e3 83 8b e3 83 83 e3 83 9b e3 82 9a e3 83 b3>
> Buffer.from("대한민국", "utf8")
<Buffer eb 8c 80 ed 95 9c eb af bc ea b5 ad>
> Buffer.from("대한민국".normalize("NFD"), "utf8")
<Buffer e1 84 83 e1 85 a2 e1 84 92 e1 85 a1 e1 86 ab e1 84 86 e1 85 b5 e1 86 ab e1 84 80 e1 85 ae e1 86 a8>
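
The REPL session above can also be run as a small script. This is only a sketch: `toHex` and `describe` are helper names made up for illustration, and it relies on nothing beyond Node's `Buffer` and `String.prototype.normalize`. It prints both forms of each string and checks that composing the NFD form gives back the NFC form.

```javascript
// The three sample strings used in this thread.
const samples = ["にっぽん", "ニッポン", "대한민국"];

// Format a string's UTF-8 bytes as space-separated hex pairs.
function toHex(str) {
  const pairs = Buffer.from(str, "utf8").toString("hex").match(/../g) || [];
  return pairs.join(" ");
}

// Return the hex dump of the NFC and NFD forms, plus a round-trip check.
function describe(str) {
  const nfc = str.normalize("NFC");
  const nfd = str.normalize("NFD");
  return {
    nfcHex: toHex(nfc),
    nfdHex: toHex(nfd),
    roundTrips: nfd.normalize("NFC") === nfc,
  };
}

for (const s of samples) {
  console.log(s, describe(s));
}
```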

And using reverse xxd, the problem is reproducible.

$ echo "e3 81 ab e3 81 a3 e3 81 bd e3 82 93 0a" | xxd -r -p
にっぽん
$ echo "e3 81 ab e3 81 a3 e3 81 bb e3 82 9a e3 82 93 0a" | xxd -r -p
にっぽん
$ echo "e3 83 8b e3 83 83 e3 83 9d e3 83 b3 0a" | xxd -r -p
ニッポン
$ echo "e3 83 8b e3 83 83 e3 83 9b e3 82 9a e3 83 b3 0a" | xxd -r -p
ニッポン
$ echo "eb 8c 80 ed 95 9c eb af bc ea b5 ad 0a" | xxd -r -p
대한민국
$ echo "e1 84 83 e1 85 a2 e1 84 92 e1 85 a1 e1 86 ab e1 84 86 e1 85 b5 e1 86 ab e1 84 80 e1 85 ae e1 86 a8 0a" | xxd -r -p
대한민국

The screenshots below compare three terminal applications: Terminal.app (bundled with macOS 10.12.6), iTerm 2 (latest beta), and Hyper (1.3.3.1754).

Terminal.app
iTerm
Hyper

Besides the incorrect rendering (why did it get worse?), Terminal and iTerm handle it perfectly, while Hyper does not.
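
For anyone without `xxd`, the byte sequences from the reverse-xxd commands can be decoded in Node instead. A rough sketch follows, where `fromHex` is a hypothetical helper name. It shows that the two byte sequences for the Hiragana sample decode to strings that differ code point by code point, yet compare equal once the decomposed one is normalized to NFC.

```javascript
// Decode space-separated hex bytes into a UTF-8 string
// (roughly what `xxd -r -p` does for the commands in this thread).
function fromHex(hex) {
  return Buffer.from(hex.replace(/\s+/g, ""), "hex").toString("utf8");
}

const composed = fromHex("e3 81 ab e3 81 a3 e3 81 bd e3 82 93");
const decomposed = fromHex("e3 81 ab e3 81 a3 e3 81 bb e3 82 9a e3 82 93");

console.log(composed === decomposed);                      // false
console.log(composed === decomposed.normalize("NFC"));     // true
console.log([...composed].length, [...decomposed].length); // 4 5
```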

If you have a problem with the font, you can use D2Coding, a monospace font that supports Korean Hangul, Japanese Hiragana and Katakana, and CJK Ideographs.

@Snack-X
Author

Snack-X commented Aug 16, 2017

It looks like the rendering issue is related to #1535, and the corresponding PR is #2000.

@Snack-X
Author

Snack-X commented Sep 25, 2017

After almost a year, I can confirm this issue is fixed in the latest 2.0.3.

Everything is rendered as expected, with no normalization issue. That took a long time.

You can close this PR if you wish.

@albinekb
Contributor

We needed to change from hterm to xterm, that's why it took so long @Snack-X

Thanks for your PR though! ❤️

@albinekb albinekb closed this Sep 25, 2017