Regional Indicators (Flags) and Grapheme Clusters #28
Comments
Here's my own implementation of the "string width" function, which takes grapheme clusters into account: It's based on the assumption that the width of a grapheme cluster is the width of its first non-zero-width rune. That's just my guess, but it works fine for a bunch of examples I tried manually. Maybe you want to use this implementation in your package. I think it would definitely improve the calculation of a string's width. You could then also get rid of the special zero-width-joiner handling, as it's all implicit in the
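To make the heuristic in the comment above concrete, here is a minimal sketch in Go. It is not the implementation referred to in that comment (which is not reproduced here). It assumes the input has already been split into a single grapheme cluster, and it takes the per-rune width function as a parameter. The names `clusterWidth` and `toyWidth` are hypothetical, and `toyWidth` is a deliberately crude stand-in for a real width table.

```go
package main

import "fmt"

// clusterWidth applies the heuristic from the comment above: the width of a
// grapheme cluster is the width of its first rune whose width is not zero.
// The cluster is assumed to be segmented already; runeWidth is supplied by
// the caller (hypothetical parameter, not part of any library).
func clusterWidth(cluster []rune, runeWidth func(rune) int) int {
	for _, r := range cluster {
		if w := runeWidth(r); w > 0 {
			return w
		}
	}
	return 0 // every rune in the cluster was zero-width
}

// toyWidth is an illustrative width function only; real tables are far larger.
func toyWidth(r rune) int {
	switch {
	case r == 0x200D, r >= 0x0300 && r <= 0x036F, r >= 0xFE00 && r <= 0xFE0F:
		return 0 // zero-width joiner, combining diacritics, variation selectors
	case r >= 0x1F300 && r <= 0x1FAFF:
		return 2 // rough emoji range, treated as wide
	default:
		return 1
	}
}

func main() {
	family := []rune("\U0001F469\u200D\U0001F467") // woman + ZWJ + girl
	accented := []rune("e\u0301")                  // e + combining acute accent
	fmt.Println(clusterWidth(family, toyWidth), clusterWidth(accented, toyWidth)) // prints "2 1"
}
```

With this approach the zero-width joiner needs no special case: it is simply skipped while looking for the first rune that has a width.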
Could you please send me a PR?
Hi! I'm not sure if these issues are related, but I assume they are.
same for
I hope this additional info will help.
What is your $LANG?
LANG=en_US.UTF-8
@joshuarubin Is it correct that 0x2194 counts as an emoji?
@mattn here's what I found out:
it seems like UPD
Fixed StringWidth() implementation by using proper Unicode grapheme cluster segmentation. Fixes #28
Here's a short example that illustrates an issue with flags (or "regional indicators"):
The flag consists of two code points which are processed separately by runewidth. But most modern systems will combine them into one flag emoji.
This is part of a larger topic which I describe in more detail here: gdamore/tcell#264. It doesn't just affect flags but also characters in e.g. Arabic and Korean, where there are more sophisticated rules than "combining characters" and zero-width joiners (which you added with #20).
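The flag problem described above can be sketched with the Go standard library alone. A flag emoji is a pair of regional indicator code points (U+1F1E6 to U+1F1FF), so counting runes one by one sees two characters where a terminal typically draws one. The helper names `isRegionalIndicator` and `groupFlags` are hypothetical. The grouping handles only regional indicator pairs and ignores every other grapheme cluster rule, so it is an illustration rather than a segmentation algorithm.

```go
package main

import "fmt"

// isRegionalIndicator reports whether r is one of the 26 regional indicator
// symbols (U+1F1E6 through U+1F1FF) that are combined in pairs to form flags.
func isRegionalIndicator(r rune) bool {
	return r >= 0x1F1E6 && r <= 0x1F1FF
}

// groupFlags splits s into groups of runes, keeping two consecutive regional
// indicators together as one group. All other runes form their own group.
// This deliberately ignores every other grapheme cluster rule.
func groupFlags(s string) [][]rune {
	runes := []rune(s)
	var groups [][]rune
	for i := 0; i < len(runes); i++ {
		if isRegionalIndicator(runes[i]) && i+1 < len(runes) && isRegionalIndicator(runes[i+1]) {
			groups = append(groups, runes[i:i+2])
			i++ // skip the second half of the pair
			continue
		}
		groups = append(groups, runes[i:i+1])
	}
	return groups
}

func main() {
	flag := "\U0001F1E9\U0001F1EA" // regional indicators D + E
	// Two code points, but one group once the pair is kept together.
	fmt.Println(len([]rune(flag)), len(groupFlags(flag))) // prints "2 1"
}
```

An odd run of regional indicators leaves the last one on its own, which is why three of them yield two groups.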
I don't know exactly how you calculate the widths of characters. I'm also not sure how you would solve flags as well as some of the other rules described in the Unicode specification, but it would sure be nice, as printing these flags currently gives me trouble in tview. There have been multiple issues asking for better support for different languages and emojis, so it seems that there are quite a few people who use the terminal with these characters.
(Maybe my new package uniseg can help you here.)