Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix emoji width #1951

Closed
wants to merge 1 commit into from
Closed

Fix emoji width #1951

wants to merge 1 commit into from

Conversation

minonl
Copy link

@minonl minonl commented Feb 25, 2019

Problem

As described in #469 (comment), emoji width is 1 when added then displayed in half, and delete width is 2 when it is deleted which caused character in the left is deleted by mistake.

Conditions

  1. Most emoji are 2 in cell width like😂, whose unicode is \u55357+\u56834
  2. Some symbol emoji like ❌ is single unicode \u10060.
  3. The monstrous zero code joiner 👩‍❤️‍💋‍👩 is \u{200D}, so'👩‍❤️‍💋‍👩'.length === 11, when copy & paste in Chrome address, it becomes 👩‍ ❤️‍ 💋‍ 👩 separated by \u8205

This only fix emoji width, IME support is WIP here #1890

Solution

Single 'string' width is calculated by wcwidth which doesn't consider emojis.
By extending getStringCellWidth, it can calculate both single 'string' and normal string width, containing either wcwidth standing characters and unicode emojis. Terminal should use getStringCellWidth other than wcwidth to cover emoji width calculation.

References

https://mathiasbynens.be/notes/javascript-unicode
https://blog.jonnew.com/posts/poo-dot-length-equals-two

Testing

  • Normal double-width emoji as 😂
  • Normal single-width emoji as ❌
  • Emoji with zero code joiner as 👩‍❤️‍💋‍👩, which is 👩‍ ❤️‍ 💋‍ 👩
  • Cell width break Buffer stringIndexToBufferIndex fullwidth combining with emoji - match emoji cell. 'Lots of ¥\u0301 make me 😃.' is 3 lines in terminal.buffer.lines.get(some_line_index) like:
Lots of ¥́
 make me 
😃.

@minonl minonl mentioned this pull request Feb 25, 2019
@@ -76,6 +76,20 @@ describe('getStringCellWidth', function(): void {
assert.equal(getStringCellWidth(s), sumWidths(terminal.buffer, 0, 1, '#'));
});
// TODO: multiline tests once #1685 is resolved
it('emojis', function(): void {
const input = '😁😁😁😁😁😁#';
Copy link
Contributor

@epicfaace epicfaace Feb 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen emojis in code before haha this is great

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen emojis in code before haha this is great

Actually there were a few and the failed test is one of them😂

@Tyriar
Copy link
Member

Tyriar commented Mar 4, 2019

I think this is trying to solve the same problem as #1714? @jerch

@minonl
Copy link
Author

minonl commented Mar 5, 2019

I think this is trying to solve the same problem as #1714? @jerch
This PL will not solve the root cause which is the unicode support.
Additional calculation is involved to get real displaying width and the result varies on different OS.

@jerch
Copy link
Member

jerch commented Mar 7, 2019

Yes thats part of #1714. Unless we properly respect different Unicode versions we cannot make this work reliably. I have to vote against this patch here, as it only covers some chars while others are left out. Furthermore this will break with systems, that still use a lower Unicode version, like Ubuntu 16 or older OSX versions (before High Sierra).

@Tyriar
Copy link
Member

Tyriar commented Mar 8, 2019

@jerch ah k that's what I suspected, @minvedacat thanks for the PR but unfortunately this needs to wait for #1714 to fix without regressing elsewhere (which is blocked on #1128).

@Tyriar Tyriar closed this Mar 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants